[PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
@ 2016-07-08  1:47 Oleg Drokin
  2016-07-08 11:02 ` Jeff Layton
  2016-07-08 20:54 ` J. Bruce Fields
  0 siblings, 2 replies; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08  1:47 UTC (permalink / raw)
  To: Jeff Layton, J. Bruce Fields; +Cc: linux-nfs, linux-kernel, Oleg Drokin

It looks like we are bit overzealous about failing mkdir/create/mknod
with permission denied if the parent dir is not writeable.
Need to make sure the name does not exist first, because we need to
return EEXIST in that case.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
---
A very similar problem exists with symlinks, but the patch is more
involved, so assuming this one is ok, I'll send a symlink one separately.
 fs/nfsd/nfs4proc.c |  6 +++++-
 fs/nfsd/vfs.c      | 11 ++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index de1ff1d..0067520 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	fh_init(&resfh, NFS4_FHSIZE);
 
+	/*
+	 * We just check thta parent is accessible here, nfsd_* do their
+	 * own access permission checks
+	 */
 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
-			   NFSD_MAY_CREATE);
+			   NFSD_MAY_EXEC);
 	if (status)
 		return status;
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6fbd81e..6a45ec6 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	if (isdotent(fname, flen))
 		goto out;
 
-	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
+	/*
+	 * Even though it is a create, first we see if we are even allowed
+	 * to peek inside the parent
+	 */
+	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
 	if (err)
 		goto out;
 
@@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto out; 
 	}
 
+	/* Now let's see if we actually have permissions to create */
+	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
+	if (err)
+		goto out;
+
 	if (!(iap->ia_valid & ATTR_MODE))
 		iap->ia_mode = 0;
 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08  1:47 [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM Oleg Drokin
@ 2016-07-08 11:02 ` Jeff Layton
  2016-07-08 15:14   ` Oleg Drokin
  2016-07-08 20:54 ` J. Bruce Fields
  1 sibling, 1 reply; 37+ messages in thread
From: Jeff Layton @ 2016-07-08 11:02 UTC (permalink / raw)
  To: Oleg Drokin, J. Bruce Fields; +Cc: linux-nfs, linux-kernel

On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> It looks like we are bit overzealous about failing mkdir/create/mknod
> with permission denied if the parent dir is not writeable.
> Need to make sure the name does not exist first, because we need to
> return EEXIST in that case.
> 
> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> ---
> A very similar problem exists with symlinks, but the patch is more
> involved, so assuming this one is ok, I'll send a symlink one separately.
>  fs/nfsd/nfs4proc.c |  6 +++++-
>  fs/nfsd/vfs.c      | 11 ++++++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 


nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
always used is that EPERM is "permanent". IOW, changing permissions
won't ever allow the user to do something. For instance, unprivileged
users can never chown a file, so they should get back EPERM there. When
a directory isn't writeable on a create they should get EACCES since
they could do the create if the directory were writeable.


> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index de1ff1d..0067520 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  
>  	fh_init(&resfh, NFS4_FHSIZE);
>  
> +	/*
> +	 * We just check thta parent is accessible here, nfsd_* do their
> +	 * own access permission checks
> +	 */
>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> -			   NFSD_MAY_CREATE);
> +			   NFSD_MAY_EXEC);
>  	if (status)
>  		return status;
>  
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 6fbd81e..6a45ec6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	if (isdotent(fname, flen))
>  		goto out;
>  
> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> +	/*
> +	 * Even though it is a create, first we see if we are even allowed
> +	 * to peek inside the parent
> +	 */
> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>  	if (err)
>  		goto out;
>  
> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		goto out; 
>  	}
>  
> +	/* Now let's see if we actually have permissions to create */
> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> +	if (err)
> +		goto out;
> +
>  	if (!(iap->ia_valid & ATTR_MODE))
>  		iap->ia_mode = 0;
>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;


Ouch. This means two nfsd_permission calls per create operation. If
it's necessary for correctness then so be it, but is it actually
documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
EACCES in this situation?

-- 

Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 11:02 ` Jeff Layton
@ 2016-07-08 15:14   ` Oleg Drokin
  2016-07-08 15:53     ` Jeff Layton
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 15:14 UTC (permalink / raw)
  To: Jeff Layton; +Cc: J. Bruce Fields, linux-nfs, linux-kernel


On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:

> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>> It looks like we are bit overzealous about failing mkdir/create/mknod
>> with permission denied if the parent dir is not writeable.
>> Need to make sure the name does not exist first, because we need to
>> return EEXIST in that case.
>> 
>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>> ---
>> A very similar problem exists with symlinks, but the patch is more
>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>  fs/nfsd/nfs4proc.c |  6 +++++-
>>  fs/nfsd/vfs.c      | 11 ++++++++++-
>>  2 files changed, 15 insertions(+), 2 deletions(-)
>> 
> 
> 
> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> always used is that EPERM is "permanent". IOW, changing permissions
> won't ever allow the user to do something. For instance, unprivileged
> users can never chown a file, so they should get back EPERM there. When
> a directory isn't writeable on a create they should get EACCES since
> they could do the create if the directory were writeable.

Hm, I see, thanks.
Confusing that you get "Permission denied" from perror ;)

>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index de1ff1d..0067520 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>  
>>  	fh_init(&resfh, NFS4_FHSIZE);
>>  
>> +	/*
>> +	 * We just check thta parent is accessible here, nfsd_* do their
>> +	 * own access permission checks
>> +	 */
>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>> -			   NFSD_MAY_CREATE);
>> +			   NFSD_MAY_EXEC);
>>  	if (status)
>>  		return status;
>>  
>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>> index 6fbd81e..6a45ec6 100644
>> --- a/fs/nfsd/vfs.c
>> +++ b/fs/nfsd/vfs.c
>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>  	if (isdotent(fname, flen))
>>  		goto out;
>>  
>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>> +	/*
>> +	 * Even though it is a create, first we see if we are even allowed
>> +	 * to peek inside the parent
>> +	 */
>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>  	if (err)
>>  		goto out;
>>  
>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>  		goto out; 
>>  	}
>>  
>> +	/* Now let's see if we actually have permissions to create */
>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>> +	if (err)
>> +		goto out;
>> +
>>  	if (!(iap->ia_valid & ATTR_MODE))
>>  		iap->ia_mode = 0;
>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> 
> 
> Ouch. This means two nfsd_permission calls per create operation. If
> it's necessary for correctness then so be it, but is it actually
> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> EACCES in this situation?

Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
newer version is here:
http://pubs.opengroup.org/onlinepubs/9699919799/

They tell us that we absolutely must fail with EEXIST if the name is a symlink
(so we need to lookup it anyway), and also that EEXIST is the failure code
if the path exists.

Are double permission checks really as bad for nfs? it looked like it would
call mostly into VFS so even if first call would be expensive, second call should
be really cheap?

In theory we can do the fh_verify(.., NFSD_MAY_NOP) to skip the first
nfsd_permission() call, but I suspect then we'd violate the spec in the other way:
i.e. mkdir in a directory where you cannot read and execute will give you
EEXIST where as that's when EACCES is desired I imagine.


> 
> -- 
> 
> Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 15:14   ` Oleg Drokin
@ 2016-07-08 15:53     ` Jeff Layton
  2016-07-08 15:59       ` Oleg Drokin
  2016-07-08 16:04       ` J. Bruce Fields
  0 siblings, 2 replies; 37+ messages in thread
From: Jeff Layton @ 2016-07-08 15:53 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: J. Bruce Fields, linux-nfs, linux-kernel

On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> 
> > On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> > > It looks like we are bit overzealous about failing mkdir/create/mknod
> > > with permission denied if the parent dir is not writeable.
> > > Need to make sure the name does not exist first, because we need to
> > > return EEXIST in that case.
> > > 
> > > Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> > > ---
> > > A very similar problem exists with symlinks, but the patch is more
> > > involved, so assuming this one is ok, I'll send a symlink one separately.
> > >  fs/nfsd/nfs4proc.c |  6 +++++-
> > >  fs/nfsd/vfs.c      | 11 ++++++++++-
> > >  2 files changed, 15 insertions(+), 2 deletions(-)
> > > 
> > 
> > 
> > nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> > always used is that EPERM is "permanent". IOW, changing permissions
> > won't ever allow the user to do something. For instance, unprivileged
> > users can never chown a file, so they should get back EPERM there. When
> > a directory isn't writeable on a create they should get EACCES since
> > they could do the create if the directory were writeable.
> 
> Hm, I see, thanks.
> Confusing that you get "Permission denied" from perror ;)
> 

Yes indeed. It's a subtle and confusing distinction.

> > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > index de1ff1d..0067520 100644
> > > --- a/fs/nfsd/nfs4proc.c
> > > +++ b/fs/nfsd/nfs4proc.c
> > > @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >  
> > >  	fh_init(&resfh, NFS4_FHSIZE);
> > >  
> > > +	/*
> > > +	 * We just check thta parent is accessible here, nfsd_* do their
> > > +	 * own access permission checks
> > > +	 */
> > >  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> > > -			   NFSD_MAY_CREATE);
> > > +			   NFSD_MAY_EXEC);
> > >  	if (status)
> > >  		return status;
> > >  
> > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > index 6fbd81e..6a45ec6 100644
> > > --- a/fs/nfsd/vfs.c
> > > +++ b/fs/nfsd/vfs.c
> > > @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > >  	if (isdotent(fname, flen))
> > >  		goto out;
> > >  
> > > -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> > > +	/*
> > > +	 * Even though it is a create, first we see if we are even allowed
> > > +	 * to peek inside the parent
> > > +	 */
> > > +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> > >  	if (err)
> > >  		goto out;
> > >  
> > > @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > >  		goto out; 
> > >  	}
> > >  
> > > +	/* Now let's see if we actually have permissions to create */
> > > +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> > > +	if (err)
> > > +		goto out;
> > > +
> > >  	if (!(iap->ia_valid & ATTR_MODE))
> > >  		iap->ia_mode = 0;
> > >  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> > 
> > 
> > Ouch. This means two nfsd_permission calls per create operation. If
> > it's necessary for correctness then so be it, but is it actually
> > documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> > EACCES in this situation?
> 
> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> newer version is here:
> http://pubs.opengroup.org/onlinepubs/9699919799/
> 
> They tell us that we absolutely must fail with EEXIST if the name is a symlink
> (so we need to lookup it anyway), and also that EEXIST is the failure code
> if the path exists.
> 

I'm not sure that that verbiage supersedes the fact that you don't have
write permissions on the directory. Does it?

ISTM that it's perfectly valid to shortcut looking up the dentry if the
user doesn't have write permissions on the directory, even when the
target is a symlink.

IOW, I'm not sure I see a bug here.

> Are double permission checks really as bad for nfs? it looked like it would
> call mostly into VFS so even if first call would be expensive, second call should
> be really cheap?
> 

It depends on the underlying fs. In most cases, you're right, but you
can export things that overload the ->permission op, and those can be
as expensive as they like (within reason of course).


> In theory we can do the fh_verify(.., NFSD_MAY_NOP) to skip the first
> nfsd_permission() call, but I suspect then we'd violate the spec in the other way:
> i.e. mkdir in a directory where you cannot read and execute will give you
> EEXIST where as that's when EACCES is desired I imagine.
> 

-- 

Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 15:53     ` Jeff Layton
@ 2016-07-08 15:59       ` Oleg Drokin
  2016-07-08 16:17         ` Jeff Layton
  2016-07-09  2:52         ` Al Viro
  2016-07-08 16:04       ` J. Bruce Fields
  1 sibling, 2 replies; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 15:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: J. Bruce Fields, linux-nfs, linux-kernel


On Jul 8, 2016, at 11:53 AM, Jeff Layton wrote:

> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
>> 
>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>> with permission denied if the parent dir is not writeable.
>>>> Need to make sure the name does not exist first, because we need to
>>>> return EEXIST in that case.
>>>> 
>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>> ---
>>>> A very similar problem exists with symlinks, but the patch is more
>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>> 
>>> 
>>> 
>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
>>> always used is that EPERM is "permanent". IOW, changing permissions
>>> won't ever allow the user to do something. For instance, unprivileged
>>> users can never chown a file, so they should get back EPERM there. When
>>> a directory isn't writeable on a create they should get EACCES since
>>> they could do the create if the directory were writeable.
>> 
>> Hm, I see, thanks.
>> Confusing that you get "Permission denied" from perror ;)
>> 
> 
> Yes indeed. It's a subtle and confusing distinction.
> 
>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>> index de1ff1d..0067520 100644
>>>> --- a/fs/nfsd/nfs4proc.c
>>>> +++ b/fs/nfsd/nfs4proc.c
>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>  
>>>>  	fh_init(&resfh, NFS4_FHSIZE);
>>>>  
>>>> +	/*
>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>> +	 * own access permission checks
>>>> +	 */
>>>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>> -			   NFSD_MAY_CREATE);
>>>> +			   NFSD_MAY_EXEC);
>>>>  	if (status)
>>>>  		return status;
>>>>  
>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>> index 6fbd81e..6a45ec6 100644
>>>> --- a/fs/nfsd/vfs.c
>>>> +++ b/fs/nfsd/vfs.c
>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>  	if (isdotent(fname, flen))
>>>>  		goto out;
>>>>  
>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>> +	/*
>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>> +	 * to peek inside the parent
>>>> +	 */
>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>  	if (err)
>>>>  		goto out;
>>>>  
>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>  		goto out; 
>>>>  	}
>>>>  
>>>> +	/* Now let's see if we actually have permissions to create */
>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>> +	if (err)
>>>> +		goto out;
>>>> +
>>>>  	if (!(iap->ia_valid & ATTR_MODE))
>>>>  		iap->ia_mode = 0;
>>>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>> 
>>> 
>>> Ouch. This means two nfsd_permission calls per create operation. If
>>> it's necessary for correctness then so be it, but is it actually
>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
>>> EACCES in this situation?
>> 
>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
>> newer version is here:
>> http://pubs.opengroup.org/onlinepubs/9699919799/
>> 
>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
>> (so we need to lookup it anyway), and also that EEXIST is the failure code
>> if the path exists.
>> 
> 
> I'm not sure that that verbiage supersedes the fact that you don't have
> write permissions on the directory. Does it?

"If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."

This sounds pretty straightforward to me, no?
Since it does not matter that we do not have write permissions here, because
the name already exists.

(also there are tons of applications that make this assumption when
badly reimplementing their mkdir -p thing, I imagine they also have this same
reading of the man page - this is why I even care about it).

> ISTM that it's perfectly valid to shortcut looking up the dentry if the
> user doesn't have write permissions on the directory, even when the
> target is a symlink.
> 
> IOW, I'm not sure I see a bug here.
> 
>> Are double permission checks really as bad for nfs? it looked like it would
>> call mostly into VFS so even if first call would be expensive, second call should
>> be really cheap?
>> 
> 
> It depends on the underlying fs. In most cases, you're right, but you
> can export things that overload the ->permission op, and those can be
> as expensive as they like (within reason of course).
> 
> 
>> In theory we can do the fh_verify(.., NFSD_MAY_NOP) to skip the first
>> nfsd_permission() call, but I suspect then we'd violate the spec in the other way:
>> i.e. mkdir in a directory where you cannot read and execute will give you
>> EEXIST where as that's when EACCES is desired I imagine.
>> 
> 
> -- 
> 
> Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 15:53     ` Jeff Layton
  2016-07-08 15:59       ` Oleg Drokin
@ 2016-07-08 16:04       ` J. Bruce Fields
  2016-07-08 16:16         ` Oleg Drokin
  1 sibling, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-08 16:04 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Oleg Drokin, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> > On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> > 
> > > On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> > > > It looks like we are bit overzealous about failing mkdir/create/mknod
> > > > with permission denied if the parent dir is not writeable.
> > > > Need to make sure the name does not exist first, because we need to
> > > > return EEXIST in that case.
> > > > 
> > > > Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> > > > ---
> > > > A very similar problem exists with symlinks, but the patch is more
> > > > involved, so assuming this one is ok, I'll send a symlink one separately.
> > > >  fs/nfsd/nfs4proc.c |  6 +++++-
> > > >  fs/nfsd/vfs.c      | 11 ++++++++++-
> > > >  2 files changed, 15 insertions(+), 2 deletions(-)
> > > > 
> > > 
> > > 
> > > nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> > > always used is that EPERM is "permanent". IOW, changing permissions
> > > won't ever allow the user to do something. For instance, unprivileged
> > > users can never chown a file, so they should get back EPERM there. When
> > > a directory isn't writeable on a create they should get EACCES since
> > > they could do the create if the directory were writeable.
> > 
> > Hm, I see, thanks.
> > Confusing that you get "Permission denied" from perror ;)
> > 
> 
> Yes indeed. It's a subtle and confusing distinction.
> 
> > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > > index de1ff1d..0067520 100644
> > > > --- a/fs/nfsd/nfs4proc.c
> > > > +++ b/fs/nfsd/nfs4proc.c
> > > > @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > >  
> > > >  	fh_init(&resfh, NFS4_FHSIZE);
> > > >  
> > > > +	/*
> > > > +	 * We just check thta parent is accessible here, nfsd_* do their
> > > > +	 * own access permission checks
> > > > +	 */
> > > >  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> > > > -			   NFSD_MAY_CREATE);
> > > > +			   NFSD_MAY_EXEC);
> > > >  	if (status)
> > > >  		return status;
> > > >  
> > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > > index 6fbd81e..6a45ec6 100644
> > > > --- a/fs/nfsd/vfs.c
> > > > +++ b/fs/nfsd/vfs.c
> > > > @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > >  	if (isdotent(fname, flen))
> > > >  		goto out;
> > > >  
> > > > -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> > > > +	/*
> > > > +	 * Even though it is a create, first we see if we are even allowed
> > > > +	 * to peek inside the parent
> > > > +	 */
> > > > +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> > > >  	if (err)
> > > >  		goto out;
> > > >  
> > > > @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > >  		goto out; 
> > > >  	}
> > > >  
> > > > +	/* Now let's see if we actually have permissions to create */
> > > > +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> > > > +	if (err)
> > > > +		goto out;
> > > > +
> > > >  	if (!(iap->ia_valid & ATTR_MODE))
> > > >  		iap->ia_mode = 0;
> > > >  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> > > 
> > > 
> > > Ouch. This means two nfsd_permission calls per create operation. If
> > > it's necessary for correctness then so be it, but is it actually
> > > documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> > > EACCES in this situation?
> > 
> > Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> > newer version is here:
> > http://pubs.opengroup.org/onlinepubs/9699919799/
> > 
> > They tell us that we absolutely must fail with EEXIST if the name is a symlink
> > (so we need to lookup it anyway), and also that EEXIST is the failure code
> > if the path exists.
> > 
> 
> I'm not sure that that verbiage supersedes the fact that you don't have
> write permissions on the directory. Does it?
> 
> ISTM that it's perfectly valid to shortcut looking up the dentry if the
> user doesn't have write permissions on the directory, even when the
> target is a symlink.
> 
> IOW, I'm not sure I see a bug here.

If this is causing real programs to behave incorrectly, then that may
matter more than the letter of the spec.  But I'm a little curious why
we'd be hearing about that just now--did the client or server's behavior
change recently?

> > Are double permission checks really as bad for nfs? it looked like it would
> > call mostly into VFS so even if first call would be expensive, second call should
> > be really cheap?
> > 
> 
> It depends on the underlying fs. In most cases, you're right, but you
> can export things that overload the ->permission op, and those can be
> as expensive as they like (within reason of course).

Weird if the expense of a second permission call is significant compared
to following the mkdir and sync.  But, what do I know.

--b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 16:04       ` J. Bruce Fields
@ 2016-07-08 16:16         ` Oleg Drokin
  2016-07-08 20:49           ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 16:16 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 8, 2016, at 12:04 PM, J. Bruce Fields wrote:

> On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
>> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
>>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
>>> 
>>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>> with permission denied if the parent dir is not writeable.
>>>>> Need to make sure the name does not exist first, because we need to
>>>>> return EEXIST in that case.
>>>>> 
>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>> ---
>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>>> 
>>>> 
>>>> 
>>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
>>>> always used is that EPERM is "permanent". IOW, changing permissions
>>>> won't ever allow the user to do something. For instance, unprivileged
>>>> users can never chown a file, so they should get back EPERM there. When
>>>> a directory isn't writeable on a create they should get EACCES since
>>>> they could do the create if the directory were writeable.
>>> 
>>> Hm, I see, thanks.
>>> Confusing that you get "Permission denied" from perror ;)
>>> 
>> 
>> Yes indeed. It's a subtle and confusing distinction.
>> 
>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>> index de1ff1d..0067520 100644
>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>  
>>>>>  	fh_init(&resfh, NFS4_FHSIZE);
>>>>>  
>>>>> +	/*
>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>> +	 * own access permission checks
>>>>> +	 */
>>>>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>> -			   NFSD_MAY_CREATE);
>>>>> +			   NFSD_MAY_EXEC);
>>>>>  	if (status)
>>>>>  		return status;
>>>>>  
>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>> index 6fbd81e..6a45ec6 100644
>>>>> --- a/fs/nfsd/vfs.c
>>>>> +++ b/fs/nfsd/vfs.c
>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>  	if (isdotent(fname, flen))
>>>>>  		goto out;
>>>>>  
>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>> +	/*
>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>> +	 * to peek inside the parent
>>>>> +	 */
>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>  	if (err)
>>>>>  		goto out;
>>>>>  
>>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>  		goto out; 
>>>>>  	}
>>>>>  
>>>>> +	/* Now let's see if we actually have permissions to create */
>>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>>> +	if (err)
>>>>> +		goto out;
>>>>> +
>>>>>  	if (!(iap->ia_valid & ATTR_MODE))
>>>>>  		iap->ia_mode = 0;
>>>>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>> 
>>>> 
>>>> Ouch. This means two nfsd_permission calls per create operation. If
>>>> it's necessary for correctness then so be it, but is it actually
>>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
>>>> EACCES in this situation?
>>> 
>>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
>>> newer version is here:
>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>> 
>>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
>>> (so we need to lookup it anyway), and also that EEXIST is the failure code
>>> if the path exists.
>>> 
>> 
>> I'm not sure that that verbiage supersedes the fact that you don't have
>> write permissions on the directory. Does it?
>> 
>> ISTM that it's perfectly valid to shortcut looking up the dentry if the
>> user doesn't have write permissions on the directory, even when the
>> target is a symlink.
>> 
>> IOW, I'm not sure I see a bug here.
> 
> If this is causing real programs to behave incorrectly, then that may
> matter more than the letter of the spec.  But I'm a little curious why
> we'd be hearing about that just now--did the client or server's behavior
> change recently?

We, on the Lustre side, have been hearing about this since 2010, (this optimization
was enabled in 2009).

I suspect some people just complain in places that not everybody monitors.
I tried 3.10 and it has the same problem here.
I just tried on RHEL6 (2.6.32) and the problem is also apparent there.

Also it's confusing how you get different errors depending on if the cache is hot or not:
[green@centos6-16 racer]$ mkdir test
mkdir: cannot create directory `test': Permission denied
[green@centos6-16 racer]$ ls -ld test
drwxr-xr-x 2 root root 4096 Jul  8 12:12 test
[green@centos6-16 racer]$ mkdir test
mkdir: cannot create directory `test': File exists


>>> Are double permission checks really as bad for nfs? it looked like it would
>>> call mostly into VFS so even if first call would be expensive, second call should
>>> be really cheap?
>>> 
>> 
>> It depends on the underlying fs. In most cases, you're right, but you
>> can export things that overload the ->permission op, and those can be
>> as expensive as they like (within reason of course).
> 
> Weird if the expense of a second permission call is significant compared
> to following the mkdir and sync.  But, what do I know.
> 
> --b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 15:59       ` Oleg Drokin
@ 2016-07-08 16:17         ` Jeff Layton
  2016-07-08 16:28           ` Oleg Drokin
  2016-07-09  2:52         ` Al Viro
  1 sibling, 1 reply; 37+ messages in thread
From: Jeff Layton @ 2016-07-08 16:17 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: J. Bruce Fields, linux-nfs, linux-kernel

On Fri, 2016-07-08 at 11:59 -0400, Oleg Drokin wrote:
> On Jul 8, 2016, at 11:53 AM, Jeff Layton wrote:
> 
> > On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> > > On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> > > 
> > > > On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> > > > > It looks like we are bit overzealous about failing mkdir/create/mknod
> > > > > with permission denied if the parent dir is not writeable.
> > > > > Need to make sure the name does not exist first, because we need to
> > > > > return EEXIST in that case.
> > > > > 
> > > > > Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> > > > > ---
> > > > > A very similar problem exists with symlinks, but the patch is more
> > > > > involved, so assuming this one is ok, I'll send a symlink one separately.
> > > > >  fs/nfsd/nfs4proc.c |  6 +++++-
> > > > >  fs/nfsd/vfs.c      | 11 ++++++++++-
> > > > >  2 files changed, 15 insertions(+), 2 deletions(-)
> > > > > 
> > > > 
> > > > 
> > > > nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> > > > always used is that EPERM is "permanent". IOW, changing permissions
> > > > won't ever allow the user to do something. For instance, unprivileged
> > > > users can never chown a file, so they should get back EPERM there. When
> > > > a directory isn't writeable on a create they should get EACCES since
> > > > they could do the create if the directory were writeable.
> > > 
> > > Hm, I see, thanks.
> > > Confusing that you get "Permission denied" from perror ;)
> > > 
> > 
> > Yes indeed. It's a subtle and confusing distinction.
> > 
> > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > > > index de1ff1d..0067520 100644
> > > > > --- a/fs/nfsd/nfs4proc.c
> > > > > +++ b/fs/nfsd/nfs4proc.c
> > > > > @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > > >  
> > > > >  	fh_init(&resfh, NFS4_FHSIZE);
> > > > >  
> > > > > +	/*
> > > > > +	 * We just check thta parent is accessible here, nfsd_* do their
> > > > > +	 * own access permission checks
> > > > > +	 */
> > > > >  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> > > > > -			   NFSD_MAY_CREATE);
> > > > > +			   NFSD_MAY_EXEC);
> > > > >  	if (status)
> > > > >  		return status;
> > > > >  
> > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > > > index 6fbd81e..6a45ec6 100644
> > > > > --- a/fs/nfsd/vfs.c
> > > > > +++ b/fs/nfsd/vfs.c
> > > > > @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > > >  	if (isdotent(fname, flen))
> > > > >  		goto out;
> > > > >  
> > > > > -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> > > > > +	/*
> > > > > +	 * Even though it is a create, first we see if we are even allowed
> > > > > +	 * to peek inside the parent
> > > > > +	 */
> > > > > +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> > > > >  	if (err)
> > > > >  		goto out;
> > > > >  
> > > > > @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > > >  		goto out; 
> > > > >  	}
> > > > >  
> > > > > +	/* Now let's see if we actually have permissions to create */
> > > > > +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> > > > > +	if (err)
> > > > > +		goto out;
> > > > > +
> > > > >  	if (!(iap->ia_valid & ATTR_MODE))
> > > > >  		iap->ia_mode = 0;
> > > > >  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> > > > 
> > > > 
> > > > Ouch. This means two nfsd_permission calls per create operation. If
> > > > it's necessary for correctness then so be it, but is it actually
> > > > documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> > > > EACCES in this situation?
> > > 
> > > Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> > > newer version is here:
> > > http://pubs.opengroup.org/onlinepubs/9699919799/
> > > 
> > > They tell us that we absolutely must fail with EEXIST if the name is a symlink
> > > (so we need to lookup it anyway), and also that EEXIST is the failure code
> > > if the path exists.
> > > 
> > 
> > I'm not sure that that verbiage supersedes the fact that you don't have
> > write permissions on the directory. Does it?
> 
> "If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."
> 
> This sounds pretty straightforward to me, no?
> Since it does not matter that we do not have write permissions here, because
> the name already exists.
> 
> (also there are tons of applications that make this assumption when
> badly reimplementing their mkdir -p thing, I imagine they also have this same
> reading of the man page - this is why I even care about it).
> 

I always have trouble with this sort of thing. Just because it's in
DESCRIPTION, does that make it supersede the part in ERRORS? IOW, I
think that's just telling you how to handle a symlink as the last
component, not that you have to do that before the permissions check.

Now that said, as a practical matter I do agree that EEXIST _is_
probably the more helpful error message. If there are applications that
rely on this then we probably should just take your patch.

Reviewed-by: Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 16:17         ` Jeff Layton
@ 2016-07-08 16:28           ` Oleg Drokin
  0 siblings, 0 replies; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 16:28 UTC (permalink / raw)
  To: Jeff Layton; +Cc: J. Bruce Fields, linux-nfs, linux-kernel


On Jul 8, 2016, at 12:17 PM, Jeff Layton wrote:

> On Fri, 2016-07-08 at 11:59 -0400, Oleg Drokin wrote:
>> On Jul 8, 2016, at 11:53 AM, Jeff Layton wrote:
>> 
>>> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
>>>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
>>>> 
>>>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>>> with permission denied if the parent dir is not writeable.
>>>>>> Need to make sure the name does not exist first, because we need to
>>>>>> return EEXIST in that case.
>>>>>> 
>>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>>> ---
>>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>> 
>>>>> 
>>>>> 
>>>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
>>>>> always used is that EPERM is "permanent". IOW, changing permissions
>>>>> won't ever allow the user to do something. For instance, unprivileged
>>>>> users can never chown a file, so they should get back EPERM there. When
>>>>> a directory isn't writeable on a create they should get EACCES since
>>>>> they could do the create if the directory were writeable.
>>>> 
>>>> Hm, I see, thanks.
>>>> Confusing that you get "Permission denied" from perror ;)
>>>> 
>>> 
>>> Yes indeed. It's a subtle and confusing distinction.
>>> 
>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>> index de1ff1d..0067520 100644
>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>  
>>>>>>  	fh_init(&resfh, NFS4_FHSIZE);
>>>>>>  
>>>>>> +	/*
>>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>>> +	 * own access permission checks
>>>>>> +	 */
>>>>>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>>> -			   NFSD_MAY_CREATE);
>>>>>> +			   NFSD_MAY_EXEC);
>>>>>>  	if (status)
>>>>>>  		return status;
>>>>>>  
>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>>> index 6fbd81e..6a45ec6 100644
>>>>>> --- a/fs/nfsd/vfs.c
>>>>>> +++ b/fs/nfsd/vfs.c
>>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>  	if (isdotent(fname, flen))
>>>>>>  		goto out;
>>>>>>  
>>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>>> +	/*
>>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>>> +	 * to peek inside the parent
>>>>>> +	 */
>>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>>  	if (err)
>>>>>>  		goto out;
>>>>>>  
>>>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>  		goto out; 
>>>>>>  	}
>>>>>>  
>>>>>> +	/* Now let's see if we actually have permissions to create */
>>>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>>>> +	if (err)
>>>>>> +		goto out;
>>>>>> +
>>>>>>  	if (!(iap->ia_valid & ATTR_MODE))
>>>>>>  		iap->ia_mode = 0;
>>>>>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>>> 
>>>>> 
>>>>> Ouch. This means two nfsd_permission calls per create operation. If
>>>>> it's necessary for correctness then so be it, but is it actually
>>>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
>>>>> EACCES in this situation?
>>>> 
>>>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
>>>> newer version is here:
>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>> 
>>>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
>>>> (so we need to lookup it anyway), and also that EEXIST is the failure code
>>>> if the path exists.
>>>> 
>>> 
>>> I'm not sure that that verbiage supersedes the fact that you don't have
>>> write permissions on the directory. Does it?
>> 
>> "If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."
>> 
>> This sounds pretty straightforward to me, no?
>> Since it does not matter that we do not have write permissions here, because
>> the name already exists.
>> 
>> (also there are tons of applications that make this assumption when
>> badly reimplementing their mkdir -p thing, I imagine they also have this same
>> reading of the man page - this is why I even care about it).
>> 
> 
> I always have trouble with this sort of thing. Just because it's in
> DESCRIPTION, does that make it supersede the part in ERRORS? IOW, I
> think that's just telling you how to handle a symlink as the last
> component, not that you have to do that before the permissions check.

Personally I think of it like open(O_CREAT).
If the file exists, you just open it even if you have no permissions to
add stuff in the parent.

Same with mkdir/mknod - if the name already exists - you don't need any
write rights to create it - it's already there.

symlink was likely just special cased to highlight that it should not
just be followed (unlike what happens with open).

Also VFS even goes to some pains to prioritize returning EEXIST over
other errors like EROFS and such (something nfs (both server and client?)/lustre
does not, but I suspect it's a lot more fringe case?).

> Now that said, as a practical matter I do agree that EEXIST _is_
> probably the more helpful error message. If there are applications that
> rely on this then we probably should just take your patch.
> 
> Reviewed-by: Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 16:16         ` Oleg Drokin
@ 2016-07-08 20:49           ` J. Bruce Fields
  2016-07-08 21:47             ` Oleg Drokin
  0 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-08 20:49 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 12:16:14PM -0400, Oleg Drokin wrote:
> 
> On Jul 8, 2016, at 12:04 PM, J. Bruce Fields wrote:
> 
> > On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
> >> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> >>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> >>> 
> >>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> >>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
> >>>>> with permission denied if the parent dir is not writeable.
> >>>>> Need to make sure the name does not exist first, because we need to
> >>>>> return EEXIST in that case.
> >>>>> 
> >>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> >>>>> ---
> >>>>> A very similar problem exists with symlinks, but the patch is more
> >>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
> >>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
> >>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
> >>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>> 
> >>>> 
> >>>> 
> >>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> >>>> always used is that EPERM is "permanent". IOW, changing permissions
> >>>> won't ever allow the user to do something. For instance, unprivileged
> >>>> users can never chown a file, so they should get back EPERM there. When
> >>>> a directory isn't writeable on a create they should get EACCES since
> >>>> they could do the create if the directory were writeable.
> >>> 
> >>> Hm, I see, thanks.
> >>> Confusing that you get "Permission denied" from perror ;)
> >>> 
> >> 
> >> Yes indeed. It's a subtle and confusing distinction.
> >> 
> >>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>>> index de1ff1d..0067520 100644
> >>>>> --- a/fs/nfsd/nfs4proc.c
> >>>>> +++ b/fs/nfsd/nfs4proc.c
> >>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>>>  
> >>>>>  	fh_init(&resfh, NFS4_FHSIZE);
> >>>>>  
> >>>>> +	/*
> >>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
> >>>>> +	 * own access permission checks
> >>>>> +	 */
> >>>>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> >>>>> -			   NFSD_MAY_CREATE);
> >>>>> +			   NFSD_MAY_EXEC);
> >>>>>  	if (status)
> >>>>>  		return status;
> >>>>>  
> >>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >>>>> index 6fbd81e..6a45ec6 100644
> >>>>> --- a/fs/nfsd/vfs.c
> >>>>> +++ b/fs/nfsd/vfs.c
> >>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>>>>  	if (isdotent(fname, flen))
> >>>>>  		goto out;
> >>>>>  
> >>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> >>>>> +	/*
> >>>>> +	 * Even though it is a create, first we see if we are even allowed
> >>>>> +	 * to peek inside the parent
> >>>>> +	 */
> >>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> >>>>>  	if (err)
> >>>>>  		goto out;
> >>>>>  
> >>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>>>>  		goto out; 
> >>>>>  	}
> >>>>>  
> >>>>> +	/* Now let's see if we actually have permissions to create */
> >>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> >>>>> +	if (err)
> >>>>> +		goto out;
> >>>>> +
> >>>>>  	if (!(iap->ia_valid & ATTR_MODE))
> >>>>>  		iap->ia_mode = 0;
> >>>>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> >>>> 
> >>>> 
> >>>> Ouch. This means two nfsd_permission calls per create operation. If
> >>>> it's necessary for correctness then so be it, but is it actually
> >>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> >>>> EACCES in this situation?
> >>> 
> >>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> >>> newer version is here:
> >>> http://pubs.opengroup.org/onlinepubs/9699919799/
> >>> 
> >>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
> >>> (so we need to lookup it anyway), and also that EEXIST is the failure code
> >>> if the path exists.
> >>> 
> >> 
> >> I'm not sure that that verbiage supersedes the fact that you don't have
> >> write permissions on the directory. Does it?
> >> 
> >> ISTM that it's perfectly valid to shortcut looking up the dentry if the
> >> user doesn't have write permissions on the directory, even when the
> >> target is a symlink.
> >> 
> >> IOW, I'm not sure I see a bug here.
> > 
> > If this is causing real programs to behave incorrectly, then that may
> > matter more than the letter of the spec.  But I'm a little curious why
> > we'd be hearing about that just now--did the client or server's behavior
> > change recently?
> 
> We, on the Lustre side, have been hearing about this since 2010, (this optimization
> was enabled in 2009).
> 
> I suspect some people just complain in places that not everybody monitors.

Sure, but you said "tons of programs" do this, and off hand I can't
recall a single report.  That's weird.

Anyway, I agree that the behavior your want seems more consistent at
least.

--b.

> I tried 3.10 and it has the same problem here.
> I just tried on RHEL6 (2.6.32) and the problem is also apparent there.
> 
> Also it's confusing how you get different errors depending on if the cache is hot or not:
> [green@centos6-16 racer]$ mkdir test
> mkdir: cannot create directory `test': Permission denied
> [green@centos6-16 racer]$ ls -ld test
> drwxr-xr-x 2 root root 4096 Jul  8 12:12 test
> [green@centos6-16 racer]$ mkdir test
> mkdir: cannot create directory `test': File exists
> 
> 
> >>> Are double permission checks really as bad for nfs? it looked like it would
> >>> call mostly into VFS so even if first call would be expensive, second call should
> >>> be really cheap?
> >>> 
> >> 
> >> It depends on the underlying fs. In most cases, you're right, but you
> >> can export things that overload the ->permission op, and those can be
> >> as expensive as they like (within reason of course).
> > 
> > Weird if the expense of a second permission call is significant compared
> > to following the mkdir and sync.  But, what do I know.
> > 
> > --b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08  1:47 [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM Oleg Drokin
  2016-07-08 11:02 ` Jeff Layton
@ 2016-07-08 20:54 ` J. Bruce Fields
  2016-07-08 21:53   ` Oleg Drokin
  1 sibling, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-08 20:54 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, linux-nfs, linux-kernel

On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
> It looks like we are bit overzealous about failing mkdir/create/mknod
> with permission denied if the parent dir is not writeable.
> Need to make sure the name does not exist first, because we need to
> return EEXIST in that case.
> 
> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> ---
> A very similar problem exists with symlinks, but the patch is more
> involved, so assuming this one is ok, I'll send a symlink one separately.
>  fs/nfsd/nfs4proc.c |  6 +++++-
>  fs/nfsd/vfs.c      | 11 ++++++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index de1ff1d..0067520 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  
>  	fh_init(&resfh, NFS4_FHSIZE);
>  
> +	/*
> +	 * We just check thta parent is accessible here, nfsd_* do their
> +	 * own access permission checks
> +	 */
>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> -			   NFSD_MAY_CREATE);
> +			   NFSD_MAY_EXEC);
>  	if (status)
>  		return status;
>  
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 6fbd81e..6a45ec6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	if (isdotent(fname, flen))
>  		goto out;
>  
> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> +	/*
> +	 * Even though it is a create, first we see if we are even allowed
> +	 * to peek inside the parent
> +	 */
> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);

Looks like in the v3 case we haven't actually locked the directory yet
at this point so this check is a little race-prone.

I wonder why the code's structured that way--it's confusing.

--b.

>  	if (err)
>  		goto out;
>  
> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		goto out; 
>  	}
>  
> +	/* Now let's see if we actually have permissions to create */
> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> +	if (err)
> +		goto out;
> +
>  	if (!(iap->ia_valid & ATTR_MODE))
>  		iap->ia_mode = 0;
>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 20:49           ` J. Bruce Fields
@ 2016-07-08 21:47             ` Oleg Drokin
  2016-07-09  3:10               ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 21:47 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 8, 2016, at 4:49 PM, J. Bruce Fields wrote:

> On Fri, Jul 08, 2016 at 12:16:14PM -0400, Oleg Drokin wrote:
>> 
>> On Jul 8, 2016, at 12:04 PM, J. Bruce Fields wrote:
>> 
>>> On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
>>>> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
>>>>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
>>>>> 
>>>>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>>>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>>>> with permission denied if the parent dir is not writeable.
>>>>>>> Need to make sure the name does not exist first, because we need to
>>>>>>> return EEXIST in that case.
>>>>>>> 
>>>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>>>> ---
>>>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>>> fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
>>>>>> always used is that EPERM is "permanent". IOW, changing permissions
>>>>>> won't ever allow the user to do something. For instance, unprivileged
>>>>>> users can never chown a file, so they should get back EPERM there. When
>>>>>> a directory isn't writeable on a create they should get EACCES since
>>>>>> they could do the create if the directory were writeable.
>>>>> 
>>>>> Hm, I see, thanks.
>>>>> Confusing that you get "Permission denied" from perror ;)
>>>>> 
>>>> 
>>>> Yes indeed. It's a subtle and confusing distinction.
>>>> 
>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>> index de1ff1d..0067520 100644
>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>> 
>>>>>>> 	fh_init(&resfh, NFS4_FHSIZE);
>>>>>>> 
>>>>>>> +	/*
>>>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>>>> +	 * own access permission checks
>>>>>>> +	 */
>>>>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>>>> -			   NFSD_MAY_CREATE);
>>>>>>> +			   NFSD_MAY_EXEC);
>>>>>>> 	if (status)
>>>>>>> 		return status;
>>>>>>> 
>>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>>>> index 6fbd81e..6a45ec6 100644
>>>>>>> --- a/fs/nfsd/vfs.c
>>>>>>> +++ b/fs/nfsd/vfs.c
>>>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>> 	if (isdotent(fname, flen))
>>>>>>> 		goto out;
>>>>>>> 
>>>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>>>> +	/*
>>>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>>>> +	 * to peek inside the parent
>>>>>>> +	 */
>>>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>>> 	if (err)
>>>>>>> 		goto out;
>>>>>>> 
>>>>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>> 		goto out; 
>>>>>>> 	}
>>>>>>> 
>>>>>>> +	/* Now let's see if we actually have permissions to create */
>>>>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>>>>> +	if (err)
>>>>>>> +		goto out;
>>>>>>> +
>>>>>>> 	if (!(iap->ia_valid & ATTR_MODE))
>>>>>>> 		iap->ia_mode = 0;
>>>>>>> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>>>> 
>>>>>> 
>>>>>> Ouch. This means two nfsd_permission calls per create operation. If
>>>>>> it's necessary for correctness then so be it, but is it actually
>>>>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
>>>>>> EACCES in this situation?
>>>>> 
>>>>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
>>>>> newer version is here:
>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>>> 
>>>>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
>>>>> (so we need to lookup it anyway), and also that EEXIST is the failure code
>>>>> if the path exists.
>>>>> 
>>>> 
>>>> I'm not sure that that verbiage supersedes the fact that you don't have
>>>> write permissions on the directory. Does it?
>>>> 
>>>> ISTM that it's perfectly valid to shortcut looking up the dentry if the
>>>> user doesn't have write permissions on the directory, even when the
>>>> target is a symlink.
>>>> 
>>>> IOW, I'm not sure I see a bug here.
>>> 
>>> If this is causing real programs to behave incorrectly, then that may
>>> matter more than the letter of the spec.  But I'm a little curious why
>>> we'd be hearing about that just now--did the client or server's behavior
>>> change recently?
>> 
>> We, on the Lustre side, have been hearing about this since 2010, (this optimization
>> was enabled in 2009).
>> 
>> I suspect some people just complain in places that not everybody monitors.
> 
> Sure, but you said "tons of programs" do this, and off hand I can't
> recall a single report.  That's weird.

I wonder if people just accept that "NFS is just weird" and code in workarounds,
where as with Lustre we promise (almost) full POSIX compliance, and also came much later
so people are just seeing that "this does not work" and complain more loudly?

>From the two ready examples, the torque job scheduler and Matlab are the two examples
of programs we got bugreports from, but they are not the only ones for sure.
There are some "custom code references too.
So I guess "tons of" is a relative measure.

> Anyway, I agree that the behavior your want seems more consistent at
> least.
> 
> --b.
> 
>> I tried 3.10 and it has the same problem here.
>> I just tried on RHEL6 (2.6.32) and the problem is also apparent there.
>> 
>> Also it's confusing how you get different errors depending on if the cache is hot or not:
>> [green@centos6-16 racer]$ mkdir test
>> mkdir: cannot create directory `test': Permission denied
>> [green@centos6-16 racer]$ ls -ld test
>> drwxr-xr-x 2 root root 4096 Jul  8 12:12 test
>> [green@centos6-16 racer]$ mkdir test
>> mkdir: cannot create directory `test': File exists
>> 
>> 
>>>>> Are double permission checks really as bad for nfs? it looked like it would
>>>>> call mostly into VFS so even if first call would be expensive, second call should
>>>>> be really cheap?
>>>>> 
>>>> 
>>>> It depends on the underlying fs. In most cases, you're right, but you
>>>> can export things that overload the ->permission op, and those can be
>>>> as expensive as they like (within reason of course).
>>> 
>>> Weird if the expense of a second permission call is significant compared
>>> to following the mkdir and sync.  But, what do I know.
>>> 
>>> --b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 20:54 ` J. Bruce Fields
@ 2016-07-08 21:53   ` Oleg Drokin
  2016-07-21 20:34     ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-08 21:53 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:

> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
>> It looks like we are bit overzealous about failing mkdir/create/mknod
>> with permission denied if the parent dir is not writeable.
>> Need to make sure the name does not exist first, because we need to
>> return EEXIST in that case.
>> 
>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>> ---
>> A very similar problem exists with symlinks, but the patch is more
>> involved, so assuming this one is ok, I'll send a symlink one separately.
>> fs/nfsd/nfs4proc.c |  6 +++++-
>> fs/nfsd/vfs.c      | 11 ++++++++++-
>> 2 files changed, 15 insertions(+), 2 deletions(-)
>> 
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index de1ff1d..0067520 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> 
>> 	fh_init(&resfh, NFS4_FHSIZE);
>> 
>> +	/*
>> +	 * We just check thta parent is accessible here, nfsd_* do their
>> +	 * own access permission checks
>> +	 */
>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>> -			   NFSD_MAY_CREATE);
>> +			   NFSD_MAY_EXEC);
>> 	if (status)
>> 		return status;
>> 
>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>> index 6fbd81e..6a45ec6 100644
>> --- a/fs/nfsd/vfs.c
>> +++ b/fs/nfsd/vfs.c
>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 	if (isdotent(fname, flen))
>> 		goto out;
>> 
>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>> +	/*
>> +	 * Even though it is a create, first we see if we are even allowed
>> +	 * to peek inside the parent
>> +	 */
>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> 
> Looks like in the v3 case we haven't actually locked the directory yet
> at this point so this check is a little race-prone.

In reality this check is not really needed, I suspect.
When we call vfs_create/mknod/mkdir later on, it has it's own permission check
anyway so if there was a race and somebody changed dir access in the middle,
there's going to be another check anyway and it would be caught.
Unless there's some weird server-side permission wiggling as well that makes it
ineffective, but I imagine that one cannot really change in a racy way?

> I wonder why the code's structured that way--it's confusing.

Probably years of accumulated "damage" ;)

> --b.
> 
>> 	if (err)
>> 		goto out;
>> 
>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>> 		goto out; 
>> 	}
>> 
>> +	/* Now let's see if we actually have permissions to create */
>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>> +	if (err)
>> +		goto out;
>> +
>> 	if (!(iap->ia_valid & ATTR_MODE))
>> 		iap->ia_mode = 0;
>> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>> -- 
>> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 15:59       ` Oleg Drokin
  2016-07-08 16:17         ` Jeff Layton
@ 2016-07-09  2:52         ` Al Viro
  2016-07-09  2:58           ` Oleg Drokin
  1 sibling, 1 reply; 37+ messages in thread
From: Al Viro @ 2016-07-09  2:52 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 11:59:50AM -0400, Oleg Drokin wrote:

> "If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."
> 
> This sounds pretty straightforward to me, no?
> Since it does not matter that we do not have write permissions here, because
> the name already exists.

When more than one condition applies, we have every right to return any of
them.  POSIX does *NOT* specify the order of checks.  Never had.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-09  2:52         ` Al Viro
@ 2016-07-09  2:58           ` Oleg Drokin
  2016-07-09  3:13             ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-09  2:58 UTC (permalink / raw)
  To: Al Viro; +Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-kernel


On Jul 8, 2016, at 10:52 PM, Al Viro wrote:

> On Fri, Jul 08, 2016 at 11:59:50AM -0400, Oleg Drokin wrote:
> 
>> "If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."
>> 
>> This sounds pretty straightforward to me, no?
>> Since it does not matter that we do not have write permissions here, because
>> the name already exists.
> 
> When more than one condition applies, we have every right to return any of
> them.  POSIX does *NOT* specify the order of checks.  Never had.

Out of curiosity, why does filename_create() delay EROFS then?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 21:47             ` Oleg Drokin
@ 2016-07-09  3:10               ` Al Viro
  2016-07-09  3:41                 ` Oleg Drokin
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2016-07-09  3:10 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: J. Bruce Fields, Jeff Layton, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 05:47:22PM -0400, Oleg Drokin wrote:

> I wonder if people just accept that "NFS is just weird" and code in workarounds,
> where as with Lustre we promise (almost) full POSIX compliance, and also came much later
> so people are just seeing that "this does not work" and complain more loudly?

To quote POSIX: "If more than one error occurs in processing a function call,
any one of the possible errors may be returned, as the order of detection is
undefined." (from System Interfaces: General Information: 2.3 Error Numbers)

And regarding mkdir(2) it has
[EACCES]
    Search permission is denied on a component of the path prefix, or write
permission is denied on the parent directory of the directory to be created.
[EEXIST]
    The named file exists.
among the error conditions.  In situations when both apply, the implementation
is bloody well allowed to return either.  It might be nicer to return EEXIST
in such cases, for consistency sake (if another thread does stat() on the
pathname in question just as you are about to call mkdir(2), you will get
EEXIST without ever reaching permission(9), let alone ->mkdir() method), but
please do not bring POSIX compliance as an argument.  It's a QoI argument and
nothing beyond that.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-09  2:58           ` Oleg Drokin
@ 2016-07-09  3:13             ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2016-07-09  3:13 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, J. Bruce Fields, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 10:58:38PM -0400, Oleg Drokin wrote:

> > When more than one condition applies, we have every right to return any of
> > them.  POSIX does *NOT* specify the order of checks.  Never had.
> 
> Out of curiosity, why does filename_create() delay EROFS then?

QoI and historical behaviour...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-09  3:10               ` Al Viro
@ 2016-07-09  3:41                 ` Oleg Drokin
  2016-07-13 19:00                   ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-09  3:41 UTC (permalink / raw)
  To: Al Viro, J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, Mailing List


On Jul 8, 2016, at 11:10 PM, Al Viro wrote:

> On Fri, Jul 08, 2016 at 05:47:22PM -0400, Oleg Drokin wrote:
> 
>> I wonder if people just accept that "NFS is just weird" and code in workarounds,
>> where as with Lustre we promise (almost) full POSIX compliance, and also came much later
>> so people are just seeing that "this does not work" and complain more loudly?
> 
> To quote POSIX: "If more than one error occurs in processing a function call,
> any one of the possible errors may be returned, as the order of detection is
> undefined." (from System Interfaces: General Information: 2.3 Error Numbers)
> 
> And regarding mkdir(2) it has
> [EACCES]
>    Search permission is denied on a component of the path prefix, or write
> permission is denied on the parent directory of the directory to be created.
> [EEXIST]
>    The named file exists.
> among the error conditions.  In situations when both apply, the implementation
> is bloody well allowed to return either.  It might be nicer to return EEXIST
> in such cases, for consistency sake (if another thread does stat() on the
> pathname in question just as you are about to call mkdir(2), you will get
> EEXIST without ever reaching permission(9), let alone ->mkdir() method), but
> please do not bring POSIX compliance as an argument.  It's a QoI argument and
> nothing beyond that.

Ok, I see.
Thanks.

Bruce, do you want the patch resubmitted with a rewritten commit message,
or do you think it's best to just drop it them?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-09  3:41                 ` Oleg Drokin
@ 2016-07-13 19:00                   ` J. Bruce Fields
  0 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-13 19:00 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Al Viro, Jeff Layton, linux-nfs, Mailing List

On Fri, Jul 08, 2016 at 11:41:41PM -0400, Oleg Drokin wrote:
> 
> On Jul 8, 2016, at 11:10 PM, Al Viro wrote:
> 
> > On Fri, Jul 08, 2016 at 05:47:22PM -0400, Oleg Drokin wrote:
> > 
> >> I wonder if people just accept that "NFS is just weird" and code in workarounds,
> >> where as with Lustre we promise (almost) full POSIX compliance, and also came much later
> >> so people are just seeing that "this does not work" and complain more loudly?
> > 
> > To quote POSIX: "If more than one error occurs in processing a function call,
> > any one of the possible errors may be returned, as the order of detection is
> > undefined." (from System Interfaces: General Information: 2.3 Error Numbers)
> > 
> > And regarding mkdir(2) it has
> > [EACCES]
> >    Search permission is denied on a component of the path prefix, or write
> > permission is denied on the parent directory of the directory to be created.
> > [EEXIST]
> >    The named file exists.
> > among the error conditions.  In situations when both apply, the implementation
> > is bloody well allowed to return either.  It might be nicer to return EEXIST
> > in such cases, for consistency sake (if another thread does stat() on the
> > pathname in question just as you are about to call mkdir(2), you will get
> > EEXIST without ever reaching permission(9), let alone ->mkdir() method), but
> > please do not bring POSIX compliance as an argument.  It's a QoI argument and
> > nothing beyond that.
> 
> Ok, I see.
> Thanks.
> 
> Bruce, do you want the patch resubmitted with a rewritten commit message,
> or do you think it's best to just drop it them?

Other things being equal I still agree with you that there'd be
advantages to being more consistent, so a changelog update would be
fine.

--b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-08 21:53   ` Oleg Drokin
@ 2016-07-21 20:34     ` J. Bruce Fields
  2016-07-21 20:37       ` Oleg Drokin
  0 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-21 20:34 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, linux-nfs, linux-kernel

On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
> 
> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
> 
> > On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
> >> It looks like we are bit overzealous about failing mkdir/create/mknod
> >> with permission denied if the parent dir is not writeable.
> >> Need to make sure the name does not exist first, because we need to
> >> return EEXIST in that case.
> >> 
> >> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> >> ---
> >> A very similar problem exists with symlinks, but the patch is more
> >> involved, so assuming this one is ok, I'll send a symlink one separately.
> >> fs/nfsd/nfs4proc.c |  6 +++++-
> >> fs/nfsd/vfs.c      | 11 ++++++++++-
> >> 2 files changed, 15 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >> index de1ff1d..0067520 100644
> >> --- a/fs/nfsd/nfs4proc.c
> >> +++ b/fs/nfsd/nfs4proc.c
> >> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >> 
> >> 	fh_init(&resfh, NFS4_FHSIZE);
> >> 
> >> +	/*
> >> +	 * We just check thta parent is accessible here, nfsd_* do their
> >> +	 * own access permission checks
> >> +	 */
> >> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> >> -			   NFSD_MAY_CREATE);
> >> +			   NFSD_MAY_EXEC);
> >> 	if (status)
> >> 		return status;
> >> 
> >> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >> index 6fbd81e..6a45ec6 100644
> >> --- a/fs/nfsd/vfs.c
> >> +++ b/fs/nfsd/vfs.c
> >> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >> 	if (isdotent(fname, flen))
> >> 		goto out;
> >> 
> >> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> >> +	/*
> >> +	 * Even though it is a create, first we see if we are even allowed
> >> +	 * to peek inside the parent
> >> +	 */
> >> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> > 
> > Looks like in the v3 case we haven't actually locked the directory yet
> > at this point so this check is a little race-prone.
> 
> In reality this check is not really needed, I suspect.
> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
> anyway so if there was a race and somebody changed dir access in the middle,
> there's going to be another check anyway and it would be caught.
> Unless there's some weird server-side permission wiggling as well that makes it
> ineffective, but I imagine that one cannot really change in a racy way?

Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
We still need the fh_verify there since it's also what does the
filehandle->dentry translation, but we don't need permission checking
here yet.

Applying with that one change.  (And I'll followup with some additional
minor cleanup of the create code.)

--b.

> 
> > I wonder why the code's structured that way--it's confusing.
> 
> Probably years of accumulated "damage" ;)
> 
> > --b.
> > 
> >> 	if (err)
> >> 		goto out;
> >> 
> >> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >> 		goto out; 
> >> 	}
> >> 
> >> +	/* Now let's see if we actually have permissions to create */
> >> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> >> +	if (err)
> >> +		goto out;
> >> +
> >> 	if (!(iap->ia_valid & ATTR_MODE))
> >> 		iap->ia_mode = 0;
> >> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> >> -- 
> >> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-21 20:34     ` J. Bruce Fields
@ 2016-07-21 20:37       ` Oleg Drokin
  2016-07-22  1:57         ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-21 20:37 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:

> On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
>> 
>> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
>> 
>>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>> with permission denied if the parent dir is not writeable.
>>>> Need to make sure the name does not exist first, because we need to
>>>> return EEXIST in that case.
>>>> 
>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>> ---
>>>> A very similar problem exists with symlinks, but the patch is more
>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>> fs/nfsd/nfs4proc.c |  6 +++++-
>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>> index de1ff1d..0067520 100644
>>>> --- a/fs/nfsd/nfs4proc.c
>>>> +++ b/fs/nfsd/nfs4proc.c
>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>> 
>>>> 	fh_init(&resfh, NFS4_FHSIZE);
>>>> 
>>>> +	/*
>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>> +	 * own access permission checks
>>>> +	 */
>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>> -			   NFSD_MAY_CREATE);
>>>> +			   NFSD_MAY_EXEC);
>>>> 	if (status)
>>>> 		return status;
>>>> 
>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>> index 6fbd81e..6a45ec6 100644
>>>> --- a/fs/nfsd/vfs.c
>>>> +++ b/fs/nfsd/vfs.c
>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>> 	if (isdotent(fname, flen))
>>>> 		goto out;
>>>> 
>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>> +	/*
>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>> +	 * to peek inside the parent
>>>> +	 */
>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>> 
>>> Looks like in the v3 case we haven't actually locked the directory yet
>>> at this point so this check is a little race-prone.
>> 
>> In reality this check is not really needed, I suspect.
>> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
>> anyway so if there was a race and somebody changed dir access in the middle,
>> there's going to be another check anyway and it would be caught.
>> Unless there's some weird server-side permission wiggling as well that makes it
>> ineffective, but I imagine that one cannot really change in a racy way?
> 
> Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
> We still need the fh_verify there since it's also what does the
> filehandle->dentry translation, but we don't need permission checking
> here yet.

This will likely need an extra test to ensure that when you
do mkdir where you do not have exec permissions, you would get EACCES instead
of EEXIST, otherwise that would be information leakage, no?
Or do you think the second time we do nfsd_permission, that would be covered?

> Applying with that one change.  (And I'll followup with some additional
> minor cleanup of the create code.)
> 
> --b.
> 
>> 
>>> I wonder why the code's structured that way--it's confusing.
>> 
>> Probably years of accumulated "damage" ;)
>> 
>>> --b.
>>> 
>>>> 	if (err)
>>>> 		goto out;
>>>> 
>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>> 		goto out; 
>>>> 	}
>>>> 
>>>> +	/* Now let's see if we actually have permissions to create */
>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>> +	if (err)
>>>> +		goto out;
>>>> +
>>>> 	if (!(iap->ia_valid & ATTR_MODE))
>>>> 		iap->ia_mode = 0;
>>>> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>> -- 
>>>> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-21 20:37       ` Oleg Drokin
@ 2016-07-22  1:57         ` J. Bruce Fields
  2016-07-22  6:35           ` Oleg Drokin
  0 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22  1:57 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, linux-nfs, linux-kernel

On Thu, Jul 21, 2016 at 04:37:40PM -0400, Oleg Drokin wrote:
> 
> On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:
> 
> > On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
> >> 
> >> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
> >> 
> >>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
> >>>> It looks like we are bit overzealous about failing mkdir/create/mknod
> >>>> with permission denied if the parent dir is not writeable.
> >>>> Need to make sure the name does not exist first, because we need to
> >>>> return EEXIST in that case.
> >>>> 
> >>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> >>>> ---
> >>>> A very similar problem exists with symlinks, but the patch is more
> >>>> involved, so assuming this one is ok, I'll send a symlink one separately.
> >>>> fs/nfsd/nfs4proc.c |  6 +++++-
> >>>> fs/nfsd/vfs.c      | 11 ++++++++++-
> >>>> 2 files changed, 15 insertions(+), 2 deletions(-)
> >>>> 
> >>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>> index de1ff1d..0067520 100644
> >>>> --- a/fs/nfsd/nfs4proc.c
> >>>> +++ b/fs/nfsd/nfs4proc.c
> >>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>> 
> >>>> 	fh_init(&resfh, NFS4_FHSIZE);
> >>>> 
> >>>> +	/*
> >>>> +	 * We just check thta parent is accessible here, nfsd_* do their
> >>>> +	 * own access permission checks
> >>>> +	 */
> >>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> >>>> -			   NFSD_MAY_CREATE);
> >>>> +			   NFSD_MAY_EXEC);
> >>>> 	if (status)
> >>>> 		return status;
> >>>> 
> >>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >>>> index 6fbd81e..6a45ec6 100644
> >>>> --- a/fs/nfsd/vfs.c
> >>>> +++ b/fs/nfsd/vfs.c
> >>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>>> 	if (isdotent(fname, flen))
> >>>> 		goto out;
> >>>> 
> >>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> >>>> +	/*
> >>>> +	 * Even though it is a create, first we see if we are even allowed
> >>>> +	 * to peek inside the parent
> >>>> +	 */
> >>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> >>> 
> >>> Looks like in the v3 case we haven't actually locked the directory yet
> >>> at this point so this check is a little race-prone.
> >> 
> >> In reality this check is not really needed, I suspect.
> >> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
> >> anyway so if there was a race and somebody changed dir access in the middle,
> >> there's going to be another check anyway and it would be caught.
> >> Unless there's some weird server-side permission wiggling as well that makes it
> >> ineffective, but I imagine that one cannot really change in a racy way?
> > 
> > Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
> > We still need the fh_verify there since it's also what does the
> > filehandle->dentry translation, but we don't need permission checking
> > here yet.
> 
> This will likely need an extra test to ensure that when you
> do mkdir where you do not have exec permissions, you would get EACCES instead
> of EEXIST, otherwise that would be information leakage, no?
> Or do you think the second time we do nfsd_permission, that would be covered?

No, you're right, for some reason I thought that the check for a
positive inode didn't happen till later.  But actually the logic is
basically:

	lock inode
	lookup_one_len
	return nfserr_exist if looked up dentry is positive.
	check for create permission
	vfs_create

So, yes, the initial MAY_EXEC test's needed to prevent that information
leak.

That said... I wonder why it's done that way?  Seems to me we could just
tremove that nfserr_exist check and the vfs would handle it for us....
I'll try that.

--b.

> 
> > Applying with that one change.  (And I'll followup with some additional
> > minor cleanup of the create code.)
> > 
> > --b.
> > 
> >> 
> >>> I wonder why the code's structured that way--it's confusing.
> >> 
> >> Probably years of accumulated "damage" ;)
> >> 
> >>> --b.
> >>> 
> >>>> 	if (err)
> >>>> 		goto out;
> >>>> 
> >>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>>> 		goto out; 
> >>>> 	}
> >>>> 
> >>>> +	/* Now let's see if we actually have permissions to create */
> >>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> >>>> +	if (err)
> >>>> +		goto out;
> >>>> +
> >>>> 	if (!(iap->ia_valid & ATTR_MODE))
> >>>> 		iap->ia_mode = 0;
> >>>> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> >>>> -- 
> >>>> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-22  1:57         ` J. Bruce Fields
@ 2016-07-22  6:35           ` Oleg Drokin
  2016-07-22 10:55             ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-22  6:35 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 21, 2016, at 9:57 PM, J. Bruce Fields wrote:

> On Thu, Jul 21, 2016 at 04:37:40PM -0400, Oleg Drokin wrote:
>> 
>> On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:
>> 
>>> On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
>>>> 
>>>> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
>>>> 
>>>>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
>>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>>> with permission denied if the parent dir is not writeable.
>>>>>> Need to make sure the name does not exist first, because we need to
>>>>>> return EEXIST in that case.
>>>>>> 
>>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>>> ---
>>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>> fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>> 
>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>> index de1ff1d..0067520 100644
>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>> 
>>>>>> 	fh_init(&resfh, NFS4_FHSIZE);
>>>>>> 
>>>>>> +	/*
>>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>>> +	 * own access permission checks
>>>>>> +	 */
>>>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>>> -			   NFSD_MAY_CREATE);
>>>>>> +			   NFSD_MAY_EXEC);
>>>>>> 	if (status)
>>>>>> 		return status;
>>>>>> 
>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>>> index 6fbd81e..6a45ec6 100644
>>>>>> --- a/fs/nfsd/vfs.c
>>>>>> +++ b/fs/nfsd/vfs.c
>>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>> 	if (isdotent(fname, flen))
>>>>>> 		goto out;
>>>>>> 
>>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>>> +	/*
>>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>>> +	 * to peek inside the parent
>>>>>> +	 */
>>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>> 
>>>>> Looks like in the v3 case we haven't actually locked the directory yet
>>>>> at this point so this check is a little race-prone.
>>>> 
>>>> In reality this check is not really needed, I suspect.
>>>> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
>>>> anyway so if there was a race and somebody changed dir access in the middle,
>>>> there's going to be another check anyway and it would be caught.
>>>> Unless there's some weird server-side permission wiggling as well that makes it
>>>> ineffective, but I imagine that one cannot really change in a racy way?
>>> 
>>> Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
>>> We still need the fh_verify there since it's also what does the
>>> filehandle->dentry translation, but we don't need permission checking
>>> here yet.
>> 
>> This will likely need an extra test to ensure that when you
>> do mkdir where you do not have exec permissions, you would get EACCES instead
>> of EEXIST, otherwise that would be information leakage, no?
>> Or do you think the second time we do nfsd_permission, that would be covered?
> 
> No, you're right, for some reason I thought that the check for a
> positive inode didn't happen till later.  But actually the logic is
> basically:
> 
> 	lock inode
> 	lookup_one_len
> 	return nfserr_exist if looked up dentry is positive.
> 	check for create permission
> 	vfs_create
> 
> So, yes, the initial MAY_EXEC test's needed to prevent that information
> leak.
> 
> That said... I wonder why it's done that way?  Seems to me we could just
> tremove that nfserr_exist check and the vfs would handle it for us....
> I'll try that.

It won't work because the very first thing vfs_create does is may_create(),
and so you get EACCES right there instead of the EEXIST.

> 
> --b.
> 
>> 
>>> Applying with that one change.  (And I'll followup with some additional
>>> minor cleanup of the create code.)
>>> 
>>> --b.
>>> 
>>>> 
>>>>> I wonder why the code's structured that way--it's confusing.
>>>> 
>>>> Probably years of accumulated "damage" ;)
>>>> 
>>>>> --b.
>>>>> 
>>>>>> 	if (err)
>>>>>> 		goto out;
>>>>>> 
>>>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>> 		goto out; 
>>>>>> 	}
>>>>>> 
>>>>>> +	/* Now let's see if we actually have permissions to create */
>>>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>>>> +	if (err)
>>>>>> +		goto out;
>>>>>> +
>>>>>> 	if (!(iap->ia_valid & ATTR_MODE))
>>>>>> 		iap->ia_mode = 0;
>>>>>> 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>>>> -- 
>>>>>> 2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-22  6:35           ` Oleg Drokin
@ 2016-07-22 10:55             ` J. Bruce Fields
  2016-07-22 15:13               ` Oleg Drokin
  0 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 10:55 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, linux-nfs, linux-kernel

On Fri, Jul 22, 2016 at 02:35:26AM -0400, Oleg Drokin wrote:
> 
> On Jul 21, 2016, at 9:57 PM, J. Bruce Fields wrote:
> 
> > On Thu, Jul 21, 2016 at 04:37:40PM -0400, Oleg Drokin wrote:
> >> 
> >> On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:
> >> 
> >>> On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
> >>>> 
> >>>> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
> >>>> 
> >>>>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
> >>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
> >>>>>> with permission denied if the parent dir is not writeable.
> >>>>>> Need to make sure the name does not exist first, because we need to
> >>>>>> return EEXIST in that case.
> >>>>>> 
> >>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> >>>>>> ---
> >>>>>> A very similar problem exists with symlinks, but the patch is more
> >>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
> >>>>>> fs/nfsd/nfs4proc.c |  6 +++++-
> >>>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
> >>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>>> 
> >>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>>>> index de1ff1d..0067520 100644
> >>>>>> --- a/fs/nfsd/nfs4proc.c
> >>>>>> +++ b/fs/nfsd/nfs4proc.c
> >>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>>>> 
> >>>>>> 	fh_init(&resfh, NFS4_FHSIZE);
> >>>>>> 
> >>>>>> +	/*
> >>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
> >>>>>> +	 * own access permission checks
> >>>>>> +	 */
> >>>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> >>>>>> -			   NFSD_MAY_CREATE);
> >>>>>> +			   NFSD_MAY_EXEC);
> >>>>>> 	if (status)
> >>>>>> 		return status;
> >>>>>> 
> >>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >>>>>> index 6fbd81e..6a45ec6 100644
> >>>>>> --- a/fs/nfsd/vfs.c
> >>>>>> +++ b/fs/nfsd/vfs.c
> >>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>>>>> 	if (isdotent(fname, flen))
> >>>>>> 		goto out;
> >>>>>> 
> >>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> >>>>>> +	/*
> >>>>>> +	 * Even though it is a create, first we see if we are even allowed
> >>>>>> +	 * to peek inside the parent
> >>>>>> +	 */
> >>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> >>>>> 
> >>>>> Looks like in the v3 case we haven't actually locked the directory yet
> >>>>> at this point so this check is a little race-prone.
> >>>> 
> >>>> In reality this check is not really needed, I suspect.
> >>>> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
> >>>> anyway so if there was a race and somebody changed dir access in the middle,
> >>>> there's going to be another check anyway and it would be caught.
> >>>> Unless there's some weird server-side permission wiggling as well that makes it
> >>>> ineffective, but I imagine that one cannot really change in a racy way?
> >>> 
> >>> Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
> >>> We still need the fh_verify there since it's also what does the
> >>> filehandle->dentry translation, but we don't need permission checking
> >>> here yet.
> >> 
> >> This will likely need an extra test to ensure that when you
> >> do mkdir where you do not have exec permissions, you would get EACCES instead
> >> of EEXIST, otherwise that would be information leakage, no?
> >> Or do you think the second time we do nfsd_permission, that would be covered?
> > 
> > No, you're right, for some reason I thought that the check for a
> > positive inode didn't happen till later.  But actually the logic is
> > basically:
> > 
> > 	lock inode
> > 	lookup_one_len
> > 	return nfserr_exist if looked up dentry is positive.
> > 	check for create permission
> > 	vfs_create
> > 
> > So, yes, the initial MAY_EXEC test's needed to prevent that information
> > leak.
> > 
> > That said... I wonder why it's done that way?  Seems to me we could just
> > tremove that nfserr_exist check and the vfs would handle it for us....
> > I'll try that.
> 
> It won't work because the very first thing vfs_create does is may_create(),
> and so you get EACCES right there instead of the EEXIST.

static inline int may_create(struct inode *dir, struct dentry *child)
{
        audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
        if (child->d_inode)
                return -EEXIST;
	...

So it looks OK to me.

--b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-22 10:55             ` J. Bruce Fields
@ 2016-07-22 15:13               ` Oleg Drokin
  2016-07-22 17:48                 ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Drokin @ 2016-07-22 15:13 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs, linux-kernel


On Jul 22, 2016, at 6:55 AM, J. Bruce Fields wrote:

> On Fri, Jul 22, 2016 at 02:35:26AM -0400, Oleg Drokin wrote:
>> 
>> On Jul 21, 2016, at 9:57 PM, J. Bruce Fields wrote:
>> 
>>> On Thu, Jul 21, 2016 at 04:37:40PM -0400, Oleg Drokin wrote:
>>>> 
>>>> On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:
>>>> 
>>>>> On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
>>>>>> 
>>>>>> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
>>>>>> 
>>>>>>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
>>>>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>>>>> with permission denied if the parent dir is not writeable.
>>>>>>>> Need to make sure the name does not exist first, because we need to
>>>>>>>> return EEXIST in that case.
>>>>>>>> 
>>>>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>>>>> ---
>>>>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>>>> fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>>>> 
>>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>>> index de1ff1d..0067520 100644
>>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>>> 
>>>>>>>> 	fh_init(&resfh, NFS4_FHSIZE);
>>>>>>>> 
>>>>>>>> +	/*
>>>>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>>>>> +	 * own access permission checks
>>>>>>>> +	 */
>>>>>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>>>>> -			   NFSD_MAY_CREATE);
>>>>>>>> +			   NFSD_MAY_EXEC);
>>>>>>>> 	if (status)
>>>>>>>> 		return status;
>>>>>>>> 
>>>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>>>>> index 6fbd81e..6a45ec6 100644
>>>>>>>> --- a/fs/nfsd/vfs.c
>>>>>>>> +++ b/fs/nfsd/vfs.c
>>>>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>>> 	if (isdotent(fname, flen))
>>>>>>>> 		goto out;
>>>>>>>> 
>>>>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>>>>> +	/*
>>>>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>>>>> +	 * to peek inside the parent
>>>>>>>> +	 */
>>>>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>>> 
>>>>>>> Looks like in the v3 case we haven't actually locked the directory yet
>>>>>>> at this point so this check is a little race-prone.
>>>>>> 
>>>>>> In reality this check is not really needed, I suspect.
>>>>>> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
>>>>>> anyway so if there was a race and somebody changed dir access in the middle,
>>>>>> there's going to be another check anyway and it would be caught.
>>>>>> Unless there's some weird server-side permission wiggling as well that makes it
>>>>>> ineffective, but I imagine that one cannot really change in a racy way?
>>>>> 
>>>>> Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
>>>>> We still need the fh_verify there since it's also what does the
>>>>> filehandle->dentry translation, but we don't need permission checking
>>>>> here yet.
>>>> 
>>>> This will likely need an extra test to ensure that when you
>>>> do mkdir where you do not have exec permissions, you would get EACCES instead
>>>> of EEXIST, otherwise that would be information leakage, no?
>>>> Or do you think the second time we do nfsd_permission, that would be covered?
>>> 
>>> No, you're right, for some reason I thought that the check for a
>>> positive inode didn't happen till later.  But actually the logic is
>>> basically:
>>> 
>>> 	lock inode
>>> 	lookup_one_len
>>> 	return nfserr_exist if looked up dentry is positive.
>>> 	check for create permission
>>> 	vfs_create
>>> 
>>> So, yes, the initial MAY_EXEC test's needed to prevent that information
>>> leak.
>>> 
>>> That said... I wonder why it's done that way?  Seems to me we could just
>>> tremove that nfserr_exist check and the vfs would handle it for us....
>>> I'll try that.
>> 
>> It won't work because the very first thing vfs_create does is may_create(),
>> and so you get EACCES right there instead of the EEXIST.
> 
> static inline int may_create(struct inode *dir, struct dentry *child)
> {
>        audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
>        if (child->d_inode)
>                return -EEXIST;
> 	...
> 
> So it looks OK to me.

Hm, in fact indeed. I was just too worked up about the client side, but on the
server side there was a real lookup already, so it does look workable.

> 
> --b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
  2016-07-22 15:13               ` Oleg Drokin
@ 2016-07-22 17:48                 ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES J. Bruce Fields
                                     ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

On Fri, Jul 22, 2016 at 11:13:20AM -0400, Oleg Drokin wrote:
> Hm, in fact indeed. I was just too worked up about the client side,
> but on the server side there was a real lookup already, so it does
> look workable.

So I end up with the following.  This is all (after your patch) pretty
trivial cleanup, but I think it's overdue for that code.

--b.


J. Bruce Fields (6):
  nfsd: remove redundant zero-length check from create
  nfsd: remove redundant i_lookup check
  nfsd: reorganize nfsd_create
  nfsd: remove unnecessary positive-dentry check
  nfsd: clean up bad-type check in nfsd_create_locked
  nfsd: drop unnecessary MAY_EXEC check from create

Oleg Drokin (1):
  nfsd: Make creates return EEXIST instead of EACCES

 fs/nfsd/nfs4proc.c |   3 +-
 fs/nfsd/nfsproc.c  |   7 +--
 fs/nfsd/vfs.c      | 131 ++++++++++++++++++++++++-----------------------------
 fs/nfsd/vfs.h      |   3 ++
 4 files changed, 66 insertions(+), 78 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES
  2016-07-22 17:48                 ` J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 2/7] nfsd: remove redundant zero-length check from create J. Bruce Fields
                                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: Oleg Drokin <green@linuxhacker.ru>

When doing a create (mkdir/mknod) on a name, it's worth
checking the name exists first before returning EACCES in case
the directory is not writeable by the user.
This makes return values on the client more consistent
regardless of whenever the entry there is cached in the local
cache or not.
Another positive side effect is certain programs only expect
EEXIST in that case even despite POSIX allowing any valid
error to be returned.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c |  6 +++++-
 fs/nfsd/vfs.c      | 11 ++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e0c15f879d89..9d7e1edf0cca 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	fh_init(&resfh, NFS4_FHSIZE);
 
+	/*
+	 * We just check that parent is accessible here, nfsd_* do their
+	 * own access permission checks
+	 */
 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
-			   NFSD_MAY_CREATE);
+			   NFSD_MAY_EXEC);
 	if (status)
 		return status;
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6fbd81ecb410..fda4f86161f8 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	if (isdotent(fname, flen))
 		goto out;
 
-	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
+	/*
+	 * Even though it is a create, first let's see if we are even allowed
+	 * to peek inside the parent
+	 */
+	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
 	if (err)
 		goto out;
 
@@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto out; 
 	}
 
+	/* Now let's see if we actually have permissions to create */
+	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
+	if (err)
+		goto out;
+
 	if (!(iap->ia_valid & ATTR_MODE))
 		iap->ia_mode = 0;
 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/7] nfsd: remove redundant zero-length check from create
  2016-07-22 17:48                 ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 3/7] nfsd: remove redundant i_lookup check J. Bruce Fields
                                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

lookup_one_len already has this check.

The only effect of this patch is to return access instead of perm in the
0-length-filename case.  I actually prefer nfserr_perm (or _inval?), but
I doubt anyone cares.

The isdotent check seems redundant too, but I worry that some client
might actually care about that strange nfserr_exist error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfsproc.c | 3 ---
 fs/nfsd/vfs.c     | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index 409fdde8e95d..c30b12388bf6 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -250,9 +250,6 @@ nfsd_proc_create(struct svc_rqst *rqstp, struct nfsd_createargs *argp,
 
 	/* Check for NFSD_MAY_WRITE in nfsd_create if necessary */
 
-	nfserr = nfserr_acces;
-	if (!argp->len)
-		goto done;
 	nfserr = nfserr_exist;
 	if (isdotent(argp->name, argp->len))
 		goto done;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fda4f86161f8..fba8e7e521e0 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1154,9 +1154,6 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	__be32		err2;
 	int		host_err;
 
-	err = nfserr_perm;
-	if (!flen)
-		goto out;
 	err = nfserr_exist;
 	if (isdotent(fname, flen))
 		goto out;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/7] nfsd: remove redundant i_lookup check
  2016-07-22 17:48                 ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 2/7] nfsd: remove redundant zero-length check from create J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-24  0:22                     ` Al Viro
  2016-07-22 17:48                   ` [PATCH 4/7] nfsd: reorganize nfsd_create J. Bruce Fields
                                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

I'm not sure why this was added.  It doesn't seem necessary, and no
other caller does this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/vfs.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fba8e7e521e0..7ae3b5a72a4d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1169,9 +1169,6 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	dentry = fhp->fh_dentry;
 	dirp = d_inode(dentry);
 
-	err = nfserr_notdir;
-	if (!dirp->i_op->lookup)
-		goto out;
 	/*
 	 * Check whether the response file handle has been verified yet.
 	 * If it has, the parent directory should already be locked.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 4/7] nfsd: reorganize nfsd_create
  2016-07-22 17:48                 ` J. Bruce Fields
                                     ` (2 preceding siblings ...)
  2016-07-22 17:48                   ` [PATCH 3/7] nfsd: remove redundant i_lookup check J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 5/7] nfsd: remove unnecessary positive-dentry check J. Bruce Fields
                                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

There's some odd logic in nfsd_create() that allows it to be called with
the parent directory either locked or unlocked.  The only already-locked
caller is NFSv2's nfsd_proc_create().  It's less confusing to split out
the unlocked case into a separate function which the NFSv2 code can call
directly.

Also fix some comments while we're here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfsproc.c |   4 +-
 fs/nfsd/vfs.c     | 109 ++++++++++++++++++++++++++++--------------------------
 fs/nfsd/vfs.h     |   3 ++
 3 files changed, 61 insertions(+), 55 deletions(-)

diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index c30b12388bf6..8914206c867c 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -358,8 +358,8 @@ nfsd_proc_create(struct svc_rqst *rqstp, struct nfsd_createargs *argp,
 	nfserr = 0;
 	if (!inode) {
 		/* File doesn't exist. Create it and set attrs */
-		nfserr = nfsd_create(rqstp, dirfhp, argp->name, argp->len,
-					attr, type, rdev, newfhp);
+		nfserr = nfsd_create_locked(rqstp, dirfhp, argp->name,
+					argp->len, attr, type, rdev, newfhp);
 	} else if (type == S_IFREG) {
 		dprintk("nfsd:   existing %s, valid=%x, size=%ld\n",
 			argp->name, attr->ia_valid, (long) attr->ia_size);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 7ae3b5a72a4d..9c7830cdaeda 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1135,16 +1135,9 @@ nfsd_check_ignore_resizing(struct iattr *iap)
 		iap->ia_valid &= ~ATTR_SIZE;
 }
 
-/*
- * Create a file (regular, directory, device, fifo); UNIX sockets 
- * not yet implemented.
- * If the response fh has been verified, the parent directory should
- * already be locked. Note that the parent directory is left locked.
- *
- * N.B. Every call to nfsd_create needs an fh_put for _both_ fhp and resfhp
- */
+/* The parent directory should already be locked: */
 __be32
-nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
+nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		char *fname, int flen, struct iattr *iap,
 		int type, dev_t rdev, struct svc_fh *resfhp)
 {
@@ -1154,50 +1147,15 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	__be32		err2;
 	int		host_err;
 
-	err = nfserr_exist;
-	if (isdotent(fname, flen))
-		goto out;
-
-	/*
-	 * Even though it is a create, first let's see if we are even allowed
-	 * to peek inside the parent
-	 */
-	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
-	if (err)
-		goto out;
-
 	dentry = fhp->fh_dentry;
 	dirp = d_inode(dentry);
 
-	/*
-	 * Check whether the response file handle has been verified yet.
-	 * If it has, the parent directory should already be locked.
-	 */
-	if (!resfhp->fh_dentry) {
-		host_err = fh_want_write(fhp);
-		if (host_err)
-			goto out_nfserr;
-
-		/* called from nfsd_proc_mkdir, or possibly nfsd3_proc_create */
-		fh_lock_nested(fhp, I_MUTEX_PARENT);
-		dchild = lookup_one_len(fname, dentry, flen);
-		host_err = PTR_ERR(dchild);
-		if (IS_ERR(dchild))
-			goto out_nfserr;
-		err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
-		if (err)
-			goto out;
-	} else {
-		/* called from nfsd_proc_create */
-		dchild = dget(resfhp->fh_dentry);
-		if (!fhp->fh_locked) {
-			/* not actually possible */
-			printk(KERN_ERR
-				"nfsd_create: parent %pd2 not locked!\n",
+	dchild = dget(resfhp->fh_dentry);
+	if (!fhp->fh_locked) {
+		WARN_ONCE(1, "nfsd_create: parent %pd2 not locked!\n",
 				dentry);
-			err = nfserr_io;
-			goto out;
-		}
+		err = nfserr_io;
+		goto out;
 	}
 	/*
 	 * Make sure the child dentry is still negative ...
@@ -1225,9 +1183,6 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto out;
 	}
 
-	/*
-	 * Get the dir op function pointer.
-	 */
 	err = 0;
 	host_err = 0;
 	switch (type) {
@@ -1254,7 +1209,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	/*
 	 * nfsd_create_setattr already committed the child.  Transactional
 	 * filesystems had a chance to commit changes for both parent and
-	 * child * simultaneously making the following commit_metadata a
+	 * child simultaneously making the following commit_metadata a
 	 * noop.
 	 */
 	err2 = nfserrno(commit_metadata(fhp));
@@ -1275,6 +1230,54 @@ out_nfserr:
 	goto out;
 }
 
+/*
+ * Create a filesystem object (regular, directory, special).
+ * Note that the parent directory is left locked.
+ *
+ * N.B. Every call to nfsd_create needs an fh_put for _both_ fhp and resfhp
+ */
+__be32
+nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		char *fname, int flen, struct iattr *iap,
+		int type, dev_t rdev, struct svc_fh *resfhp)
+{
+	struct dentry	*dentry, *dchild = NULL;
+	struct inode	*dirp;
+	__be32		err;
+	int		host_err;
+
+	if (isdotent(fname, flen))
+		return nfserr_exist;
+
+	/*
+	 * Even though it is a create, first let's see if we are even allowed
+	 * to peek inside the parent
+	 */
+	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
+	if (err)
+		return err;
+
+	dentry = fhp->fh_dentry;
+	dirp = d_inode(dentry);
+
+	host_err = fh_want_write(fhp);
+	if (host_err)
+		return nfserrno(host_err);
+
+	fh_lock_nested(fhp, I_MUTEX_PARENT);
+	dchild = lookup_one_len(fname, dentry, flen);
+	host_err = PTR_ERR(dchild);
+	if (IS_ERR(dchild))
+		return nfserrno(host_err);
+	err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
+	if (err) {
+		dput(dchild);
+		return err;
+	}
+	return nfsd_create_locked(rqstp, fhp, fname, flen, iap, type,
+					rdev, resfhp);
+}
+
 #ifdef CONFIG_NFSD_V3
 
 /*
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 2d573ec057f8..3cbb1b33777b 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -59,6 +59,9 @@ __be32		nfsd4_vfs_fallocate(struct svc_rqst *, struct svc_fh *,
 __be32		nfsd4_clone_file_range(struct file *, u64, struct file *,
 			u64, u64);
 #endif /* CONFIG_NFSD_V4 */
+__be32		nfsd_create_locked(struct svc_rqst *, struct svc_fh *,
+				char *name, int len, struct iattr *attrs,
+				int type, dev_t rdev, struct svc_fh *res);
 __be32		nfsd_create(struct svc_rqst *, struct svc_fh *,
 				char *name, int len, struct iattr *attrs,
 				int type, dev_t rdev, struct svc_fh *res);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 5/7] nfsd: remove unnecessary positive-dentry check
  2016-07-22 17:48                 ` J. Bruce Fields
                                     ` (3 preceding siblings ...)
  2016-07-22 17:48                   ` [PATCH 4/7] nfsd: reorganize nfsd_create J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 6/7] nfsd: clean up bad-type check in nfsd_create_locked J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 7/7] nfsd: drop unnecessary MAY_EXEC check from create J. Bruce Fields
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

vfs_{create,mkdir,mknod} each begin with a call to may_create(), which
returns EEXIST if the object already exists.

This check is therefore unnecessary.

(In the NFSv2 case, nfsd_proc_create also has such a check.  Contrary to
RFC 1094, our code seems to believe that a CREATE of an existing file
should succeed.  I'm leaving that behavior alone.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/vfs.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 9c7830cdaeda..d45b39b408a1 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1157,17 +1157,7 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		err = nfserr_io;
 		goto out;
 	}
-	/*
-	 * Make sure the child dentry is still negative ...
-	 */
-	err = nfserr_exist;
-	if (d_really_is_positive(dchild)) {
-		dprintk("nfsd_create: dentry %pd/%pd not negative!\n",
-			dentry, dchild);
-		goto out; 
-	}
 
-	/* Now let's see if we actually have permissions to create */
 	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
 	if (err)
 		goto out;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 6/7] nfsd: clean up bad-type check in nfsd_create_locked
  2016-07-22 17:48                 ` J. Bruce Fields
                                     ` (4 preceding siblings ...)
  2016-07-22 17:48                   ` [PATCH 5/7] nfsd: remove unnecessary positive-dentry check J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  2016-07-22 17:48                   ` [PATCH 7/7] nfsd: drop unnecessary MAY_EXEC check from create J. Bruce Fields
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Minor cleanup, no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/vfs.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index d45b39b408a1..cd06c6511cfc 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1166,13 +1166,6 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		iap->ia_mode = 0;
 	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
 
-	err = nfserr_inval;
-	if (!S_ISREG(type) && !S_ISDIR(type) && !special_file(type)) {
-		printk(KERN_WARNING "nfsd: bad file type %o in nfsd_create\n",
-		       type);
-		goto out;
-	}
-
 	err = 0;
 	host_err = 0;
 	switch (type) {
@@ -1190,6 +1183,10 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	case S_IFSOCK:
 		host_err = vfs_mknod(dirp, dchild, iap->ia_mode, rdev);
 		break;
+	default:
+		printk(KERN_WARNING "nfsd: bad file type %o in nfsd_create\n",
+		       type);
+		host_err = -EINVAL;
 	}
 	if (host_err < 0)
 		goto out_nfserr;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 7/7] nfsd: drop unnecessary MAY_EXEC check from create
  2016-07-22 17:48                 ` J. Bruce Fields
                                     ` (5 preceding siblings ...)
  2016-07-22 17:48                   ` [PATCH 6/7] nfsd: clean up bad-type check in nfsd_create_locked J. Bruce Fields
@ 2016-07-22 17:48                   ` J. Bruce Fields
  6 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-22 17:48 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We need an fh_verify to make sure we at least have a dentry, but actual
permission checks happen later.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4proc.c | 7 +------
 fs/nfsd/vfs.c      | 6 +-----
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9d7e1edf0cca..1fb222752b2b 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -605,12 +605,7 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	fh_init(&resfh, NFS4_FHSIZE);
 
-	/*
-	 * We just check that parent is accessible here, nfsd_* do their
-	 * own access permission checks
-	 */
-	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
-			   NFSD_MAY_EXEC);
+	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR, NFSD_MAY_NOP);
 	if (status)
 		return status;
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index cd06c6511cfc..c844fd601381 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1236,11 +1236,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	if (isdotent(fname, flen))
 		return nfserr_exist;
 
-	/*
-	 * Even though it is a create, first let's see if we are even allowed
-	 * to peek inside the parent
-	 */
-	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
+	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_NOP);
 	if (err)
 		return err;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/7] nfsd: remove redundant i_lookup check
  2016-07-22 17:48                   ` [PATCH 3/7] nfsd: remove redundant i_lookup check J. Bruce Fields
@ 2016-07-24  0:22                     ` Al Viro
  2016-07-24 12:10                       ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2016-07-24  0:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Oleg Drokin, Jeff Layton, linux-nfs, linux-kernel, J. Bruce Fields

On Fri, Jul 22, 2016 at 01:48:52PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> I'm not sure why this was added.  It doesn't seem necessary, and no
> other caller does this.

lookup_one_len() will explode if you call it for non-directory (==
!d_can_lookup(), i.e. something without ->lookup()).  So unless the callers
do guarantee that check being true, it *is* needed.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/7] nfsd: remove redundant i_lookup check
  2016-07-24  0:22                     ` Al Viro
@ 2016-07-24 12:10                       ` J. Bruce Fields
  2016-07-24 14:23                         ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-24 12:10 UTC (permalink / raw)
  To: Al Viro
  Cc: J. Bruce Fields, Oleg Drokin, Jeff Layton, linux-nfs, linux-kernel

On Sun, Jul 24, 2016 at 01:22:06AM +0100, Al Viro wrote:
> On Fri, Jul 22, 2016 at 01:48:52PM -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > I'm not sure why this was added.  It doesn't seem necessary, and no
> > other caller does this.
> 
> lookup_one_len() will explode if you call it for non-directory (==
> !d_can_lookup(), i.e. something without ->lookup()).  So unless the callers
> do guarantee that check being true, it *is* needed.

Both callers call fh_verify(.,.,S_IFDIR,.), so at this point we know
that i_mode & S_IFMT == S_IFDIR.  Is there some odd case where that's
insufficient?  If so, I think there may be bugs elsewhere in nfsd.  If
not, I'll add a note to the changelog.

Thanks for reminding me to check this, I hadn't thought of that as an
"is this a directory" check, it makes more sense now.

--b.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/7] nfsd: remove redundant i_lookup check
  2016-07-24 12:10                       ` J. Bruce Fields
@ 2016-07-24 14:23                         ` Al Viro
  2016-07-24 20:21                           ` J. Bruce Fields
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2016-07-24 14:23 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Oleg Drokin, Jeff Layton, linux-nfs, linux-kernel

On Sun, Jul 24, 2016 at 08:10:14AM -0400, J. Bruce Fields wrote:
> On Sun, Jul 24, 2016 at 01:22:06AM +0100, Al Viro wrote:
> > On Fri, Jul 22, 2016 at 01:48:52PM -0400, J. Bruce Fields wrote:
> > > From: "J. Bruce Fields" <bfields@redhat.com>
> > > 
> > > I'm not sure why this was added.  It doesn't seem necessary, and no
> > > other caller does this.
> > 
> > lookup_one_len() will explode if you call it for non-directory (==
> > !d_can_lookup(), i.e. something without ->lookup()).  So unless the callers
> > do guarantee that check being true, it *is* needed.
> 
> Both callers call fh_verify(.,.,S_IFDIR,.), so at this point we know
> that i_mode & S_IFMT == S_IFDIR.  Is there some odd case where that's
> insufficient?  If so, I think there may be bugs elsewhere in nfsd.  If
> not, I'll add a note to the changelog.

First of all, such objects do exist; they probably won't be encountered by
nfsd and all instances I can think of are not writable, but...

> Thanks for reminding me to check this, I hadn't thought of that as an
> "is this a directory" check, it makes more sense now.

I'd have turned that into d_can_lookup(fhp->fh_dentry), actually.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/7] nfsd: remove redundant i_lookup check
  2016-07-24 14:23                         ` Al Viro
@ 2016-07-24 20:21                           ` J. Bruce Fields
  0 siblings, 0 replies; 37+ messages in thread
From: J. Bruce Fields @ 2016-07-24 20:21 UTC (permalink / raw)
  To: Al Viro
  Cc: J. Bruce Fields, Oleg Drokin, Jeff Layton, linux-nfs, linux-kernel

On Sun, Jul 24, 2016 at 03:23:07PM +0100, Al Viro wrote:
> On Sun, Jul 24, 2016 at 08:10:14AM -0400, J. Bruce Fields wrote:
> > On Sun, Jul 24, 2016 at 01:22:06AM +0100, Al Viro wrote:
> > > On Fri, Jul 22, 2016 at 01:48:52PM -0400, J. Bruce Fields wrote:
> > > > From: "J. Bruce Fields" <bfields@redhat.com>
> > > > 
> > > > I'm not sure why this was added.  It doesn't seem necessary, and no
> > > > other caller does this.
> > > 
> > > lookup_one_len() will explode if you call it for non-directory (==
> > > !d_can_lookup(), i.e. something without ->lookup()).  So unless the callers
> > > do guarantee that check being true, it *is* needed.
> > 
> > Both callers call fh_verify(.,.,S_IFDIR,.), so at this point we know
> > that i_mode & S_IFMT == S_IFDIR.  Is there some odd case where that's
> > insufficient?  If so, I think there may be bugs elsewhere in nfsd.  If
> > not, I'll add a note to the changelog.
> 
> First of all, such objects do exist; they probably won't be encountered by
> nfsd and all instances I can think of are not writable, but...
> 
> > Thanks for reminding me to check this, I hadn't thought of that as an
> > "is this a directory" check, it makes more sense now.
> 
> I'd have turned that into d_can_lookup(fhp->fh_dentry), actually.

So would such a check mainly just protect developers from themselves if
they try to make a weird filesystems exportable?

If we need to catch this I'd rather do it in fh_verify, which would
cover some other operations, too.  Maybe like the below.  We could be
nicer and WARN()/error out instead of BUG.  But it's unclear to me
whether this case is worth checking for at all.

--b.

diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
index 27250e279c37..372747a00214 100644
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -59,14 +59,17 @@ static int nfsd_acceptable(void *expv, struct dentry *dentry)
  * the write call).
  */
 static inline __be32
-nfsd_mode_check(struct svc_rqst *rqstp, umode_t mode, umode_t requested)
+nfsd_mode_check(struct svc_rqst *rqstp, struct dentry *dentry,
+		umode_t requested)
 {
-	mode &= S_IFMT;
+	umode_t mode = d_inode(dentry)->i_mode & S_IFMT;
 
 	if (requested == 0) /* the caller doesn't care */
 		return nfs_ok;
-	if (mode == requested)
+	if (mode == requested) {
+		BUG_ON(mode == S_IFDIR && !d_can_lookup(dentry));
 		return nfs_ok;
+	}
 	/*
 	 * v4 has an error more specific than err_notdir which we should
 	 * return in preference to err_notdir:
@@ -340,7 +343,7 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type, int access)
 	if (error)
 		goto out;
 
-	error = nfsd_mode_check(rqstp, d_inode(dentry)->i_mode, type);
+	error = nfsd_mode_check(rqstp, dentry, type);
 	if (error)
 		goto out;
 

^ permalink raw reply related	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-07-24 20:21 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-08  1:47 [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM Oleg Drokin
2016-07-08 11:02 ` Jeff Layton
2016-07-08 15:14   ` Oleg Drokin
2016-07-08 15:53     ` Jeff Layton
2016-07-08 15:59       ` Oleg Drokin
2016-07-08 16:17         ` Jeff Layton
2016-07-08 16:28           ` Oleg Drokin
2016-07-09  2:52         ` Al Viro
2016-07-09  2:58           ` Oleg Drokin
2016-07-09  3:13             ` Al Viro
2016-07-08 16:04       ` J. Bruce Fields
2016-07-08 16:16         ` Oleg Drokin
2016-07-08 20:49           ` J. Bruce Fields
2016-07-08 21:47             ` Oleg Drokin
2016-07-09  3:10               ` Al Viro
2016-07-09  3:41                 ` Oleg Drokin
2016-07-13 19:00                   ` J. Bruce Fields
2016-07-08 20:54 ` J. Bruce Fields
2016-07-08 21:53   ` Oleg Drokin
2016-07-21 20:34     ` J. Bruce Fields
2016-07-21 20:37       ` Oleg Drokin
2016-07-22  1:57         ` J. Bruce Fields
2016-07-22  6:35           ` Oleg Drokin
2016-07-22 10:55             ` J. Bruce Fields
2016-07-22 15:13               ` Oleg Drokin
2016-07-22 17:48                 ` J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 2/7] nfsd: remove redundant zero-length check from create J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 3/7] nfsd: remove redundant i_lookup check J. Bruce Fields
2016-07-24  0:22                     ` Al Viro
2016-07-24 12:10                       ` J. Bruce Fields
2016-07-24 14:23                         ` Al Viro
2016-07-24 20:21                           ` J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 4/7] nfsd: reorganize nfsd_create J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 5/7] nfsd: remove unnecessary positive-dentry check J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 6/7] nfsd: clean up bad-type check in nfsd_create_locked J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 7/7] nfsd: drop unnecessary MAY_EXEC check from create J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).