* split setattr operations take 2 @ 2017-02-20 6:21 Christoph Hellwig 2017-02-20 6:21 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig 0 siblings, 1 reply; 23+ messages in thread From: Christoph Hellwig @ 2017-02-20 6:21 UTC (permalink / raw) To: bfields, jlayton; +Cc: linux-nfs This just splits the setattr operations and doesn't try to use more of the VFS infrastructure. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] nfsd: special case truncates some more 2017-02-20 6:21 split setattr operations take 2 Christoph Hellwig @ 2017-02-20 6:21 ` Christoph Hellwig 2017-02-20 22:23 ` J. Bruce Fields 2017-02-21 15:07 ` Chuck Lever 0 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2017-02-20 6:21 UTC (permalink / raw)
To: bfields, jlayton; +Cc: linux-nfs, stable

Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributes to set various file attributes including the file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates do not update random other attributes, and many other file systems handle the case but do not update the different attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an unusual NFS client setting the size and group at the same time.

To handle this issue properly this splits the notify_change call in nfsd_setattr into two separate ones.

Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org --- fs/nfsd/vfs.c | 59 +++++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 37 insertions(+), 22 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 26c6fdb4bf67..3c36ed5a1f07 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -377,7 +377,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, __be32 err; int host_err; bool get_write_count; - int size_change = 0; + bool size_change = (iap->ia_valid & ATTR_SIZE); if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; @@ -390,11 +390,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, /* Get inode */ err = fh_verify(rqstp, fhp, ftype, accmode); if (err) - goto out; + return err; if (get_write_count) { host_err = fh_want_write(fhp); if (host_err) - return nfserrno(host_err); + goto out; } dentry = fhp->fh_dentry; @@ -405,20 +405,28 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, iap->ia_valid &= ~ATTR_MODE; if (!iap->ia_valid) - goto out; + return 0; nfsd_sanitize_attrs(inode, iap); + if (check_guard && guardtime != inode->i_ctime.tv_sec) + return nfserr_notsync; + /* * The size case is special, it changes the file in addition to the - * attributes. + * attributes, and file systems don't expect it to be mixed with + * "random" attribute changes. We thus split out the size change + * into a separate call to ->setattr, and do the rest as a separate + * setattr call. 
*/ - if (iap->ia_valid & ATTR_SIZE) { + if (size_change) { err = nfsd_get_write_access(rqstp, fhp, iap); if (err) - goto out; - size_change = 1; + return err; + } + fh_lock(fhp); + if (size_change) { /* * RFC5661, Section 18.30.4: * Changing the size of a file with SETATTR indirectly @@ -426,29 +434,36 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, * * (and similar for the older RFCs) */ - if (iap->ia_size != i_size_read(inode)) - iap->ia_valid |= ATTR_MTIME; - } + struct iattr size_attr = { + .ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME, + .ia_size = iap->ia_size, + }; - iap->ia_valid |= ATTR_CTIME; + host_err = notify_change(dentry, &size_attr, NULL); + if (host_err) + goto out_unlock; + iap->ia_valid &= ~ATTR_SIZE; - if (check_guard && guardtime != inode->i_ctime.tv_sec) { - err = nfserr_notsync; - goto out_put_write_access; + /* + * Avoid the additional setattr call below if the only other + * attribute that the client sends is the mtime, as we update + * it as part of the size change above. + */ + if ((iap->ia_valid & ~ATTR_MTIME) == 0) + goto out_unlock; } - fh_lock(fhp); + iap->ia_valid |= ATTR_CTIME; host_err = notify_change(dentry, iap, NULL); - fh_unlock(fhp); - err = nfserrno(host_err); -out_put_write_access: +out_unlock: + fh_unlock(fhp); if (size_change) put_write_access(inode); - if (!err) - err = nfserrno(commit_metadata(fhp)); out: - return err; + if (!host_err) + host_err = commit_metadata(fhp); + return nfserrno(host_err); } #if defined(CONFIG_NFSD_V4) -- 2.11.0 ^ permalink raw reply related [flat|nested] 23+ messages in thread
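The splitting logic in the patch above can be modelled in user space. What follows is only a minimal sketch under stated assumptions: the `SK_ATTR_*` bit values, `struct sk_plan` and `sk_split_setattr()` are hypothetical names made up for illustration and are not kernel interfaces. The sketch only computes which attribute masks the two notify_change calls would receive; it does no locking, permission checking or error handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's ATTR_* constants differ. */
#define SK_ATTR_UID   (1u << 0)
#define SK_ATTR_GID   (1u << 1)
#define SK_ATTR_SIZE  (1u << 2)
#define SK_ATTR_MTIME (1u << 3)
#define SK_ATTR_CTIME (1u << 4)

/* Hypothetical description of the setattr calls one request turns into. */
struct sk_plan {
	bool size_call;           /* issue the size-only call first? */
	unsigned int size_valid;  /* mask passed to that call */
	bool rest_call;           /* issue a second call for the rest? */
	unsigned int rest_valid;  /* mask passed to the second call */
};

/* Model of how the patch divides one attribute mask into two calls. */
struct sk_plan sk_split_setattr(unsigned int valid)
{
	struct sk_plan plan = { false, 0, false, 0 };

	if (!valid)
		return plan;	/* nothing to set at all */

	if (valid & SK_ATTR_SIZE) {
		/* The size change goes first, with ctime and mtime only. */
		plan.size_call = true;
		plan.size_valid = SK_ATTR_SIZE | SK_ATTR_CTIME | SK_ATTR_MTIME;
		valid &= ~SK_ATTR_SIZE;

		/* Only mtime (or nothing) left: the first call covered it. */
		if ((valid & ~SK_ATTR_MTIME) == 0)
			return plan;
	}

	/* Everything else goes into a second call that also bumps ctime. */
	plan.rest_call = true;
	plan.rest_valid = valid | SK_ATTR_CTIME;
	assert((plan.rest_valid & SK_ATTR_SIZE) == 0);
	return plan;
}
```

With this model, a request carrying `SK_ATTR_SIZE | SK_ATTR_GID` yields a size call plus a second call for the group, while `SK_ATTR_SIZE | SK_ATTR_MTIME` yields the size call alone.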
* Re: [PATCH] nfsd: special case truncates some more 2017-02-20 6:21 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig @ 2017-02-20 22:23 ` J. Bruce Fields 2017-02-21 15:07 ` Chuck Lever 1 sibling, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2017-02-20 22:23 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: bfields, jlayton, linux-nfs, stable

Thanks! I split out the cleanup into a separate patch just to make sure I understood what the important change was.... Looks good to me.

--b.

On Mon, Feb 20, 2017 at 07:21:33AM +0100, Christoph Hellwig wrote:
> Both the NFS protocols and the Linux VFS use a setattr operation with a
> bitmap of attributes to set various file attributes including the
> file size and the uid/gid.
>
> The Linux syscalls never mix size updates with unrelated updates like
> the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> that truncates do not update random other attributes, and many other
> file systems handle the case but do not update the different attributes
> in the same transaction. NFSD on the other hand passes the attributes
> it gets on the wire more or less directly through to the VFS, leading to
> updates the file systems don't expect. XFS at least has an assert on
> the allowed attributes, which caught an unusual NFS client setting the
> size and group at the same time.
>
> To handle this issue properly this splits the notify_change call in
> nfsd_setattr into two separate ones.
> > Signed-off-by: Christoph Hellwig <hch@lst.de> > Cc: stable@kernel.org > --- > fs/nfsd/vfs.c | 59 +++++++++++++++++++++++++++++++++++++---------------------- > 1 file changed, 37 insertions(+), 22 deletions(-) > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 26c6fdb4bf67..3c36ed5a1f07 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -377,7 +377,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > __be32 err; > int host_err; > bool get_write_count; > - int size_change = 0; > + bool size_change = (iap->ia_valid & ATTR_SIZE); > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > @@ -390,11 +390,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > /* Get inode */ > err = fh_verify(rqstp, fhp, ftype, accmode); > if (err) > - goto out; > + return err; > if (get_write_count) { > host_err = fh_want_write(fhp); > if (host_err) > - return nfserrno(host_err); > + goto out; > } > > dentry = fhp->fh_dentry; > @@ -405,20 +405,28 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > iap->ia_valid &= ~ATTR_MODE; > > if (!iap->ia_valid) > - goto out; > + return 0; > > nfsd_sanitize_attrs(inode, iap); > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > + return nfserr_notsync; > + > /* > * The size case is special, it changes the file in addition to the > - * attributes. > + * attributes, and file systems don't expect it to be mixed with > + * "random" attribute changes. We thus split out the size change > + * into a separate call to ->setattr, and do the rest as a separate > + * setattr call. 
> */ > - if (iap->ia_valid & ATTR_SIZE) { > + if (size_change) { > err = nfsd_get_write_access(rqstp, fhp, iap); > if (err) > - goto out; > - size_change = 1; > + return err; > + } > > + fh_lock(fhp); > + if (size_change) { > /* > * RFC5661, Section 18.30.4: > * Changing the size of a file with SETATTR indirectly > @@ -426,29 +434,36 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > * > * (and similar for the older RFCs) > */ > - if (iap->ia_size != i_size_read(inode)) > - iap->ia_valid |= ATTR_MTIME; > - } > + struct iattr size_attr = { > + .ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME, > + .ia_size = iap->ia_size, > + }; > > - iap->ia_valid |= ATTR_CTIME; > + host_err = notify_change(dentry, &size_attr, NULL); > + if (host_err) > + goto out_unlock; > + iap->ia_valid &= ~ATTR_SIZE; > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > - err = nfserr_notsync; > - goto out_put_write_access; > + /* > + * Avoid the additional setattr call below if the only other > + * attribute that the client sends is the mtime, as we update > + * it as part of the size change above. > + */ > + if ((iap->ia_valid & ~ATTR_MTIME) == 0) > + goto out_unlock; > } > > - fh_lock(fhp); > + iap->ia_valid |= ATTR_CTIME; > host_err = notify_change(dentry, iap, NULL); > - fh_unlock(fhp); > - err = nfserrno(host_err); > > -out_put_write_access: > +out_unlock: > + fh_unlock(fhp); > if (size_change) > put_write_access(inode); > - if (!err) > - err = nfserrno(commit_metadata(fhp)); > out: > - return err; > + if (!host_err) > + host_err = commit_metadata(fhp); > + return nfserrno(host_err); > } > > #if defined(CONFIG_NFSD_V4) > -- > 2.11.0 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-02-20 6:21 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig 2017-02-20 22:23 ` J. Bruce Fields @ 2017-02-21 15:07 ` Chuck Lever 2017-02-21 15:14 ` J. Bruce Fields 1 sibling, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2017-02-21 15:07 UTC (permalink / raw)
To: Christoph Hellwig Cc: J. Bruce Fields, Jeff Layton, Linux NFS Mailing List, stable

> On Feb 20, 2017, at 1:21 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> Both the NFS protocols and the Linux VFS use a setattr operation with a
> bitmap of attributes to set various file attributes including the
> file size and the uid/gid.
>
> The Linux syscalls never mix size updates with unrelated updates like
> the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> that truncates do not update random other attributes, and many other
> file systems handle the case but do not update the different attributes
> in the same transaction. NFSD on the other hand passes the attributes
> it gets on the wire more or less directly through to the VFS, leading to
> updates the file systems don't expect. XFS at least has an assert on
> the allowed attributes, which caught an unusual NFS client setting the
> size and group at the same time.
>
> To handle this issue properly this splits the notify_change call in
> nfsd_setattr into two separate ones.
> > Signed-off-by: Christoph Hellwig <hch@lst.de> > Cc: stable@kernel.org Tested-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/vfs.c | 59 +++++++++++++++++++++++++++++++++++++---------------------- > 1 file changed, 37 insertions(+), 22 deletions(-) > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 26c6fdb4bf67..3c36ed5a1f07 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -377,7 +377,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > __be32 err; > int host_err; > bool get_write_count; > - int size_change = 0; > + bool size_change = (iap->ia_valid & ATTR_SIZE); > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > @@ -390,11 +390,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > /* Get inode */ > err = fh_verify(rqstp, fhp, ftype, accmode); > if (err) > - goto out; > + return err; > if (get_write_count) { > host_err = fh_want_write(fhp); > if (host_err) > - return nfserrno(host_err); > + goto out; > } > > dentry = fhp->fh_dentry; > @@ -405,20 +405,28 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > iap->ia_valid &= ~ATTR_MODE; > > if (!iap->ia_valid) > - goto out; > + return 0; > > nfsd_sanitize_attrs(inode, iap); > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > + return nfserr_notsync; > + > /* > * The size case is special, it changes the file in addition to the > - * attributes. > + * attributes, and file systems don't expect it to be mixed with > + * "random" attribute changes. We thus split out the size change > + * into a separate call to ->setattr, and do the rest as a separate > + * setattr call. 
> */ > - if (iap->ia_valid & ATTR_SIZE) { > + if (size_change) { > err = nfsd_get_write_access(rqstp, fhp, iap); > if (err) > - goto out; > - size_change = 1; > + return err; > + } > > + fh_lock(fhp); > + if (size_change) { > /* > * RFC5661, Section 18.30.4: > * Changing the size of a file with SETATTR indirectly > @@ -426,29 +434,36 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > * > * (and similar for the older RFCs) > */ > - if (iap->ia_size != i_size_read(inode)) > - iap->ia_valid |= ATTR_MTIME; > - } > + struct iattr size_attr = { > + .ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME, > + .ia_size = iap->ia_size, > + }; > > - iap->ia_valid |= ATTR_CTIME; > + host_err = notify_change(dentry, &size_attr, NULL); > + if (host_err) > + goto out_unlock; > + iap->ia_valid &= ~ATTR_SIZE; > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > - err = nfserr_notsync; > - goto out_put_write_access; > + /* > + * Avoid the additional setattr call below if the only other > + * attribute that the client sends is the mtime, as we update > + * it as part of the size change above. > + */ > + if ((iap->ia_valid & ~ATTR_MTIME) == 0) > + goto out_unlock; > } > > - fh_lock(fhp); > + iap->ia_valid |= ATTR_CTIME; > host_err = notify_change(dentry, iap, NULL); > - fh_unlock(fhp); > - err = nfserrno(host_err); > > -out_put_write_access: > +out_unlock: > + fh_unlock(fhp); > if (size_change) > put_write_access(inode); > - if (!err) > - err = nfserrno(commit_metadata(fhp)); > out: > - return err; > + if (!host_err) > + host_err = commit_metadata(fhp); > + return nfserrno(host_err); > } > > #if defined(CONFIG_NFSD_V4) > -- > 2.11.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-02-21 15:07 ` Chuck Lever @ 2017-02-21 15:14 ` J. Bruce Fields 0 siblings, 0 replies; 23+ messages in thread
From: J. Bruce Fields @ 2017-02-21 15:14 UTC (permalink / raw)
To: Chuck Lever Cc: Christoph Hellwig, Jeff Layton, Linux NFS Mailing List, stable

On Tue, Feb 21, 2017 at 10:07:51AM -0500, Chuck Lever wrote:
>
> > On Feb 20, 2017, at 1:21 AM, Christoph Hellwig <hch@lst.de> wrote:
> >
> > Both the NFS protocols and the Linux VFS use a setattr operation with a
> > bitmap of attributes to set various file attributes including the
> > file size and the uid/gid.
> >
> > The Linux syscalls never mix size updates with unrelated updates like
> > the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> > that truncates do not update random other attributes, and many other
> > file systems handle the case but do not update the different attributes
> > in the same transaction. NFSD on the other hand passes the attributes
> > it gets on the wire more or less directly through to the VFS, leading to
> > updates the file systems don't expect. XFS at least has an assert on
> > the allowed attributes, which caught an unusual NFS client setting the
> > size and group at the same time.
> >
> > To handle this issue properly this splits the notify_change call in
> > nfsd_setattr into two separate ones.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Cc: stable@kernel.org
>
> Tested-by: Chuck Lever <chuck.lever@oracle.com>

Thanks.

--b.

^ permalink raw reply [flat|nested] 23+ messages in thread
* setattr ATTR_SIZE vs the rest @ 2017-01-22 16:54 Christoph Hellwig 2017-01-22 16:54 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig 0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2017-01-22 16:54 UTC (permalink / raw)
To: bfields; +Cc: linux-nfs, linux-fsdevel

Hi Bruce,

I've got a report that there are NFS clients that send SETATTR requests that mix size changes with uid/gid changes (see the recent pynfs patch for an artificial reproducer). At least XFS and GFS2 are very unhappy with this, and other file systems also don't seem to handle the case correctly.

This patch splits the truncate processing in NFS out into a separate ->setattr call and uses the vfs_truncate helper for it, which also happens to shrink the NFSD code size by reusing more VFS boilerplate code.

I suspect in the mid-term we really should add a ->truncate method (different from the previous callback of the same name) to separate the two concepts clearly at the VFS level.

^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] nfsd: special case truncates some more 2017-01-22 16:54 setattr ATTR_SIZE vs the rest Christoph Hellwig @ 2017-01-22 16:54 ` Christoph Hellwig 2017-01-23 12:21 ` Jeff Layton 0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2017-01-22 16:54 UTC (permalink / raw)
To: bfields; +Cc: linux-nfs, linux-fsdevel

Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributes to set various file attributes including the file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates do not update random other attributes, and many other file systems handle the case but do not update the different attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an NFS client setting the size and group ID at the same time.

To handle this issue properly this switches nfsd to call vfs_truncate for size changes, and then handles all other attributes through notify_change. As a side effect this also means less boilerplate code around the size change as we can now reuse the VFS code.

Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/nfsd/vfs.c | 77 +++++++++++++++++++---------------------------------------- 1 file changed, 24 insertions(+), 53 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 26c6fdb..fafff37 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) } } -static __be32 -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, - struct iattr *iap) -{ - struct inode *inode = d_inode(fhp->fh_dentry); - int host_err; - - if (iap->ia_size < inode->i_size) { - __be32 err; - - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); - if (err) - return err; - } - - host_err = get_write_access(inode); - if (host_err) - goto out_nfserrno; - - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); - if (host_err) - goto out_put_write_access; - return 0; - -out_put_write_access: - put_write_access(inode); -out_nfserrno: - return nfserrno(host_err); -} - /* * Set various file attributes. After this call fhp needs an fh_put. 
*/ @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, __be32 err; int host_err; bool get_write_count; - int size_change = 0; if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, /* Get inode */ err = fh_verify(rqstp, fhp, ftype, accmode); if (err) - goto out; + return err; if (get_write_count) { host_err = fh_want_write(fhp); if (host_err) - return nfserrno(host_err); + goto out_host_err; } dentry = fhp->fh_dentry; @@ -405,19 +373,25 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, iap->ia_valid &= ~ATTR_MODE; if (!iap->ia_valid) - goto out; + return 0; nfsd_sanitize_attrs(inode, iap); + if (check_guard && guardtime != inode->i_ctime.tv_sec) + return nfserr_notsync; + /* * The size case is special, it changes the file in addition to the - * attributes. + * attributes, and file systems don't expect it to be mixed with + * "random" attribute changes. We thus split out the size change + * into a separate calo for vfs_truncate, and do the rest as a + * a separate setattr call. 
*/ if (iap->ia_valid & ATTR_SIZE) { - err = nfsd_get_write_access(rqstp, fhp, iap); - if (err) - goto out; - size_change = 1; + struct path path = { + .mnt = fhp->fh_export->ex_path.mnt, + .dentry = dentry, + }; /* * RFC5661, Section 18.30.4: @@ -428,27 +402,24 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, */ if (iap->ia_size != i_size_read(inode)) iap->ia_valid |= ATTR_MTIME; - } - iap->ia_valid |= ATTR_CTIME; + host_err = vfs_truncate(&path, iap->ia_size); + if (host_err) + goto out_host_err; - if (check_guard && guardtime != inode->i_ctime.tv_sec) { - err = nfserr_notsync; - goto out_put_write_access; + iap->ia_valid &= ~ATTR_SIZE; } + iap->ia_valid |= ATTR_CTIME; + fh_lock(fhp); host_err = notify_change(dentry, iap, NULL); fh_unlock(fhp); - err = nfserrno(host_err); -out_put_write_access: - if (size_change) - put_write_access(inode); - if (!err) - err = nfserrno(commit_metadata(fhp)); -out: - return err; + if (!host_err) + host_err = commit_metadata(fhp); +out_host_err: + return nfserrno(host_err); } #if defined(CONFIG_NFSD_V4) -- 2.1.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-22 16:54 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig @ 2017-01-23 12:21 ` Jeff Layton 2017-01-23 12:33 ` Christoph Hellwig 0 siblings, 1 reply; 23+ messages in thread
From: Jeff Layton @ 2017-01-23 12:21 UTC (permalink / raw)
To: Christoph Hellwig, bfields; +Cc: linux-nfs, linux-fsdevel

On Sun, 2017-01-22 at 17:54 +0100, Christoph Hellwig wrote:
> Both the NFS protocols and the Linux VFS use a setattr operation with a
> bitmap of attributes to set various file attributes including the
> file size and the uid/gid.
>
> The Linux syscalls never mix size updates with unrelated updates like
> the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> that truncates do not update random other attributes, and many
> other file systems handle the case but do not update the different
> attributes in the same transaction. NFSD on the other hand passes
> the attributes it gets on the wire more or less directly through to
> the VFS, leading to updates the file systems don't expect. XFS at
> least has an assert on the allowed attributes, which caught an NFS
> client setting the size and group ID at the same time.
>
> To handle this issue properly this switches nfsd to call vfs_truncate
> for size changes, and then handles all other attributes through
> notify_change. As a side effect this also means less boilerplate
> code around the size change as we can now reuse the VFS code.
> > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/nfsd/vfs.c | 77 +++++++++++++++++++---------------------------------------- > 1 file changed, 24 insertions(+), 53 deletions(-) > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 26c6fdb..fafff37 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) > } > } > > -static __be32 > -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, > - struct iattr *iap) > -{ > - struct inode *inode = d_inode(fhp->fh_dentry); > - int host_err; > - > - if (iap->ia_size < inode->i_size) { > - __be32 err; > - > - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, > - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); > - if (err) > - return err; > - } > - > - host_err = get_write_access(inode); > - if (host_err) > - goto out_nfserrno; > - > - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); > - if (host_err) > - goto out_put_write_access; > - return 0; > - > -out_put_write_access: > - put_write_access(inode); > -out_nfserrno: > - return nfserrno(host_err); > -} > - > /* > * Set various file attributes. After this call fhp needs an fh_put. 
> */ > @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > __be32 err; > int host_err; > bool get_write_count; > - int size_change = 0; > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > /* Get inode */ > err = fh_verify(rqstp, fhp, ftype, accmode); > if (err) > - goto out; > + return err; > if (get_write_count) { > host_err = fh_want_write(fhp); > if (host_err) > - return nfserrno(host_err); > + goto out_host_err; > } > > dentry = fhp->fh_dentry; > @@ -405,19 +373,25 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > iap->ia_valid &= ~ATTR_MODE; > > if (!iap->ia_valid) > - goto out; > + return 0; > > nfsd_sanitize_attrs(inode, iap); > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > + return nfserr_notsync; > + > /* > * The size case is special, it changes the file in addition to the > - * attributes. > + * attributes, and file systems don't expect it to be mixed with > + * "random" attribute changes. We thus split out the size change > + * into a separate calo for vfs_truncate, and do the rest as a > + * a separate setattr call. 
> */ > if (iap->ia_valid & ATTR_SIZE) { > - err = nfsd_get_write_access(rqstp, fhp, iap); > - if (err) > - goto out; > - size_change = 1; > + struct path path = { > + .mnt = fhp->fh_export->ex_path.mnt, > + .dentry = dentry, > + }; > > /* > * RFC5661, Section 18.30.4: > @@ -428,27 +402,24 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > */ > if (iap->ia_size != i_size_read(inode)) > iap->ia_valid |= ATTR_MTIME; > - } > > - iap->ia_valid |= ATTR_CTIME; > + host_err = vfs_truncate(&path, iap->ia_size); > + if (host_err) > + goto out_host_err; > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > - err = nfserr_notsync; > - goto out_put_write_access; > + iap->ia_valid &= ~ATTR_SIZE; > } > > + iap->ia_valid |= ATTR_CTIME; > + So if you only have ATTR_SIZE then you're going to end up having to do another notify_change to update the ctime? Can we get away with just calling vfs_truncate when only ATTR_SIZE is set and skipping the notify_change to update the ctime? > fh_lock(fhp); > host_err = notify_change(dentry, iap, NULL); > fh_unlock(fhp); > - err = nfserrno(host_err); > > -out_put_write_access: > - if (size_change) > - put_write_access(inode); > - if (!err) > - err = nfserrno(commit_metadata(fhp)); > -out: > - return err; > + if (!host_err) > + host_err = commit_metadata(fhp); > +out_host_err: > + return nfserrno(host_err); > } > > #if defined(CONFIG_NFSD_V4) Overall though, this is a reasonable change I think. Size changes are more of a special case. I also like the idea of breaking out truncates to a separate operation, but that's a much bigger project. -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 12:21 ` Jeff Layton @ 2017-01-23 12:33 ` Christoph Hellwig 2017-01-23 15:36 ` Christoph Hellwig 0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2017-01-23 12:33 UTC (permalink / raw)
To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs, linux-fsdevel

On Mon, Jan 23, 2017 at 07:21:56AM -0500, Jeff Layton wrote:
> So if you only have ATTR_SIZE then you're going to end up having to do
> another notify_change to update the ctime? Can we get away with just
> calling vfs_truncate when only ATTR_SIZE is set and skipping the
> notify_change to update the ctime?

We probably could, but there are some fine details there:

For truncate(2) POSIX requires us to not update the time stamps when truncating an already zero length file to 0, but for ftruncate(2) and O_TRUNC we do have to update the mtime and ctime. The Linux VFS communicates that difference by not setting ATTR_CTIME and ATTR_MTIME in ia_valid for truncate(2), but expecting the fs to update them anyway (another reason for a proper truncate method to make this explicit).

I'll need to look at the exact NFS semantics in that area, but after a bit of research I can probably come up with something that will work.

^ permalink raw reply [flat|nested] 23+ messages in thread
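The timestamp convention described in the message above can be written down as a small user-space model. This is a sketch under assumptions, not kernel code: the `TS_ATTR_*` values, `enum ts_caller`, `ts_valid_for()` and `ts_updates_times()` are hypothetical names, and the rule that a file system bumps the time stamps for a size-only request whenever the size actually changes is an assumed reading of "expecting the fs to update them anyway".

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's ATTR_* constants differ. */
#define TS_ATTR_SIZE  (1u << 0)
#define TS_ATTR_MTIME (1u << 1)
#define TS_ATTR_CTIME (1u << 2)

/* The three entry points named in the message above. */
enum ts_caller { TS_TRUNCATE, TS_FTRUNCATE, TS_O_TRUNC };

/* Mask each caller would hand down: only truncate(2) omits the times. */
unsigned int ts_valid_for(enum ts_caller caller)
{
	unsigned int valid = TS_ATTR_SIZE;

	if (caller != TS_TRUNCATE)
		valid |= TS_ATTR_MTIME | TS_ATTR_CTIME;
	return valid;
}

/*
 * Assumed file system behaviour: explicit time flags always update the
 * time stamps; a size-only request updates them only when the size
 * really changes.
 */
bool ts_updates_times(unsigned int valid, long long old_size,
		      long long new_size)
{
	assert(valid & TS_ATTR_SIZE);

	if (valid & (TS_ATTR_MTIME | TS_ATTR_CTIME))
		return true;
	return new_size != old_size;
}
```

Under these assumptions a truncate(2)-style request on an empty file to length 0 leaves the time stamps alone, while the ftruncate(2) and O_TRUNC variants update them even though the size is unchanged.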
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 12:33 ` Christoph Hellwig @ 2017-01-23 15:36 ` Christoph Hellwig 0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2017-01-23 15:36 UTC (permalink / raw)
To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs, linux-fsdevel

On Mon, Jan 23, 2017 at 01:33:48PM +0100, Christoph Hellwig wrote:
> I'll need to look at the exact NFS semantics in that area, but after
> a bit of research I can probably come up with something that will work.

Here is my first attempt. As vfs_truncate will add the ctime and mtime updates when needed it just leaves handling that quirk to vfs_truncate and then exits early if no other attributes are set.

Unfortunately at least the Linux client always seems to also request an mtime update with a size update. We could keep the if (iap->ia_size != i_size_read(inode)) check from the old code and remove ATTR_MTIME, but these racy checks outside i_rwsem make me feel a bit uneasy.

Jeff, Bruce - any opinion if we should add something like this:

	/* vfs_truncate will update ctime and mtime if the size changes */
	if (iap->ia_size != i_size_read(inode))
		iap->ia_valid &= ~ATTR_MTIME;

back to nfsd_setattr? This would avoid the additional setattr call, but makes me feel dirty :)

---
From 0e06e2fc6157bb97692ed47c21e36120efb9f15c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Sun, 22 Jan 2017 17:17:48 +0100
Subject: nfsd: special case truncates some more
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributes to set various file attributes including the file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates do not update random other attributes, and many other file systems handle the case but do not update the different attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an NFS client setting the size and group ID at the same time.

To handle this issue properly this switches nfsd to call vfs_truncate for size changes, and then handles all other attributes through notify_change. As a side effect this also means less boilerplate code around the size change as we can now reuse the VFS code.

Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/nfsd/vfs.c | 92 +++++++++++++++++++---------------------------------------- 1 file changed, 30 insertions(+), 62 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 26c6fdb..4ca5b92 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) } } -static __be32 -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, - struct iattr *iap) -{ - struct inode *inode = d_inode(fhp->fh_dentry); - int host_err; - - if (iap->ia_size < inode->i_size) { - __be32 err; - - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); - if (err) - return err; - } - - host_err = get_write_access(inode); - if (host_err) - goto out_nfserrno; - - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); - if (host_err) - goto out_put_write_access; - return 0; - -out_put_write_access: - put_write_access(inode); -out_nfserrno: - return nfserrno(host_err); -} - /* * Set various file attributes. After this call fhp needs an fh_put. 
*/ @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, __be32 err; int host_err; bool get_write_count; - int size_change = 0; if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, /* Get inode */ err = fh_verify(rqstp, fhp, ftype, accmode); if (err) - goto out; + return err; if (get_write_count) { host_err = fh_want_write(fhp); if (host_err) - return nfserrno(host_err); + goto out_host_err; } dentry = fhp->fh_dentry; @@ -405,50 +373,50 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, iap->ia_valid &= ~ATTR_MODE; if (!iap->ia_valid) - goto out; + return 0; nfsd_sanitize_attrs(inode, iap); + if (check_guard && guardtime != inode->i_ctime.tv_sec) + return nfserr_notsync; + /* * The size case is special, it changes the file in addition to the - * attributes. + * attributes, and file systems don't expect it to be mixed with + * "random" attribute changes. We thus split out the size change + * into a separate call to vfs_truncate, and do the rest as a + * separate setattr call. + * + * Note that vfs_truncate will also update ctime and mtime if + * the file size changes. */ if (iap->ia_valid & ATTR_SIZE) { - err = nfsd_get_write_access(rqstp, fhp, iap); - if (err) - goto out; - size_change = 1; + struct path path = { + .mnt = fhp->fh_export->ex_path.mnt, + .dentry = dentry, + }; - /* - * RFC5661, Section 18.30.4: - * Changing the size of a file with SETATTR indirectly - * changes the time_modify and change attributes. 
- * - * (and similar for the older RFCs) - */ - if (iap->ia_size != i_size_read(inode)) - iap->ia_valid |= ATTR_MTIME; + host_err = vfs_truncate(&path, iap->ia_size); + if (host_err) + goto out_host_err; + + iap->ia_valid &= ~ATTR_SIZE; + if (!iap->ia_valid) + goto done; } iap->ia_valid |= ATTR_CTIME; - if (check_guard && guardtime != inode->i_ctime.tv_sec) { - err = nfserr_notsync; - goto out_put_write_access; - } - fh_lock(fhp); host_err = notify_change(dentry, iap, NULL); fh_unlock(fhp); - err = nfserrno(host_err); + if (host_err) + goto out_host_err; -out_put_write_access: - if (size_change) - put_write_access(inode); - if (!err) - err = nfserrno(commit_metadata(fhp)); -out: - return err; +done: + host_err = commit_metadata(fhp); +out_host_err: + return nfserrno(host_err); } #if defined(CONFIG_NFSD_V4) -- 2.1.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
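A note on the mask arithmetic discussed in this message: dropping one flag from an attribute bitmap takes "valid &= ~FLAG", whereas "valid &= FLAG" keeps only that flag and discards every other request. The following is a minimal userspace sketch of the difference. The DEMO_ATTR_* names and bit values are hypothetical stand-ins chosen for illustration, not copied from any kernel header, and the two helpers are not part of nfsd.

```c
#include <assert.h>

/* Illustrative bit values only; hypothetical stand-ins for ATTR_* flags. */
#define DEMO_ATTR_GID   (1u << 2)
#define DEMO_ATTR_SIZE  (1u << 3)
#define DEMO_ATTR_MTIME (1u << 5)

/* valid &= flag: keeps only 'flag', every other requested bit is lost. */
unsigned int demo_keep_only(unsigned int valid, unsigned int flag)
{
	return valid & flag;
}

/* valid &= ~flag: drops 'flag', every other requested bit survives. */
unsigned int demo_clear(unsigned int valid, unsigned int flag)
{
	return valid & ~flag;
}
```

With a request for size, mtime and gid, demo_clear() on the mtime bit leaves size and gid in place, which is what "remove ATTR_MTIME" asks for; demo_keep_only() would leave nothing but the mtime bit.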
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 15:36 ` Christoph Hellwig (?) @ 2017-01-23 15:52 ` Jeff Layton 2017-01-23 16:05 ` Christoph Hellwig 2017-01-24 22:02 ` J. Bruce Fields -1 siblings, 2 replies; 23+ messages in thread From: Jeff Layton @ 2017-01-23 15:52 UTC (permalink / raw) To: Christoph Hellwig; +Cc: bfields, linux-nfs, linux-fsdevel On Mon, 2017-01-23 at 16:36 +0100, Christoph Hellwig wrote: > On Mon, Jan 23, 2017 at 01:33:48PM +0100, Christoph Hellwig wrote: > > I'll need to look at the exact NFS semantics in that area, but after > > a bit of research I can probably come up with something that will work. > > Here is my first attempt. As vfs_truncate will add the ctime and mtime > updates when needed it just leaves handling that quirk to vfs_truncate > and then exits early if no other attributes are set. > > Unfortunately at least the Linux client always seems to also request > a mtime update with a size update. We could keep the > > if (iap->ia_size != i_size_read(inode)) > > check from the old code and remove ATTR_MTIME, but these racy checks > outside i_rwsem make me feel a bit uneasy. Jeff, Bruce - any opinion > if we should add something like this: > Ok, that's more complicated than it looked at first blush. :) To be clear, the client is requesting to set the mtime to current server time and not to a specific mtime, right? > /* vfs_truncate will update ctime and mtime if the size changes */ > if (iap->ia_size != i_size_read(inode)) > iap->ia_valid &= ATTR_MTIME; > > back to nfsd_setattr? This would avoid the additional setattr call, > but make me feel dirty :) > I agree that I wouldn't want to go with a potentially racy check. I don't see where vfs_truncate will handle the times though. do_truncate will, but you have to pass in a non-zero time_attrs and vfs_truncate always sets that to 0. If we did want to do this, it seems like it might be better to just add a new time_attrs arg to vfs_truncate that gets passed to do_truncate. 
Most callers would set it to zero, but nfsd could set it to: iap->ia_valid & (ATTR_MTIME|ATTR_CTIME) Would that work? > --- > From 0e06e2fc6157bb97692ed47c21e36120efb9f15c Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Sun, 22 Jan 2017 17:17:48 +0100 > Subject: nfsd: special case truncates some more > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > Both the NFS protocols and the Linux VFS use a setattr operation with a > bitmap of attributs to set to set various file attributes including the > file size and the uid/gid. > > The Linux syscalls never mixe size updates with unrelated updates like > the uid/gid, and some file systems like XFS and GFS2 rely on the fact > that truncates might not update random other attributes, and many > other file systems handle the case but do not update the different > attributes in the same transaction. NFSD on the other hand passes > the attributes it gets on the wire more or less directly through to > the VFS, leading to updates the file systems don't expect. XFS at > least has an assert on the allowed attributes, which cought an NFS > client sets the size and group ID at the same time. > > To handles this issue properly this switches nfsd to call vfs_truncate > for size changes, and then handling all other attributes through > notify_change. As a side effect this also means less boilerplace > code around the size change as we can now reuse the VFS code. 
> > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/nfsd/vfs.c | 92 +++++++++++++++++++---------------------------------------- > 1 file changed, 30 insertions(+), 62 deletions(-) > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 26c6fdb..4ca5b92 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) > } > } > > -static __be32 > -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, > - struct iattr *iap) > -{ > - struct inode *inode = d_inode(fhp->fh_dentry); > - int host_err; > - > - if (iap->ia_size < inode->i_size) { > - __be32 err; > - > - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, > - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); > - if (err) > - return err; > - } > - > - host_err = get_write_access(inode); > - if (host_err) > - goto out_nfserrno; > - > - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); > - if (host_err) > - goto out_put_write_access; > - return 0; > - > -out_put_write_access: > - put_write_access(inode); > -out_nfserrno: > - return nfserrno(host_err); > -} > - > /* > * Set various file attributes. After this call fhp needs an fh_put. 
> */ > @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > __be32 err; > int host_err; > bool get_write_count; > - int size_change = 0; > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > /* Get inode */ > err = fh_verify(rqstp, fhp, ftype, accmode); > if (err) > - goto out; > + return err; > if (get_write_count) { > host_err = fh_want_write(fhp); > if (host_err) > - return nfserrno(host_err); > + goto out_host_err; > } > > dentry = fhp->fh_dentry; > @@ -405,50 +373,50 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > iap->ia_valid &= ~ATTR_MODE; > > if (!iap->ia_valid) > - goto out; > + return 0; > > nfsd_sanitize_attrs(inode, iap); > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > + return nfserr_notsync; > + > /* > * The size case is special, it changes the file in addition to the > - * attributes. > + * attributes, and file systems don't expect it to be mixed with > + * "random" attribute changes. We thus split out the size change > + * into a separate calo for vfs_truncate, and do the rest as a > + * a separate setattr call. > + * > + * Note that vfs_truncate will also update ctime and mtime if > + * the file size changes. > */ > if (iap->ia_valid & ATTR_SIZE) { > - err = nfsd_get_write_access(rqstp, fhp, iap); > - if (err) > - goto out; > - size_change = 1; > + struct path path = { > + .mnt = fhp->fh_export->ex_path.mnt, > + .dentry = dentry, > + }; > > - /* > - * RFC5661, Section 18.30.4: > - * Changing the size of a file with SETATTR indirectly > - * changes the time_modify and change attributes. 
> - * > - * (and similar for the older RFCs) > - */ > - if (iap->ia_size != i_size_read(inode)) > - iap->ia_valid |= ATTR_MTIME; > + host_err = vfs_truncate(&path, iap->ia_size); > + if (host_err) > + goto out_host_err; > + > + iap->ia_valid &= ~ATTR_SIZE; > + if (!iap->ia_valid) > + goto done; > } > > iap->ia_valid |= ATTR_CTIME; > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > - err = nfserr_notsync; > - goto out_put_write_access; > - } > - > fh_lock(fhp); > host_err = notify_change(dentry, iap, NULL); > fh_unlock(fhp); > - err = nfserrno(host_err); > + if (host_err) > + goto out_host_err; > > -out_put_write_access: > - if (size_change) > - put_write_access(inode); > - if (!err) > - err = nfserrno(commit_metadata(fhp)); > -out: > - return err; > +done: > + host_err = commit_metadata(fhp); > +out_host_err: > + return nfserrno(host_err); > } > > #if defined(CONFIG_NFSD_V4) -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 15:52 ` Jeff Layton @ 2017-01-23 16:05 ` Christoph Hellwig 2017-01-23 16:14 ` Jeff Layton 2017-01-23 16:20 ` Trond Myklebust 2017-01-24 22:02 ` J. Bruce Fields 1 sibling, 2 replies; 23+ messages in thread From: Christoph Hellwig @ 2017-01-23 16:05 UTC (permalink / raw) To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs, linux-fsdevel On Mon, Jan 23, 2017 at 10:52:09AM -0500, Jeff Layton wrote: > To be clear, the client is requesting to set the mtime to current server > time and not to a specific mtime, right? Yes. And I think it's mostly the Linux client being lazy - ATTR_MTIME is what it gets from the VFS for a truncate operation (but not ftruncate, so we probably won't see it on the wire in that case, but I need to verify that first). Yet another reason for ->truncate :) > I don't see where vfs_truncate will handle the times though. do_truncate > will, but you have to pass in a non-zero time_attrs and vfs_truncate > always sets that to 0. This is the magic of the Linux VFS interface. For an ATTR_SIZE operation the file system is expected to update mtime and ctime if the size changes even if ATTR_MTIME and ATTR_CTIME are not set. See the comments in xfs_vn_setattr_size, which I wrote many years ago when I tripped over this interesting calling convention. > If we did want to do this, it seems like it might be better to just add > a new time_attrs arg to vfs_truncate that gets passed to do_truncate. > Most callers would set it to zero, but nfsd could set it to: > > iap->ia_valid & (ATTR_MTIME|ATTR_CTIME) > > Would that work? I'd hate it. I'd rather spend my time on a real truncate operation which makes all the above magic explicit, and as a side effect would fix the Linux client sending spurious mtime update requests that the protocol already requires to be done implicitly. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 16:05 ` Christoph Hellwig @ 2017-01-23 16:14 ` Jeff Layton 2017-01-23 16:20 ` Trond Myklebust 1 sibling, 0 replies; 23+ messages in thread From: Jeff Layton @ 2017-01-23 16:14 UTC (permalink / raw) To: Christoph Hellwig; +Cc: bfields, linux-nfs, linux-fsdevel On Mon, 2017-01-23 at 17:05 +0100, Christoph Hellwig wrote: > On Mon, Jan 23, 2017 at 10:52:09AM -0500, Jeff Layton wrote: > > To be clear, the client is requesting to set the mtime to current server > > time and not to a specific mtime, right? > > Yes. And I think it's mostly the Linux client being lazy - ATTR_MTIME > is what it gets from the VFS for a truncate operation (but not ftrunate, > so we probably won't see it on the wire in that case, but I need to verify > that first). Yet another reason for ->truncate :) > Heh, ok. Makes sense. > > I don't see where vfs_truncate will handle the times though. do_truncate > > will, but you have to pass in a non-zero time_attrs and vfs_truncate > > always sets that to 0. > > This is the magic of the Linux VFS interface. For a ATTR_SIZE operation > the file system is expected to update mtime and ctime if the size changes > even if ATTR_MTIME and ATTR_CTIME are not set. See the comments > in xfs_vn_setattr_size, which I wrote many years ago when I tripped > over this interesting calling convention. > Ick. > > If we did want to do this, it seems like it might be better to just add > > a new time_attrs arg to vfs_truncate that gets passed to do_truncate. > > Most callers would set it to zero, but nfsd could set it to: > > > > iap->ia_valid & (ATTR_MTIME|ATTR_CTIME) > > > > Would that work? > > I'd hate it. I'd rather spent my time on a real truncate operation > which makes all the above magic explicit, and as a side effect would > fix the Linux client sending spurious mtime update requests that > the procotol already requires to be done implicitly. Fair enough. 
In that case, I wouldn't try to optimize away the second notify_change if the client sets ATTR_MTIME/ATTR_CTIME for now. We might have to dip down into the fs twice for a truncate, but so be it. If it becomes a problem then we can consider that more reason to add a real ->truncate operation. -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 23+ messages in thread
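The two-step handling agreed on here (one size step, then one follow-up call for whatever is left) can be sketched in userspace roughly as below. This only models the mask bookkeeping from the posted patch, under the assumption that the follow-up step is skipped when nothing but the size was requested and that ctime is added to the follow-up mask otherwise. The DEMO_ATTR_* values, struct demo_plan and demo_split() are hypothetical names for illustration, not kernel interfaces.

```c
#include <assert.h>

/* Illustrative bit values only; hypothetical stand-ins for ATTR_* flags. */
#define DEMO_ATTR_GID   (1u << 2)
#define DEMO_ATTR_SIZE  (1u << 3)
#define DEMO_ATTR_MTIME (1u << 5)
#define DEMO_ATTR_CTIME (1u << 6)

struct demo_plan {
	int do_truncate;   /* 1 if a separate size-change step is needed */
	unsigned int rest; /* mask for the follow-up setattr step, 0 to skip it */
};

/* Split one wire request into a size step and a remainder step. */
struct demo_plan demo_split(unsigned int valid)
{
	struct demo_plan p;

	p.do_truncate = (valid & DEMO_ATTR_SIZE) != 0;
	p.rest = valid & ~DEMO_ATTR_SIZE;
	/* Mirror the patch: a non-empty remainder also requests ctime. */
	if (p.rest)
		p.rest |= DEMO_ATTR_CTIME;
	return p;
}
```

A request carrying size plus mtime therefore yields both steps, which is the "dip down into the fs twice" case; a size-only request yields just the truncate step.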
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 16:05 ` Christoph Hellwig @ 2017-01-23 16:20 ` Trond Myklebust 2017-01-23 16:20 ` Trond Myklebust 1 sibling, 0 replies; 23+ messages in thread From: Trond Myklebust @ 2017-01-23 16:20 UTC (permalink / raw) To: hch, jlayton; +Cc: bfields, linux-nfs, linux-fsdevel On Mon, 2017-01-23 at 17:05 +0100, Christoph Hellwig wrote: > On Mon, Jan 23, 2017 at 10:52:09AM -0500, Jeff Layton wrote: > > To be clear, the client is requesting to set the mtime to current > > server > > time and not to a specific mtime, right? > > Yes. And I think it's mostly the Linux client being lazy - > ATTR_MTIME > is what it gets from the VFS for a truncate operation (but not > ftrunate, > so we probably won't see it on the wire in that case, but I need to > verify > that first). Yet another reason for ->truncate :) > Note that the POSIX spec seems to have changed recently. The current spec appears to state that we should set the mtime and ctime (and change attribute) on success in open(O_TRUNC), truncate() and ftruncate(). In previous incarnations of the spec, truncate() would only set the time if the size was changed: See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 16:20 ` Trond Myklebust (?) @ 2017-01-23 16:26 ` hch 2017-01-23 17:25 ` Trond Myklebust 2017-01-24 16:25 ` J. Bruce Fields -1 siblings, 2 replies; 23+ messages in thread From: hch @ 2017-01-23 16:26 UTC (permalink / raw) To: Trond Myklebust; +Cc: hch, jlayton, bfields, linux-nfs, linux-fsdevel On Mon, Jan 23, 2017 at 04:20:45PM +0000, Trond Myklebust wrote: > Note that the POSIX spec seems to have changed recently. The current > spec appears to state that we should set the mtime and ctime (and > change attribute) on success in open(O_TRUNC), truncate() and > ftruncate(). In previous incarnations of the spec, truncate() would > only set the time if the size was changed: Interesting. But in this case historical POSIX and thus Linux behavior still takes precedence and we're not suddenly going to change behavior. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 16:26 ` hch @ 2017-01-23 17:25 ` Trond Myklebust 2017-01-24 16:25 ` J. Bruce Fields 1 sibling, 0 replies; 23+ messages in thread From: Trond Myklebust @ 2017-01-23 17:25 UTC (permalink / raw) To: hch; +Cc: bfields, jlayton, linux-nfs, linux-fsdevel On Mon, 2017-01-23 at 17:26 +0100, hch wrote: > On Mon, Jan 23, 2017 at 04:20:45PM +0000, Trond Myklebust wrote: > > Note that the POSIX spec seems to have changed recently. The > > current > > spec appears to state that we should set the mtime and ctime (and > > change attribute) on success in open(O_TRUNC), truncate() and > > ftruncate(). In previous incarnations of the spec, truncate() would > > only set the time if the size was changed: > > Interesting. But in this case historical Posix and thus Linux > behavior > still takes precedence and we're not suddently going to change > behavior. > In that case the client will be required to continue to need to send mtime/ctime in order to ensure that we get the same historical semantics w.r.t. ftruncate() vs truncate(). IOW: It's not a question of the client being lazy about clearing the flags. It's a question of enforcing the correct semantics. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 17:25 ` Trond Myklebust (?) @ 2017-01-23 17:38 ` hch 2017-01-23 17:42 ` Trond Myklebust -1 siblings, 1 reply; 23+ messages in thread From: hch @ 2017-01-23 17:38 UTC (permalink / raw) To: Trond Myklebust; +Cc: hch, bfields, jlayton, linux-nfs, linux-fsdevel On Mon, Jan 23, 2017 at 05:25:34PM +0000, Trond Myklebust wrote: > In that case the client will be required to continue to need to send > mtime/ctime in order to ensure that we get the same historical > semantics w.r.t. ftruncate() vs truncate(). > > IOW: It's not a question of the client being lazy about clearing the > flags. It's a question of enforcing the correct semantics. No, the NFS spec requires the server to add an implicit mtime when the size actually changes. In fact the current code has a comment pointing to the section: * RFC5661, Section 18.30.4: * Changing the size of a file with SETATTR indirectly * changes the time_modify and change attributes. * * (and similar for the older RFCs) And yes, I've double checked that in the RFC. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 17:38 ` hch @ 2017-01-23 17:42 ` Trond Myklebust 0 siblings, 0 replies; 23+ messages in thread From: Trond Myklebust @ 2017-01-23 17:42 UTC (permalink / raw) To: hch; +Cc: bfields, jlayton, linux-nfs, linux-fsdevel On Mon, 2017-01-23 at 18:38 +0100, hch wrote: > On Mon, Jan 23, 2017 at 05:25:34PM +0000, Trond Myklebust wrote: > > In that case the client will be required to continue to need to > > send > > mtime/ctime in order to ensure that we get the same historical > > semantics w.r.t. ftruncate() vs truncate(). > > > > IOW: It's not a question of the client being lazy about clearing > > the > > flags. It's a question of enforcing the correct semantics. > > No, the NFS spec requires the server to add an implicit mtime > when the size actually changes. In fact the current code has a > comment > pointing to the section: > > * RFC5661, Section 18.30.4: > * Changing the size of a file with SETATTR indirectly > * changes the time_modify and change attributes. > * > * (and similar for the older RFCs) > > And yes, I've double checked that in the RFC. Sure, but truncate() on POSIX adds the requirement that the mtime/ctime should change even when the file size is not changed. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
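The question in this exchange, whether truncate() to an unchanged size moves mtime, can be checked from userspace on a given kernel and file system. The sketch below is one hedged way to do that. The helper names are hypothetical, the scratch file path is whatever mkstemp() template the caller passes, and the result is expected to vary between systems, so no particular outcome is claimed here.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a 3-byte scratch file from a mkstemp() template; 0 on success. */
int demo_make_file(char *tmpl)
{
	int fd = mkstemp(tmpl);

	if (fd < 0)
		return -1;
	if (write(fd, "abc", 3) != 3) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Truncate 'path' to its current size and compare mtime before and after.
 * Returns 1 if mtime moved, 0 if it did not, -1 on error.
 */
int demo_mtime_moves_on_same_size_truncate(const char *path)
{
	struct stat before, after;

	if (stat(path, &before) != 0)
		return -1;
	/* Wait so that a timestamp update is visible even at 1s granularity. */
	sleep(1);
	if (truncate(path, before.st_size) != 0)
		return -1;
	if (stat(path, &after) != 0)
		return -1;
	return before.st_mtim.tv_sec != after.st_mtim.tv_sec ||
	       before.st_mtim.tv_nsec != after.st_mtim.tv_nsec;
}
```

Calling demo_make_file() on a template such as "/tmp/trunc-demo-XXXXXX" and then passing the resulting path to the second helper reports which of the two behaviors described in this thread the local system implements. Swapping truncate() for ftruncate() on an open descriptor would give the other half of the comparison.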
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 16:26 ` hch 2017-01-23 17:25 ` Trond Myklebust @ 2017-01-24 16:25 ` J. Bruce Fields 1 sibling, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2017-01-24 16:25 UTC (permalink / raw) To: hch; +Cc: Trond Myklebust, jlayton, bfields, linux-nfs, linux-fsdevel On Mon, Jan 23, 2017 at 05:26:07PM +0100, hch wrote: > On Mon, Jan 23, 2017 at 04:20:45PM +0000, Trond Myklebust wrote: > > Note that the POSIX spec seems to have changed recently. The current > > spec appears to state that we should set the mtime and ctime (and > > change attribute) on success in open(O_TRUNC), truncate() and > > ftruncate(). In previous incarnations of the spec, truncate() would > > only set the time if the size was changed: > > Interesting. But in this case historical Posix and thus Linux behavior > still takes precedence and we're not suddently going to change behavior. Makes sense as a general rule, but is it really likely that anyone depends on ctime/mtime not changing on a non-size-changing truncate()? --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] nfsd: special case truncates some more 2017-01-23 15:52 ` Jeff Layton 2017-01-23 16:05 ` Christoph Hellwig @ 2017-01-24 22:02 ` J. Bruce Fields 1 sibling, 0 replies; 23+ messages in thread From: J. Bruce Fields @ 2017-01-24 22:02 UTC (permalink / raw) To: Jeff Layton; +Cc: Christoph Hellwig, bfields, linux-nfs, linux-fsdevel On Mon, Jan 23, 2017 at 10:52:09AM -0500, Jeff Layton wrote: > On Mon, 2017-01-23 at 16:36 +0100, Christoph Hellwig wrote: > > On Mon, Jan 23, 2017 at 01:33:48PM +0100, Christoph Hellwig wrote: > > > I'll need to look at the exact NFS semantics in that area, but after > > > a bit of research I can probably come up with something that will work. > > > > Here is my first attempt. As vfs_truncate will add the ctime and mtime > > updates when needed it just leaves handling that quirk to vfs_truncate > > and then exits early if no other attributes are set. > > > > Unfortunately at least the Linux client always seems to also request > > a mtime update with a size update. We could keep the > > > > if (iap->ia_size != i_size_read(inode)) > > > > check from the old code and remove ATTR_MTIME, but these racy checks > > outside i_rwsem make me feel a bit uneasy. Jeff, Bruce - any opinion > > if we should add something like this: > > > > Ok, that's more complicated than it looked at first blush. :) > > To be clear, the client is requesting to set the mtime to current server > time and not to a specific mtime, right? > > > /* vfs_truncate will update ctime and mtime if the size changes */ > > if (iap->ia_size != i_size_read(inode)) > > iap->ia_valid &= ATTR_MTIME; > > > > back to nfsd_setattr? This would avoid the additional setattr call, > > but make me feel dirty :) > > > > I agree that I wouldn't want to go with a potentially racy check. 
Unless I'm misunderstanding: we've always had the race, and the consequence is just an unnecessary update in the case the truncate didn't actually change the size (although it looked like it would at the time of the check). I don't like that, but it's not going to keep me up at night. --b. > > I don't see where vfs_truncate will handle the times though. do_truncate > will, but you have to pass in a non-zero time_attrs and vfs_truncate > always sets that to 0. > > If we did want to do this, it seems like it might be better to just add > a new time_attrs arg to vfs_truncate that gets passed to do_truncate. > Most callers would set it to zero, but nfsd could set it to: > > iap->ia_valid & (ATTR_MTIME|ATTR_CTIME) > > Would that work? > > > --- > > From 0e06e2fc6157bb97692ed47c21e36120efb9f15c Mon Sep 17 00:00:00 2001 > > From: Christoph Hellwig <hch@lst.de> > > Date: Sun, 22 Jan 2017 17:17:48 +0100 > > Subject: nfsd: special case truncates some more > > MIME-Version: 1.0 > > Content-Type: text/plain; charset=UTF-8 > > Content-Transfer-Encoding: 8bit > > > > Both the NFS protocols and the Linux VFS use a setattr operation with a > > bitmap of attributs to set to set various file attributes including the > > file size and the uid/gid. > > > > The Linux syscalls never mixe size updates with unrelated updates like > > the uid/gid, and some file systems like XFS and GFS2 rely on the fact > > that truncates might not update random other attributes, and many > > other file systems handle the case but do not update the different > > attributes in the same transaction. NFSD on the other hand passes > > the attributes it gets on the wire more or less directly through to > > the VFS, leading to updates the file systems don't expect. XFS at > > least has an assert on the allowed attributes, which cought an NFS > > client sets the size and group ID at the same time. 
> > > > To handle this issue properly this switches nfsd to call vfs_truncate > > for size changes, and then handles all other attributes through > > notify_change. As a side effect this also means less boilerplate > > code around the size change as we can now reuse the VFS code. > > > > Signed-off-by: Christoph Hellwig <hch@lst.de> > > --- > > fs/nfsd/vfs.c | 92 +++++++++++++++++++---------------------------------------- > > 1 file changed, 30 insertions(+), 62 deletions(-) > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > > index 26c6fdb..4ca5b92 100644 > > --- a/fs/nfsd/vfs.c > > +++ b/fs/nfsd/vfs.c > > @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) > > } > > } > > > > -static __be32 > > -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, > > - struct iattr *iap) > > -{ > > - struct inode *inode = d_inode(fhp->fh_dentry); > > - int host_err; > > - > > - if (iap->ia_size < inode->i_size) { > > - __be32 err; > > - > > - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, > > - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); > > - if (err) > > - return err; > > - } > > - > > - host_err = get_write_access(inode); > > - if (host_err) > > - goto out_nfserrno; > > - > > - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); > > - if (host_err) > > - goto out_put_write_access; > > - return 0; > > - > > -out_put_write_access: > > - put_write_access(inode); > > -out_nfserrno: > > - return nfserrno(host_err); > > -} > > - > > /* > > * Set various file attributes. After this call fhp needs an fh_put. 
> > */ > > @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > > __be32 err; > > int host_err; > > bool get_write_count; > > - int size_change = 0; > > > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > > @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > > /* Get inode */ > > err = fh_verify(rqstp, fhp, ftype, accmode); > > if (err) > > - goto out; > > + return err; > > if (get_write_count) { > > host_err = fh_want_write(fhp); > > if (host_err) > > - return nfserrno(host_err); > > + goto out_host_err; > > } > > > > dentry = fhp->fh_dentry; > > @@ -405,50 +373,50 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > > iap->ia_valid &= ~ATTR_MODE; > > > > if (!iap->ia_valid) > > - goto out; > > + return 0; > > > > nfsd_sanitize_attrs(inode, iap); > > > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > > + return nfserr_notsync; > > + > > /* > > * The size case is special, it changes the file in addition to the > > - * attributes. > > + * attributes, and file systems don't expect it to be mixed with > > + * "random" attribute changes. We thus split out the size change > > + * into a separate call to vfs_truncate, and do the rest as > > + * a separate setattr call. > > + * > > + * Note that vfs_truncate will also update ctime and mtime if > > + * the file size changes. > > */ > > if (iap->ia_valid & ATTR_SIZE) { > > - err = nfsd_get_write_access(rqstp, fhp, iap); > > - if (err) > > - goto out; > > - size_change = 1; > > + struct path path = { > > + .mnt = fhp->fh_export->ex_path.mnt, > > + .dentry = dentry, > > + }; > > > > - /* > > - * RFC5661, Section 18.30.4: > > - * Changing the size of a file with SETATTR indirectly > > - * changes the time_modify and change attributes. 
> > - * > > - * (and similar for the older RFCs) > > - */ > > - if (iap->ia_size != i_size_read(inode)) > > - iap->ia_valid |= ATTR_MTIME; > > + host_err = vfs_truncate(&path, iap->ia_size); > > + if (host_err) > > + goto out_host_err; > > + > > + iap->ia_valid &= ~ATTR_SIZE; > > + if (!iap->ia_valid) > > + goto done; > > } > > > > iap->ia_valid |= ATTR_CTIME; > > > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > > - err = nfserr_notsync; > > - goto out_put_write_access; > > - } > > - > > fh_lock(fhp); > > host_err = notify_change(dentry, iap, NULL); > > fh_unlock(fhp); > > - err = nfserrno(host_err); > > + if (host_err) > > + goto out_host_err; > > > > -out_put_write_access: > > - if (size_change) > > - put_write_access(inode); > > - if (!err) > > - err = nfserrno(commit_metadata(fhp)); > > -out: > > - return err; > > +done: > > + host_err = commit_metadata(fhp); > > +out_host_err: > > + return nfserrno(host_err); > > } > > > > #if defined(CONFIG_NFSD_V4) > > -- > Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 23+ messages in thread
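For readers following the thread, the control flow the quoted patch is after (handle the size change in its own call, then pass whatever is left to a second call) can be sketched in user space roughly as below. This is only an illustration under stated assumptions: the DEMO_ATTR_* bits, struct demo_iattr, struct demo_log, demo_setattr() and demo_split_setattr() are all hypothetical stand-ins invented here, not kernel interfaces, and the sketch leaves out locking, permission checks and the mtime question discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the kernel's ATTR_* constants. */
enum {
	DEMO_ATTR_GID   = 1u << 0,
	DEMO_ATTR_SIZE  = 1u << 1,
	DEMO_ATTR_MTIME = 1u << 2,
	DEMO_ATTR_CTIME = 1u << 3,
};

/* Hypothetical, cut-down analogue of struct iattr. */
struct demo_iattr {
	unsigned int valid;	/* bitmap of attributes to apply */
	long long size;		/* new size, meaningful if DEMO_ATTR_SIZE is set */
};

/* Records the attribute mask of each call so the split can be inspected. */
struct demo_log {
	unsigned int masks[2];
	int ncalls;
};

/* Stand-in for the file system's setattr entry point: just logs the mask. */
static int demo_setattr(struct demo_log *log, unsigned int valid)
{
	assert(log->ncalls < 2);
	log->masks[log->ncalls++] = valid;
	return 0;
}

/* Apply a mixed attribute request as at most two separate calls. */
int demo_split_setattr(struct demo_log *log, struct demo_iattr *attrs)
{
	int err;

	assert(log != NULL && attrs != NULL);

	if (attrs->valid & DEMO_ATTR_SIZE) {
		/* The size change goes first and is never mixed with other bits. */
		err = demo_setattr(log, DEMO_ATTR_SIZE);
		if (err)
			return err;
		attrs->valid &= ~(unsigned int)DEMO_ATTR_SIZE;
	}

	/* Nothing but the size was requested: no second call is needed. */
	if (!attrs->valid)
		return 0;

	/* Remaining attributes go in a second call that also bumps ctime. */
	return demo_setattr(log, attrs->valid | DEMO_ATTR_CTIME);
}
```

A request carrying both the size and the group bit ends up as two logged calls, one with only the size bit and one with the group and ctime bits, which is the property the XFS assert mentioned in the commit message checks for on the file system side.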
end of thread, other threads:[~2017-02-21 15:14 UTC | newest] Thread overview: 23+ messages 2017-02-20 6:21 split setattr operations take 2 Christoph Hellwig 2017-02-20 6:21 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig 2017-02-20 22:23 ` J. Bruce Fields 2017-02-21 15:07 ` Chuck Lever 2017-02-21 15:14 ` J. Bruce Fields -- strict thread matches above, loose matches on Subject: below -- 2017-01-22 16:54 setattr ATTR_SIZE vs the rest Christoph Hellwig 2017-01-22 16:54 ` [PATCH] nfsd: special case truncates some more Christoph Hellwig 2017-01-23 12:21 ` Jeff Layton 2017-01-23 12:33 ` Christoph Hellwig 2017-01-23 15:36 ` Christoph Hellwig 2017-01-23 15:36 ` Christoph Hellwig 2017-01-23 15:52 ` Jeff Layton 2017-01-23 16:05 ` Christoph Hellwig 2017-01-23 16:14 ` Jeff Layton 2017-01-23 16:20 ` Trond Myklebust 2017-01-23 16:20 ` Trond Myklebust 2017-01-23 16:26 ` hch 2017-01-23 17:25 ` Trond Myklebust 2017-01-23 17:25 ` Trond Myklebust 2017-01-23 17:38 ` hch 2017-01-23 17:42 ` Trond Myklebust 2017-01-23 17:42 ` Trond Myklebust 2017-01-24 16:25 ` J. Bruce Fields 2017-01-24 22:02 ` J. Bruce Fields