* lease/delegation/oplock semantics @ 2011-06-10 0:09 J. Bruce Fields 2011-06-10 0:10 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races J. Bruce Fields 0 siblings, 1 reply; 13+ messages in thread From: J. Bruce Fields @ 2011-06-10 0:09 UTC (permalink / raw) To: linux-nfs-u79uwXL29TY76Z2rM5mHXA Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, samba-technical-w/Ol4Ecudpl8XjKLYN78aQ, Christoph Hellwig, Mimi Zohar, Eric Paris Resending with some extra cc's, hope that's OK: NFSv4 (along with, from what I'm told, Samba) has a problem with read leases, which is that they are currently only broken on write opens, whereas they should really be broken on any operation that changes an inode's metadata or any of the links pointing to it. Also, to break leases in a non-racy way it's not sufficient to have a single break_lease() call from the vfs code; we also need to prevent anyone from acquiring a new lease while the operation is still in progress. So here's one attempt to deal with those problems, by adding to the inode a counter which is incremented whenever we start such an operation and decremented when we finish it. I only handle unlink; we'd also need to do link, rename, chown, and chmod, at least. Comments? Better ideas? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-10 0:09 lease/delegation/oplock semantics J. Bruce Fields @ 2011-06-10 0:10 ` J. Bruce Fields [not found] ` <20110610001011.GD22215-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2011-06-10 20:24 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races Mimi Zohar 0 siblings, 2 replies; 13+ messages in thread From: J. Bruce Fields @ 2011-06-10 0:10 UTC (permalink / raw) To: linux-nfs Cc: linux-fsdevel, samba-technical, Christoph Hellwig, Mimi Zohar, Eric Paris From: J. Bruce Fields <bfields@redhat.com> Since break_lease is called before i_writecount is incremented, there's a window between the two where a setlease call would have no way to know that an open is about to happen. We fix this by adding a new inode field, i_blockleases, that is incremented while a lease-breaking operation is in progress. We will later reuse i_blockleases to enforce lease-breaking for rename, unlink, etc. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/inode.c | 1 + fs/locks.c | 28 ++++++++++++++++++++++++++++ fs/namei.c | 6 +++++- include/linux/fs.h | 3 +++ 4 files changed, 37 insertions(+), 1 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index 33c963d..4f253a2 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -198,6 +198,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) inode->i_uid = 0; inode->i_gid = 0; atomic_set(&inode->i_writecount, 0); + atomic_set(&inode->i_blockleases, 0); inode->i_size = 0; inode->i_blocks = 0; inode->i_bytes = 0; diff --git a/fs/locks.c b/fs/locks.c index 0a4f50d..7699b1b 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1164,6 +1164,32 @@ static void time_out_leases(struct inode *inode) } } +/* Disallow all leases (read or write): */ +void disallow_leases(struct inode *inode, int flags) +{ + if (!inode) + return; + if (!S_ISREG(inode->i_mode)) + return; + if ((flags & O_ACCMODE) == O_RDONLY) + return; + atomic_inc(&inode->i_blockleases); +} +EXPORT_SYMBOL_GPL(disallow_leases); + +void reallow_leases(struct inode *inode, int flags) +{ + if (!inode) + return; + if (!S_ISREG(inode->i_mode)) + return; + if ((flags & O_ACCMODE) == O_RDONLY) + return; + BUG_ON(atomic_read(&inode->i_blockleases) <= 0); + atomic_dec(&inode->i_blockleases); +} +EXPORT_SYMBOL_GPL(reallow_leases); + /** * __break_lease - revoke all outstanding leases on file * @inode: the inode of the file to return @@ -1369,6 +1395,8 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) if (arg != F_UNLCK) { error = -EAGAIN; + if (atomic_read(&inode->i_blockleases)) + goto out; if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) goto out; if ((arg == F_WRLCK) diff --git a/fs/namei.c b/fs/namei.c index 3cb616d..079e68c 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2277,10 +2277,14 @@ ok: want_write = 1; } common: + disallow_leases(nd->path.dentry->d_inode, open_flag); error = may_open(&nd->path, acc_mode, open_flag); - if (error) + if (error) { + reallow_leases(nd->path.dentry->d_inode, open_flag); goto exit; + } filp = nameidata_to_filp(nd); + reallow_leases(nd->path.dentry->d_inode, open_flag); if (!IS_ERR(filp)) { error = ima_file_check(filp, op->acc_mode); if (error) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 1b95af3..daf8443 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -796,6 +796,7 @@ struct inode { #ifdef CONFIG_IMA atomic_t i_readcount; /* struct files open RO */ #endif + atomic_t i_blockleases; /* setlease fails when >0 */ atomic_t i_writecount; #ifdef CONFIG_SECURITY void *i_security; @@ -1161,6 +1162,8 @@ extern void lease_get_mtime(struct inode *, struct timespec *time); extern int generic_setlease(struct file *, long, struct file_lock **); extern int vfs_setlease(struct file *, long, struct file_lock **); extern int lease_modify(struct file_lock **, int); +extern void disallow_leases(struct inode *, int flags); +extern void reallow_leases(struct inode *, int flags); extern int lock_may_read(struct inode *, loff_t start, unsigned long count); extern int lock_may_write(struct inode *, loff_t start, unsigned long count); extern void lock_flocks(void); -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
[parent not found: <20110610001011.GD22215-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* [PATCH 2/2] locks: break lease on unlink [not found] ` <20110610001011.GD22215-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2011-06-10 0:11 ` J. Bruce Fields 0 siblings, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2011-06-10 0:11 UTC (permalink / raw) To: linux-nfs-u79uwXL29TY76Z2rM5mHXA Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, samba-technical-w/Ol4Ecudpl8XjKLYN78aQ, Christoph Hellwig, Mimi Zohar, Eric Paris NFSv4 uses leases (delegations) to allow clients to do opens locally. An open takes a component name, so to do a local open a client needs to know that at least that last component name points at the same time as long as they hold the delegation. For that reason, the NFSv4 spec requires that delegations be broken on unlink. Waiting for a lease to be broken may mean waiting for an nfs client or a user process to give it up, so doing that while holding the i_mutex would risk deadlocks. So instead we do a non-blocking lease break, then drop the i_mutex and do a blocking lease break if necessary. Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/namei.c | 32 ++++++++++++++++++++++++++++---- fs/nfsd/vfs.c | 6 +++++- 2 files changed, 33 insertions(+), 5 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 079e68c..4661469 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2799,6 +2799,7 @@ static long do_unlinkat(int dfd, const char __user *pathname) nd.flags &= ~LOOKUP_PARENT; +retry: mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT); dentry = lookup_hash(&nd); error = PTR_ERR(dentry); @@ -2806,15 +2807,27 @@ static long do_unlinkat(int dfd, const char __user *pathname) /* Why not before? Because we want correct error value */ if (nd.last.name[nd.last.len]) goto slashes; - inode = dentry->d_inode; - if (inode) - ihold(inode); + if (inode && inode != dentry->d_inode) { + reallow_leases(inode, O_WRONLY); + iput(inode); + inode = NULL; + } + if (!inode) { + inode = dentry->d_inode; + if (inode) + ihold(inode); + disallow_leases(inode, O_WRONLY); + } error = mnt_want_write(nd.path.mnt); if (error) goto exit2; error = security_path_unlink(&nd.path, dentry); if (error) goto exit3; + if (inode) + error = break_lease(inode, O_WRONLY|O_NONBLOCK); + if (error) + goto exit3; error = vfs_unlink(nd.path.dentry->d_inode, dentry); exit3: mnt_drop_write(nd.path.mnt); @@ -2822,8 +2835,19 @@ exit3: dput(dentry); } mutex_unlock(&nd.path.dentry->d_inode->i_mutex); - if (inode) + if (inode) { + /* XXX: safe to use EWOULDBLOCK==EAGAIN to do this?: */ + if (error == -EWOULDBLOCK) { + /* + * Wait this time, then retry with leases left + * disabled: + */ + break_lease(inode, O_WRONLY); + goto retry; + } + reallow_leases(inode, O_WRONLY); iput(inode); /* truncate the inode here */ + } exit1: path_put(&nd.path); putname(name); diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index f4e056a..ef01b98 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1826,13 +1826,17 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, if (host_err) goto out_put; + disallow_leases(rdentry->d_inode, O_WRONLY); host_err = nfsd_break_lease(rdentry->d_inode); - if (host_err) + if (host_err) { + reallow_leases(rdentry->d_inode, O_WRONLY); goto out_drop_write; + } if (type != S_IFDIR) host_err = vfs_unlink(dirp, rdentry); else host_err = vfs_rmdir(dirp, rdentry); + reallow_leases(rdentry->d_inode, O_WRONLY); if (!host_err) host_err = commit_metadata(fhp); out_drop_write: -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-10 0:10 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races J. Bruce Fields [not found] ` <20110610001011.GD22215-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2011-06-10 20:24 ` Mimi Zohar 2011-06-10 21:34 ` J. Bruce Fields 1 sibling, 1 reply; 13+ messages in thread From: Mimi Zohar @ 2011-06-10 20:24 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > From: J. Bruce Fields <bfields@redhat.com> > > Since break_lease is called before i_writecount is incremented, there's > a window between the two where a setlease call would have no way to know > that an open is about to happen. So unless the break_lease() call is moved from may_open() to after nameidata_to_filp(), I don't see any other options. Mimi > We fix this by adding a new inode field, i_blockleases, that is > incremented while a lease-breaking operation is in progress. > > We will later reuse i_blockleases to enforce lease-breaking for rename, > unlink, etc. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > --- > fs/inode.c | 1 + > fs/locks.c | 28 ++++++++++++++++++++++++++++ > fs/namei.c | 6 +++++- > include/linux/fs.h | 3 +++ > 4 files changed, 37 insertions(+), 1 deletions(-) > > diff --git a/fs/inode.c b/fs/inode.c > index 33c963d..4f253a2 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -198,6 +198,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) > inode->i_uid = 0; > inode->i_gid = 0; > atomic_set(&inode->i_writecount, 0); > + atomic_set(&inode->i_blockleases, 0); > inode->i_size = 0; > inode->i_blocks = 0; > inode->i_bytes = 0; > diff --git a/fs/locks.c b/fs/locks.c > index 0a4f50d..7699b1b 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -1164,6 +1164,32 @@ static void time_out_leases(struct inode *inode) > } > } > > +/* Disallow all leases (read or write): */ > +void disallow_leases(struct inode *inode, int flags) > +{ > + if (!inode) > + return; > + if (!S_ISREG(inode->i_mode)) > + return; > + if ((flags & O_ACCMODE) == O_RDONLY) > + return; > + atomic_inc(&inode->i_blockleases); > +} > +EXPORT_SYMBOL_GPL(disallow_leases); > + > +void reallow_leases(struct inode *inode, int flags) > +{ > + if (!inode) > + return; > + if (!S_ISREG(inode->i_mode)) > + return; > + if ((flags & O_ACCMODE) == O_RDONLY) > + return; > + BUG_ON(atomic_read(&inode->i_blockleases) <= 0); > + atomic_dec(&inode->i_blockleases); > +} > +EXPORT_SYMBOL_GPL(reallow_leases); > + > /** > * __break_lease - revoke all outstanding leases on file > * @inode: the inode of the file to return > @@ -1369,6 +1395,8 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) > > if (arg != F_UNLCK) { > error = -EAGAIN; > + if (atomic_read(&inode->i_blockleases)) > + goto out; > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > goto out; > if ((arg == F_WRLCK) > diff --git a/fs/namei.c b/fs/namei.c > index 3cb616d..079e68c 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -2277,10 +2277,14 @@ ok: > want_write = 1; > } > common: > + disallow_leases(nd->path.dentry->d_inode, open_flag); > error = may_open(&nd->path, acc_mode, open_flag); > - if (error) > + if (error) { > + reallow_leases(nd->path.dentry->d_inode, open_flag); > goto exit; > + } > filp = nameidata_to_filp(nd); > + reallow_leases(nd->path.dentry->d_inode, open_flag); > if (!IS_ERR(filp)) { > error = ima_file_check(filp, op->acc_mode); > if (error) { > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 1b95af3..daf8443 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -796,6 +796,7 @@ struct inode { > #ifdef CONFIG_IMA > atomic_t i_readcount; /* struct files open RO */ > #endif > + atomic_t i_blockleases; /* setlease fails when >0 */ > atomic_t i_writecount; > #ifdef CONFIG_SECURITY > void *i_security; > @@ -1161,6 +1162,8 @@ extern void lease_get_mtime(struct inode *, struct timespec *time); > extern int generic_setlease(struct file *, long, struct file_lock **); > extern int vfs_setlease(struct file *, long, struct file_lock **); > extern int lease_modify(struct file_lock **, int); > +extern void disallow_leases(struct inode *, int flags); > +extern void reallow_leases(struct inode *, int flags); > extern int lock_may_read(struct inode *, loff_t start, unsigned long count); > extern int lock_may_write(struct inode *, loff_t start, unsigned long count); > extern void lock_flocks(void); ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-10 20:24 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races Mimi Zohar @ 2011-06-10 21:34 ` J. Bruce Fields 2011-06-12 4:08 ` J. Bruce Fields 0 siblings, 1 reply; 13+ messages in thread From: J. Bruce Fields @ 2011-06-10 21:34 UTC (permalink / raw) To: Mimi Zohar Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > From: J. Bruce Fields <bfields@redhat.com> > > > > Since break_lease is called before i_writecount is incremented, there's > > a window between the two where a setlease call would have no way to know > > that an open is about to happen. > > So unless the break_lease() call is moved from may_open() to after > nameidata_to_filp(), I don't see any other options. Actually, offhand I can't see why that wouldn't be OK. Though I think we still end up needing something like i_blockleases to handle unlink, link, rename, chown, and chmod. --b. > > Mimi > > > We fix this by adding a new inode field, i_blockleases, that is > > incremented while a lease-breaking operation is in progress. > > > > We will later reuse i_blockleases to enforce lease-breaking for rename, > > unlink, etc. > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > --- > > fs/inode.c | 1 + > > fs/locks.c | 28 ++++++++++++++++++++++++++++ > > fs/namei.c | 6 +++++- > > include/linux/fs.h | 3 +++ > > 4 files changed, 37 insertions(+), 1 deletions(-) > > > > diff --git a/fs/inode.c b/fs/inode.c > > index 33c963d..4f253a2 100644 > > --- a/fs/inode.c > > +++ b/fs/inode.c > > @@ -198,6 +198,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) > > inode->i_uid = 0; > > inode->i_gid = 0; > > atomic_set(&inode->i_writecount, 0); > > + atomic_set(&inode->i_blockleases, 0); > > inode->i_size = 0; > > inode->i_blocks = 0; > > inode->i_bytes = 0; > > diff --git a/fs/locks.c b/fs/locks.c > > index 0a4f50d..7699b1b 100644 > > --- a/fs/locks.c > > +++ b/fs/locks.c > > @@ -1164,6 +1164,32 @@ static void time_out_leases(struct inode *inode) > > } > > } > > > > +/* Disallow all leases (read or write): */ > > +void disallow_leases(struct inode *inode, int flags) > > +{ > > + if (!inode) > > + return; > > + if (!S_ISREG(inode->i_mode)) > > + return; > > + if ((flags & O_ACCMODE) == O_RDONLY) > > + return; > > + atomic_inc(&inode->i_blockleases); > > +} > > +EXPORT_SYMBOL_GPL(disallow_leases); > > + > > +void reallow_leases(struct inode *inode, int flags) > > +{ > > + if (!inode) > > + return; > > + if (!S_ISREG(inode->i_mode)) > > + return; > > + if ((flags & O_ACCMODE) == O_RDONLY) > > + return; > > + BUG_ON(atomic_read(&inode->i_blockleases) <= 0); > > + atomic_dec(&inode->i_blockleases); > > +} > > +EXPORT_SYMBOL_GPL(reallow_leases); > > + > > /** > > * __break_lease - revoke all outstanding leases on file > > * @inode: the inode of the file to return > > @@ -1369,6 +1395,8 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) > > > > if (arg != F_UNLCK) { > > error = -EAGAIN; > > + if (atomic_read(&inode->i_blockleases)) > > + goto out; > > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > > goto out; > > if ((arg == F_WRLCK) > > diff --git a/fs/namei.c b/fs/namei.c > > index 3cb616d..079e68c 100644 > > --- a/fs/namei.c > > +++ b/fs/namei.c > > @@ -2277,10 +2277,14 @@ ok: > > want_write = 1; > > } > > common: > > + disallow_leases(nd->path.dentry->d_inode, open_flag); > > error = may_open(&nd->path, acc_mode, open_flag); > > - if (error) > > + if (error) { > > + reallow_leases(nd->path.dentry->d_inode, open_flag); > > goto exit; > > + } > > filp = nameidata_to_filp(nd); > > + reallow_leases(nd->path.dentry->d_inode, open_flag); > > if (!IS_ERR(filp)) { > > error = ima_file_check(filp, op->acc_mode); > > if (error) { > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > index 1b95af3..daf8443 100644 > > --- a/include/linux/fs.h > > +++ b/include/linux/fs.h > > @@ -796,6 +796,7 @@ struct inode { > > #ifdef CONFIG_IMA > > atomic_t i_readcount; /* struct files open RO */ > > #endif > > + atomic_t i_blockleases; /* setlease fails when >0 */ > > atomic_t i_writecount; > > #ifdef CONFIG_SECURITY > > void *i_security; > > @@ -1161,6 +1162,8 @@ extern void lease_get_mtime(struct inode *, struct timespec *time); > > extern int generic_setlease(struct file *, long, struct file_lock **); > > extern int vfs_setlease(struct file *, long, struct file_lock **); > > extern int lease_modify(struct file_lock **, int); > > +extern void disallow_leases(struct inode *, int flags); > > +extern void reallow_leases(struct inode *, int flags); > > extern int lock_may_read(struct inode *, loff_t start, unsigned long count); > > extern int lock_may_write(struct inode *, loff_t start, unsigned long count); > > extern void lock_flocks(void); > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-10 21:34 ` J. Bruce Fields @ 2011-06-12 4:08 ` J. Bruce Fields 2011-06-12 19:10 ` Mimi Zohar 0 siblings, 1 reply; 13+ messages in thread From: J. Bruce Fields @ 2011-06-12 4:08 UTC (permalink / raw) To: Mimi Zohar Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > a window between the two where a setlease call would have no way to know > > > that an open is about to happen. > > > > So unless the break_lease() call is moved from may_open() to after > > nameidata_to_filp(), I don't see any other options. > > Actually, offhand I can't see why that wouldn't be OK. > > Though I think we still end up needing something like i_blockleases to > handle unlink, link, rename, chown, and chmod. Well, I guess there's a bizarre alternative that wouldn't require a new inode field: What we care about is conflicts between read leases and operations that modify the metadata of the inode or the set of names pointing to it. As far as I can tell those operations all take the i_mutex either on the inode itself or on the parents of one of its aliases. So, you could prevent break_lease/setlease races by calling setlease under *all* of those i_mutexes: - take i_mutex on the inode - take i_lock to prevent the set of aliases from changing - take i_mutex for parent of each alias - set the lease - drop the parent i_mutexes, etc. where the i_mutexes would all be taken with mutex_trylock, and we'd just fail the whole setlease if any of them failed. ??? --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-12 4:08 ` J. Bruce Fields @ 2011-06-12 19:10 ` Mimi Zohar 2011-06-12 19:12 ` J. Bruce Fields 0 siblings, 1 reply; 13+ messages in thread From: Mimi Zohar @ 2011-06-12 19:10 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Sun, 2011-06-12 at 00:08 -0400, J. Bruce Fields wrote: > On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > > a window between the two where a setlease call would have no way to know > > > > that an open is about to happen. > > > > > > So unless the break_lease() call is moved from may_open() to after > > > nameidata_to_filp(), I don't see any other options. > > > > Actually, offhand I can't see why that wouldn't be OK. > > > > Though I think we still end up needing something like i_blockleases to > > handle unlink, link, rename, chown, and chmod. > > Well, I guess there's a bizarre alternative that wouldn't require a new > inode field: In lieu of adding a new inode field, another possible option, a bit kludgy, would be extending i_flock with an additional fl_flag FL_BLOCKLEASE. #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) Mimi > What we care about is conflicts between read leases and operations that > modify the metadata of the inode or the set of names pointing to it. > > As far as I can tell those operations all take the i_mutex either on the > inode itself or on the parents of one of its aliases. > > So, you could prevent break_lease/setlease races by calling setlease > under *all* of those i_mutexes: > > - take i_mutex on the inode > - take i_lock to prevent the set of aliases from changing > - take i_mutex for parent of each alias > - set the lease > - drop the parent i_mutexes, etc. > > where the i_mutexes would all be taken with mutex_trylock, and we'd just > fail the whole setlease if any of them failed. > > ??? > > --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-12 19:10 ` Mimi Zohar @ 2011-06-12 19:12 ` J. Bruce Fields [not found] ` <20110612191220.GK12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 13+ messages in thread From: J. Bruce Fields @ 2011-06-12 19:12 UTC (permalink / raw) To: Mimi Zohar Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Sun, Jun 12, 2011 at 03:10:04PM -0400, Mimi Zohar wrote: > On Sun, 2011-06-12 at 00:08 -0400, J. Bruce Fields wrote: > > On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > > > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > > > a window between the two where a setlease call would have no way to know > > > > > that an open is about to happen. > > > > > > > > So unless the break_lease() call is moved from may_open() to after > > > > nameidata_to_filp(), I don't see any other options. > > > > > > Actually, offhand I can't see why that wouldn't be OK. > > > > > > Though I think we still end up needing something like i_blockleases to > > > handle unlink, link, rename, chown, and chmod. > > > > Well, I guess there's a bizarre alternative that wouldn't require a new > > inode field: > > In lieu of adding a new inode field, another possible option, a bit > kludgy, would be extending i_flock with an additional fl_flag > FL_BLOCKLEASE. > > #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) Alas, that would mean adding and removing one of these file locks around every single link, unlink, rename,.... --b. > > Mimi > > > What we care about is conflicts between read leases and operations that > > modify the metadata of the inode or the set of names pointing to it. > > > > As far as I can tell those operations all take the i_mutex either on the > > inode itself or on the parents of one of its aliases. > > > > So, you could prevent break_lease/setlease races by calling setlease > > under *all* of those i_mutexes: > > > > - take i_mutex on the inode > > - take i_lock to prevent the set of aliases from changing > > - take i_mutex for parent of each alias > > - set the lease > > - drop the parent i_mutexes, etc. > > > > where the i_mutexes would all be taken with mutex_trylock, and we'd just > > fail the whole setlease if any of them failed. > > > > ??? > > > > --b. > > ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20110612191220.GK12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races [not found] ` <20110612191220.GK12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2011-06-12 20:54 ` Mimi Zohar 2011-06-13 12:19 ` J. Bruce Fields 0 siblings, 1 reply; 13+ messages in thread From: Mimi Zohar @ 2011-06-12 20:54 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, samba-technical-w/Ol4Ecudpl8XjKLYN78aQ, Christoph Hellwig, Eric Paris On Sun, 2011-06-12 at 15:12 -0400, J. Bruce Fields wrote: > On Sun, Jun 12, 2011 at 03:10:04PM -0400, Mimi Zohar wrote: > > On Sun, 2011-06-12 at 00:08 -0400, J. Bruce Fields wrote: > > > On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > > > > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > > > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > > > > From: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > > > > > > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > > > > a window between the two where a setlease call would have no way to know > > > > > > that an open is about to happen. > > > > > > > > > > So unless the break_lease() call is moved from may_open() to after > > > > > nameidata_to_filp(), I don't see any other options. > > > > > > > > Actually, offhand I can't see why that wouldn't be OK. > > > > > > > > Though I think we still end up needing something like i_blockleases to > > > > handle unlink, link, rename, chown, and chmod. > > > > > > Well, I guess there's a bizarre alternative that wouldn't require a new > > > inode field: > > > > In lieu of adding a new inode field, another possible option, a bit > > kludgy, would be extending i_flock with an additional fl_flag > > FL_BLOCKLEASE. > > > > #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) > > Alas, that would mean adding and removing one of these file locks around > every single link, unlink, rename,.... > > --b. You're adding a call to break_lease() for each of them. Currently __break_lease() is only called if a lease exists. Assuming there aren't any existing leases, couldn't break_lease() call something like block_lease()? The free would be after the link, unlink, ..., completed/failed. (You wouldn't actually need to alloc/free the 'struct file_lock' each time, just set the pointer and reset to NULL.) Mimi > > > > Mimi > > > > > What we care about is conflicts between read leases and operations that > > > modify the metadata of the inode or the set of names pointing to it. > > > > > > As far as I can tell those operations all take the i_mutex either on the > > > inode itself or on the parents of one of its aliases. > > > > > > So, you could prevent break_lease/setlease races by calling setlease > > > under *all* of those i_mutexes: > > > > > > - take i_mutex on the inode > > > - take i_lock to prevent the set of aliases from changing > > > - take i_mutex for parent of each alias > > > - set the lease > > > - drop the parent i_mutexes, etc. > > > > > > where the i_mutexes would all be taken with mutex_trylock, and we'd just > > > fail the whole setlease if any of them failed. > > > > > > ??? > > > > > > --b. > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-12 20:54 ` Mimi Zohar @ 2011-06-13 12:19 ` J. Bruce Fields 2011-06-13 20:37 ` Mimi Zohar [not found] ` <20110613121939.GL12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 2 replies; 13+ messages in thread From: J. Bruce Fields @ 2011-06-13 12:19 UTC (permalink / raw) To: Mimi Zohar; +Cc: linux-fsdevel, linux-nfs, samba-technical, Eric Paris On Sun, Jun 12, 2011 at 04:54:33PM -0400, Mimi Zohar wrote: > On Sun, 2011-06-12 at 15:12 -0400, J. Bruce Fields wrote: > > On Sun, Jun 12, 2011 at 03:10:04PM -0400, Mimi Zohar wrote: > > > On Sun, 2011-06-12 at 00:08 -0400, J. Bruce Fields wrote: > > > > On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > > > > > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > > > > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > > > > > a window between the two where a setlease call would have no way to know > > > > > > > that an open is about to happen. > > > > > > > > > > > > So unless the break_lease() call is moved from may_open() to after > > > > > > nameidata_to_filp(), I don't see any other options. > > > > > > > > > > Actually, offhand I can't see why that wouldn't be OK. > > > > > > > > > > Though I think we still end up needing something like i_blockleases to > > > > > handle unlink, link, rename, chown, and chmod. > > > > > > > > Well, I guess there's a bizarre alternative that wouldn't require a new > > > > inode field: > > > > > > In lieu of adding a new inode field, another possible option, a bit > > > kludgy, would be extending i_flock with an additional fl_flag > > > FL_BLOCKLEASE. > > > > > > #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) > > > > Alas, that would mean adding and removing one of these file locks around > > every single link, unlink, rename,.... > > > > --b. > > You're adding a call to break_lease() for each of them. Currently > __break_lease() is only called if a lease exists. Assuming there aren't > any existing leases, couldn't break_lease() call something like > block_lease()? The free would be after the link, unlink, ..., > completed/failed. > > (You wouldn't actually need to alloc/free the 'struct file_lock' each > time, just set the pointer and reset to NULL.) Well, the pointer has to be set to something. I suppose we could put a struct file_lock on the stack. --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-13 12:19 ` J. Bruce Fields @ 2011-06-13 20:37 ` Mimi Zohar 2011-06-14 0:35 ` J. Bruce Fields [not found] ` <20110613121939.GL12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 1 sibling, 1 reply; 13+ messages in thread From: Mimi Zohar @ 2011-06-13 20:37 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-fsdevel, linux-nfs, samba-technical, Eric Paris On Mon, 2011-06-13 at 08:19 -0400, J. Bruce Fields wrote: > On Sun, Jun 12, 2011 at 04:54:33PM -0400, Mimi Zohar wrote: > > On Sun, 2011-06-12 at 15:12 -0400, J. Bruce Fields wrote: > > > On Sun, Jun 12, 2011 at 03:10:04PM -0400, Mimi Zohar wrote: > > > > On Sun, 2011-06-12 at 00:08 -0400, J. Bruce Fields wrote: > > > > > On Fri, Jun 10, 2011 at 05:34:46PM -0400, J. Bruce Fields wrote: > > > > > > On Fri, Jun 10, 2011 at 04:24:00PM -0400, Mimi Zohar wrote: > > > > > > > On Thu, 2011-06-09 at 20:10 -0400, J. Bruce Fields wrote: > > > > > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > > > > > > > > > Since break_lease is called before i_writecount is incremented, there's > > > > > > > > a window between the two where a setlease call would have no way to know > > > > > > > > that an open is about to happen. > > > > > > > > > > > > > > So unless the break_lease() call is moved from may_open() to after > > > > > > > nameidata_to_filp(), I don't see any other options. > > > > > > > > > > > > Actually, offhand I can't see why that wouldn't be OK. > > > > > > > > > > > > Though I think we still end up needing something like i_blockleases to > > > > > > handle unlink, link, rename, chown, and chmod. > > > > > > > > > > Well, I guess there's a bizarre alternative that wouldn't require a new > > > > > inode field: > > > > > > > > In lieu of adding a new inode field, another possible option, a bit > > > > kludgy, would be extending i_flock with an additional fl_flag > > > > FL_BLOCKLEASE. > > > > > > > > #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) > > > > > > Alas, that would mean adding and removing one of these file locks around > > > every single link, unlink, rename,.... > > > > > > --b. > > > > You're adding a call to break_lease() for each of them. Currently > > __break_lease() is only called if a lease exists. Assuming there aren't > > any existing leases, couldn't break_lease() call something like > > block_lease()? The free would be after the link, unlink, ..., > > completed/failed. > > > > (You wouldn't actually need to alloc/free the 'struct file_lock' each > > time, just set the pointer and reset to NULL.) > > Well, the pointer has to be set to something. I suppose we could put a > struct file_lock on the stack. > > --b. Instead of putting the struct file_lock on the stack, how about creating a dummy list containing a single element with FL_BLOCKLEASE set? Mimi ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races 2011-06-13 20:37 ` Mimi Zohar @ 2011-06-14 0:35 ` J. Bruce Fields 0 siblings, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2011-06-14 0:35 UTC (permalink / raw) To: Mimi Zohar Cc: linux-nfs, linux-fsdevel, samba-technical, Christoph Hellwig, Eric Paris On Mon, Jun 13, 2011 at 04:37:03PM -0400, Mimi Zohar wrote: > On Mon, 2011-06-13 at 08:19 -0400, J. Bruce Fields wrote: > > On Sun, Jun 12, 2011 at 04:54:33PM -0400, Mimi Zohar wrote: > > > On Sun, 2011-06-12 at 15:12 -0400, J. Bruce Fields wrote: > > > > On Sun, Jun 12, 2011 at 03:10:04PM -0400, Mimi Zohar wrote: > > > > > In lieu of adding a new inode field, another possible option, a bit > > > > > kludgy, would be extending i_flock with an additional fl_flag > > > > > FL_BLOCKLEASE. > > > > > > > > > > #define IS_BLOCKLEASE(fl) (fl->fl_flags & FL_BLOCKLEASE) > > > > > > > > Alas, that would mean adding and removing one of these file locks around > > > > every single link, unlink, rename,.... > > > > > > > > --b. > > > > > > You're adding a call to break_lease() for each of them. Currently > > > __break_lease() is only called if a lease exists. Assuming there aren't > > > any existing leases, couldn't break_lease() call something like > > > block_lease()? The free would be after the link, unlink, ..., > > > completed/failed. > > > > > > (You wouldn't actually need to alloc/free the 'struct file_lock' each > > > time, just set the pointer and reset to NULL.) > > > > Well, the pointer has to be set to something. I suppose we could put a > > struct file_lock on the stack. > > > > --b. > > Instead of putting the struct file_lock on the stack, how about creating > a dummy list containing a single element with FL_BLOCKLEASE set? I'm afraid I don't understand what you're proposing. Could you explain in more detail? --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20110613121939.GL12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH 1/2] locks: introduce i_blockleases to close lease races [not found] ` <20110613121939.GL12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2011-06-15 15:47 ` J. Bruce Fields 0 siblings, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2011-06-15 15:47 UTC (permalink / raw) To: Mimi Zohar Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, samba-technical-w/Ol4Ecudpl8XjKLYN78aQ, Christoph Hellwig, Eric Paris On Mon, Jun 13, 2011 at 08:19:39AM -0400, J. Bruce Fields wrote: > On Sun, Jun 12, 2011 at 04:54:33PM -0400, Mimi Zohar wrote: > > You're adding a call to break_lease() for each of them. Currently > > __break_lease() is only called if a lease exists. Assuming there aren't > > any existing leases, couldn't break_lease() call something like > > block_lease()? The free would be after the link, unlink, ..., > > completed/failed. > > > > (You wouldn't actually need to alloc/free the 'struct file_lock' each > > time, just set the pointer and reset to NULL.) > > Well, the pointer has to be set to something. I suppose we could put a > struct file_lock on the stack. The locking code is under a global spinlock--instead of an atomic inc and dec of inode->i_blockleases we'd be doing a pair of lock/unlocks of file_lock_lock. We could probably fix that: off the top of my head the only reason I see for a global lock is the stupid deadlock-detection code. Which is only needed for posix locks (and I'm not at all convinced it's needed even there). With that fixed, it would be a choice between an i_lock-protected list_add/list_del vs. an atomic inc/dec of a new inode field. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2011-06-15 15:47 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-06-10 0:09 lease/delegation/oplock semantics J. Bruce Fields 2011-06-10 0:10 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races J. Bruce Fields [not found] ` <20110610001011.GD22215-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2011-06-10 0:11 ` [PATCH 2/2] locks: break lease on unlink J. Bruce Fields 2011-06-10 20:24 ` [PATCH 1/2] locks: introduce i_blockleases to close lease races Mimi Zohar 2011-06-10 21:34 ` J. Bruce Fields 2011-06-12 4:08 ` J. Bruce Fields 2011-06-12 19:10 ` Mimi Zohar 2011-06-12 19:12 ` J. Bruce Fields [not found] ` <20110612191220.GK12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2011-06-12 20:54 ` Mimi Zohar 2011-06-13 12:19 ` J. Bruce Fields 2011-06-13 20:37 ` Mimi Zohar 2011-06-14 0:35 ` J. Bruce Fields [not found] ` <20110613121939.GL12149-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2011-06-15 15:47 ` J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).