* [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters
@ 2015-02-17 12:46 Jeff Layton
  2015-02-17 12:46 ` [PATCH 1/4] Revert "locks: keep a count of locks on the flctx lists" Jeff Layton
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 12:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

This patchset should fix the problems that Linus spotted in the flock
code yesterday, as well as another bug in the lease code that I spotted
while patching those up.

It'll be interesting to see if this fixes the problem that Kirill
reported yesterday. If these look acceptable, then I can send another
pull request, or you can just use the one below.

------------------------[snip]----------------------------------
The following changes since commit 1fa185ebcbcefdc5229c783450c9f0439a69f0c1:

  Merge tag 'cris-for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris (2015-02-15 18:02:02 -0800)

are available in the git repository at:

  git://git.samba.org/jlayton/linux.git tags/locks-v3.20-2

for you to fetch changes up to c07ef59a2e7f768ec39fb9c93419e27d09698e22:

  locks: only remove leases associated with the file being closed (2015-02-17 06:59:42 -0500)

Jeff Layton (4):
  Revert "locks: keep a count of locks on the flctx lists"
  locks: remove conditional lock release in middle of flock_lock_file
  locks: when upgrading, don't remove old flock lock until replacing
    with new one
  locks: only remove leases associated with the file being closed

 fs/ceph/locks.c    | 11 ++++++--
 fs/cifs/file.c     | 14 +++++++---
 fs/locks.c         | 78 ++++++++++++++++++++++++------------------------------
 include/linux/fs.h |  3 ---
 4 files changed, 53 insertions(+), 53 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] Revert "locks: keep a count of locks on the flctx lists"
  2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
@ 2015-02-17 12:46 ` Jeff Layton
  2015-02-17 12:46 ` [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file Jeff Layton
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 12:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

This reverts commit 9bd0f45b7037fcfa8b575c7e27d0431d6e6dc3bb.

Linus rightly pointed out that I failed to initialize the counters
when adding them, so they don't work as expected. Just revert this
patch for now.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/ceph/locks.c    | 11 +++++++++--
 fs/cifs/file.c     | 14 ++++++++++----
 fs/locks.c         | 45 ++++++++++++++++-----------------------------
 include/linux/fs.h |  3 ---
 4 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index 06ea5cd05cd9..0303da8e3233 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -242,9 +242,12 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl)
 /*
  * Fills in the passed counter variables, so you can prepare pagelist metadata
  * before calling ceph_encode_locks.
+ *
+ * FIXME: add counters to struct file_lock_context so we don't need to do this?
  */
 void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
 {
+	struct file_lock *lock;
 	struct file_lock_context *ctx;
 
 	*fcntl_count = 0;
@@ -252,8 +255,12 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
 
 	ctx = inode->i_flctx;
 	if (ctx) {
-		*fcntl_count = ctx->flc_posix_cnt;
-		*flock_count = ctx->flc_flock_cnt;
+		spin_lock(&ctx->flc_lock);
+		list_for_each_entry(lock, &ctx->flc_posix, fl_list)
+			++(*fcntl_count);
+		list_for_each_entry(lock, &ctx->flc_flock, fl_list)
+			++(*flock_count);
+		spin_unlock(&ctx->flc_lock);
 	}
 	dout("counted %d flock locks and %d fcntl locks",
 	     *flock_count, *fcntl_count);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8fe1f7a21b3e..a94b3e673182 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1129,7 +1129,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
 	struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
 	struct file_lock *flock;
 	struct file_lock_context *flctx = inode->i_flctx;
-	unsigned int i;
+	unsigned int count = 0, i;
 	int rc = 0, xid, type;
 	struct list_head locks_to_send, *el;
 	struct lock_to_push *lck, *tmp;
@@ -1140,14 +1140,20 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
 	if (!flctx)
 		goto out;
 
+	spin_lock(&flctx->flc_lock);
+	list_for_each(el, &flctx->flc_posix) {
+		count++;
+	}
+	spin_unlock(&flctx->flc_lock);
+
 	INIT_LIST_HEAD(&locks_to_send);
 
 	/*
-	 * Allocating flc_posix_cnt locks is enough because no FL_POSIX locks
-	 * can be added to the list while we are holding cinode->lock_sem that
+	 * Allocating count locks is enough because no FL_POSIX locks can be
+	 * added to the list while we are holding cinode->lock_sem that
 	 * protects locking operations of this inode.
 	 */
-	for (i = 0; i < flctx->flc_posix_cnt; i++) {
+	for (i = 0; i < count; i++) {
 		lck = kmalloc(sizeof(struct lock_to_push), GFP_KERNEL);
 		if (!lck) {
 			rc = -ENOMEM;
diff --git a/fs/locks.c b/fs/locks.c
index 4753218f308e..7998f670812c 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -681,21 +681,18 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
 }
 
 static void
-locks_insert_lock_ctx(struct file_lock *fl, int *counter,
-		      struct list_head *before)
+locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
 {
 	fl->fl_nspid = get_pid(task_tgid(current));
 	list_add_tail(&fl->fl_list, before);
-	++*counter;
 	locks_insert_global_locks(fl);
 }
 
 static void
-locks_unlink_lock_ctx(struct file_lock *fl, int *counter)
+locks_unlink_lock_ctx(struct file_lock *fl)
 {
 	locks_delete_global_locks(fl);
 	list_del_init(&fl->fl_list);
-	--*counter;
 	if (fl->fl_nspid) {
 		put_pid(fl->fl_nspid);
 		fl->fl_nspid = NULL;
@@ -704,10 +701,9 @@ locks_unlink_lock_ctx(struct file_lock *fl, int *counter)
 }
 
 static void
-locks_delete_lock_ctx(struct file_lock *fl, int *counter,
-		      struct list_head *dispose)
+locks_delete_lock_ctx(struct file_lock *fl, struct list_head *dispose)
 {
-	locks_unlink_lock_ctx(fl, counter);
+	locks_unlink_lock_ctx(fl);
 	if (dispose)
 		list_add(&fl->fl_list, dispose);
 	else
@@ -895,7 +891,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 		if (request->fl_type == fl->fl_type)
 			goto out;
 		found = true;
-		locks_delete_lock_ctx(fl, &ctx->flc_flock_cnt, &dispose);
+		locks_delete_lock_ctx(fl, &dispose);
 		break;
 	}
 
@@ -929,7 +925,7 @@ find_conflict:
 	if (request->fl_flags & FL_ACCESS)
 		goto out;
 	locks_copy_lock(new_fl, request);
-	locks_insert_lock_ctx(new_fl, &ctx->flc_flock_cnt, &ctx->flc_flock);
+	locks_insert_lock_ctx(new_fl, &ctx->flc_flock);
 	new_fl = NULL;
 	error = 0;
 
@@ -1046,8 +1042,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 			else
 				request->fl_end = fl->fl_end;
 			if (added) {
-				locks_delete_lock_ctx(fl, &ctx->flc_posix_cnt,
-							&dispose);
+				locks_delete_lock_ctx(fl, &dispose);
 				continue;
 			}
 			request = fl;
@@ -1076,8 +1071,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 				 * one (This may happen several times).
 				 */
 				if (added) {
-					locks_delete_lock_ctx(fl,
-						&ctx->flc_posix_cnt, &dispose);
+					locks_delete_lock_ctx(fl, &dispose);
 					continue;
 				}
 				/*
@@ -1093,10 +1087,8 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 				locks_copy_lock(new_fl, request);
 				request = new_fl;
 				new_fl = NULL;
-				locks_insert_lock_ctx(request,
-					&ctx->flc_posix_cnt, &fl->fl_list);
-				locks_delete_lock_ctx(fl,
-					&ctx->flc_posix_cnt, &dispose);
+				locks_insert_lock_ctx(request, &fl->fl_list);
+				locks_delete_lock_ctx(fl, &dispose);
 				added = true;
 			}
 		}
@@ -1124,8 +1116,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 			goto out;
 		}
 		locks_copy_lock(new_fl, request);
-		locks_insert_lock_ctx(new_fl, &ctx->flc_posix_cnt,
-					&fl->fl_list);
+		locks_insert_lock_ctx(new_fl, &fl->fl_list);
 		new_fl = NULL;
 	}
 	if (right) {
@@ -1136,8 +1127,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 			left = new_fl2;
 			new_fl2 = NULL;
 			locks_copy_lock(left, right);
-			locks_insert_lock_ctx(left, &ctx->flc_posix_cnt,
-						&fl->fl_list);
+			locks_insert_lock_ctx(left, &fl->fl_list);
 		}
 		right->fl_start = request->fl_end + 1;
 		locks_wake_up_blocks(right);
@@ -1321,7 +1311,6 @@ static void lease_clear_pending(struct file_lock *fl, int arg)
 /* We already had a lease on this file; just change its type */
 int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
 {
-	struct file_lock_context *flctx;
 	int error = assign_type(fl, arg);
 
 	if (error)
@@ -1331,7 +1320,6 @@ int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
 	if (arg == F_UNLCK) {
 		struct file *filp = fl->fl_file;
 
-		flctx = file_inode(filp)->i_flctx;
 		f_delown(filp);
 		filp->f_owner.signum = 0;
 		fasync_helper(0, fl->fl_file, 0, &fl->fl_fasync);
@@ -1339,7 +1327,7 @@ int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
 			printk(KERN_ERR "locks_delete_lock: fasync == %p\n", fl->fl_fasync);
 			fl->fl_fasync = NULL;
 		}
-		locks_delete_lock_ctx(fl, &flctx->flc_lease_cnt, dispose);
+		locks_delete_lock_ctx(fl, dispose);
 	}
 	return 0;
 }
@@ -1456,8 +1444,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 			fl->fl_downgrade_time = break_time;
 		}
 		if (fl->fl_lmops->lm_break(fl))
-			locks_delete_lock_ctx(fl, &ctx->flc_lease_cnt,
-						&dispose);
+			locks_delete_lock_ctx(fl, &dispose);
 	}
 
 	if (list_empty(&ctx->flc_lease))
@@ -1697,7 +1684,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
 	if (!leases_enable)
 		goto out;
 
-	locks_insert_lock_ctx(lease, &ctx->flc_lease_cnt, &ctx->flc_lease);
+	locks_insert_lock_ctx(lease, &ctx->flc_lease);
 	/*
 	 * The check in break_lease() is lockless. It's possible for another
 	 * open to race in after we did the earlier check for a conflicting
@@ -1710,7 +1697,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
 	smp_mb();
 	error = check_conflicting_open(dentry, arg, lease->fl_flags);
 	if (error) {
-		locks_unlink_lock_ctx(lease, &ctx->flc_lease_cnt);
+		locks_unlink_lock_ctx(lease);
 		goto out;
 	}
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e49f10cc8a73..a5a303e8a33c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -969,9 +969,6 @@ struct file_lock_context {
 	struct list_head	flc_flock;
 	struct list_head	flc_posix;
 	struct list_head	flc_lease;
-	int			flc_flock_cnt;
-	int			flc_posix_cnt;
-	int			flc_lease_cnt;
 };
 
 /* The following constant reflects the upper bound of the file/locking space */
-- 
2.1.0



* [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
  2015-02-17 12:46 ` [PATCH 1/4] Revert "locks: keep a count of locks on the flctx lists" Jeff Layton
@ 2015-02-17 12:46 ` Jeff Layton
  2015-02-17 17:10   ` J. Bruce Fields
  2015-02-17 12:46 ` [PATCH 3/4] locks: when upgrading, don't remove old flock lock until replacing with new one Jeff Layton
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 12:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

As Linus pointed out:

    Say we have an existing flock, and now do a new one that conflicts. I
    see what looks like three separate bugs.

     - We go through the first loop, find a lock of another type, and
    delete it in preparation for replacing it

     - we *drop* the lock context spinlock.

     - BUG #1? So now there is no lock at all, and somebody can come in
    and see that unlocked state. Is that really valid?

     - another thread comes in while the first thread dropped the lock
    context lock, and wants to add its own lock. It doesn't see the
    deleted or pending locks, so it just adds it

     - the first thread gets the context spinlock again, and adds the lock
    that replaced the original

     - BUG #2? So now there are *two* locks on the thing, and the next
    time you do an unlock (or when you close the file), it will only
    remove/replace the first one.

...remove the "drop the spinlock" code in the middle of this function as
it has always been suspicious.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/locks.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 7998f670812c..00c105f499a2 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -901,16 +901,6 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 		goto out;
 	}
 
-	/*
-	 * If a higher-priority process was blocked on the old file lock,
-	 * give it the opportunity to lock the file.
-	 */
-	if (found) {
-		spin_unlock(&ctx->flc_lock);
-		cond_resched();
-		spin_lock(&ctx->flc_lock);
-	}
-
 find_conflict:
 	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
 		if (!flock_locks_conflict(request, fl))
-- 
2.1.0



* [PATCH 3/4] locks: when upgrading, don't remove old flock lock until replacing with new one
  2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
  2015-02-17 12:46 ` [PATCH 1/4] Revert "locks: keep a count of locks on the flctx lists" Jeff Layton
  2015-02-17 12:46 ` [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file Jeff Layton
@ 2015-02-17 12:46 ` Jeff Layton
  2015-02-17 12:46 ` [PATCH 4/4] locks: only remove leases associated with the file being closed Jeff Layton
  2015-02-17 19:55 ` [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Linus Torvalds
  4 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 12:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

There is a potential problem when upgrading a flock lock. Suppose we
have a LOCK_SH lock on a file, and then want to upgrade it to a LOCK_EX
lock. We go through the first loop in flock_lock_file, and remove the
first lock.

We then go through the second loop and try to insert a new LOCK_EX lock.
If, however, there is another LOCK_SH lock on the file, we're out of
luck. We've removed our LOCK_SH lock already and can't insert a LOCK_EX
lock.

Fix this by ensuring that we don't remove any lock that we're replacing
until we're sure that we can add its replacement.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/locks.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 00c105f499a2..59eadd416b8c 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -864,13 +864,16 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
 static int flock_lock_file(struct file *filp, struct file_lock *request)
 {
 	struct file_lock *new_fl = NULL;
+	struct file_lock *old_fl = NULL;
 	struct file_lock *fl;
 	struct file_lock_context *ctx;
 	struct inode *inode = file_inode(filp);
 	int error = 0;
-	bool found = false;
 	LIST_HEAD(dispose);
 
+	/* flock_locks_conflict relies on this */
+	WARN_ON_ONCE(request->fl_file != filp);
+
 	ctx = locks_get_lock_context(inode);
 	if (!ctx)
 		return -ENOMEM;
@@ -885,22 +888,29 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 	if (request->fl_flags & FL_ACCESS)
 		goto find_conflict;
 
+	/*
+	 * Do we already hold a lock on this filp? It may be upgradeable, or it
+	 * may be just what we need.
+	 */
 	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
 		if (filp != fl->fl_file)
 			continue;
 		if (request->fl_type == fl->fl_type)
 			goto out;
-		found = true;
-		locks_delete_lock_ctx(fl, &dispose);
+		old_fl = fl;
 		break;
 	}
 
 	if (request->fl_type == F_UNLCK) {
-		if ((request->fl_flags & FL_EXISTS) && !found)
-			error = -ENOENT;
+		if (old_fl) {
+			locks_delete_lock_ctx(old_fl, &dispose);
+			if (request->fl_flags & FL_EXISTS)
+				error = -ENOENT;
+		}
 		goto out;
 	}
 
+	/* SETLK(W) */
 find_conflict:
 	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
 		if (!flock_locks_conflict(request, fl))
@@ -915,6 +925,8 @@ find_conflict:
 	if (request->fl_flags & FL_ACCESS)
 		goto out;
 	locks_copy_lock(new_fl, request);
+	if (old_fl)
+		locks_delete_lock_ctx(old_fl, &dispose);
 	locks_insert_lock_ctx(new_fl, &ctx->flc_flock);
 	new_fl = NULL;
 	error = 0;
-- 
2.1.0



* [PATCH 4/4] locks: only remove leases associated with the file being closed
  2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
                   ` (2 preceding siblings ...)
  2015-02-17 12:46 ` [PATCH 3/4] locks: when upgrading, don't remove old flock lock until replacing with new one Jeff Layton
@ 2015-02-17 12:46 ` Jeff Layton
  2015-02-17 19:55 ` [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Linus Torvalds
  4 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 12:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

We don't want to remove all leases just because one filp was closed.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/locks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 59eadd416b8c..c748176fde6a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2437,7 +2437,8 @@ locks_remove_lease(struct file *filp)
 
 	spin_lock(&ctx->flc_lock);
 	list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list)
-		lease_modify(fl, F_UNLCK, &dispose);
+		if (filp == fl->fl_file)
+			lease_modify(fl, F_UNLCK, &dispose);
 	spin_unlock(&ctx->flc_lock);
 	locks_dispose_list(&dispose);
 }
-- 
2.1.0



* Re: [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 12:46 ` [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file Jeff Layton
@ 2015-02-17 17:10   ` J. Bruce Fields
  2015-02-17 17:56     ` Jeff Layton
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2015-02-17 17:10 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	Christoph Hellwig, Dave Chinner, Sasha Levin

On Tue, Feb 17, 2015 at 07:46:28AM -0500, Jeff Layton wrote:
> As Linus pointed out:
> 
>     Say we have an existing flock, and now do a new one that conflicts. I
>     see what looks like three separate bugs.
> 
>      - We go through the first loop, find a lock of another type, and
>     delete it in preparation for replacing it
> 
>      - we *drop* the lock context spinlock.
> 
>      - BUG #1? So now there is no lock at all, and somebody can come in
>     and see that unlocked state. Is that really valid?
> 
>      - another thread comes in while the first thread dropped the lock
>     context lock, and wants to add its own lock. It doesn't see the
>     deleted or pending locks, so it just adds it
> 
>      - the first thread gets the context spinlock again, and adds the lock
>     that replaced the original
> 
>      - BUG #2? So now there are *two* locks on the thing, and the next
>     time you do an unlock (or when you close the file), it will only
>     remove/replace the first one.
> 
> ...remove the "drop the spinlock" code in the middle of this function as
> it has always been suspicious.

Looking back through a historical git repo, purely out of curiosity--the
cond_resched was previously a yield, and then I *think* bug #2 was
introduced in 2002 by a "[PATCH] fs/locks.c: more cleanup".  Before that
it retried the first loop after the yield.

Before that the yield was in locks_wake_up_blocks, removed by:

	commit 7962ad19e6300531784722c16849837864304d84
	Author: Matthew Wilcox <willy@debian.org>
	Date:   Sat Jun 8 02:09:24 2002 -0700

	[PATCH] fs/locks.c: Only yield once for flocks
	    
	This patch removes the annoying and confusing `wait' argument
	from many places.  The only change in behaviour is that we now
	yield once when unblocking other BSD-style flocks instead of
	once for each lock.
			    
	This slightly improves the semantics for userspace.  Before,
	when we had two tasks waiting on a lock, the first one would
	receive the lock.  Now, the one with the highest priority
	receives the lock.

So this really was intended to guarantee other waiters the lock before
allowing an upgrade.  Could that actually have worked?

The non-atomic behavior is documented in flock(2), which says it's
"original BSD behavior".

A comment at the top of locks.c says this is to avoid deadlock.  Hm, so,
are we introducing a deadlock?:

	1. Two processes both get shared lock on different filps.
	2. Both request exclusive lock.

Now they're stuck, whereas previously they might have recovered?

--b.

> 
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> ---
>  fs/locks.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 7998f670812c..00c105f499a2 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -901,16 +901,6 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
>  		goto out;
>  	}
>  
> -	/*
> -	 * If a higher-priority process was blocked on the old file lock,
> -	 * give it the opportunity to lock the file.
> -	 */
> -	if (found) {
> -		spin_unlock(&ctx->flc_lock);
> -		cond_resched();
> -		spin_lock(&ctx->flc_lock);
> -	}
> -
>  find_conflict:
>  	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
>  		if (!flock_locks_conflict(request, fl))
> -- 
> 2.1.0


* Re: [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 17:10   ` J. Bruce Fields
@ 2015-02-17 17:56     ` Jeff Layton
  2015-02-17 19:11       ` J. Bruce Fields
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 17:56 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	Christoph Hellwig, Dave Chinner, Sasha Levin

On Tue, 17 Feb 2015 12:10:17 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Feb 17, 2015 at 07:46:28AM -0500, Jeff Layton wrote:
> > As Linus pointed out:
> > 
> >     Say we have an existing flock, and now do a new one that conflicts. I
> >     see what looks like three separate bugs.
> > 
> >      - We go through the first loop, find a lock of another type, and
> >     delete it in preparation for replacing it
> > 
> >      - we *drop* the lock context spinlock.
> > 
> >      - BUG #1? So now there is no lock at all, and somebody can come in
> >     and see that unlocked state. Is that really valid?
> > 
> >      - another thread comes in while the first thread dropped the lock
> >     context lock, and wants to add its own lock. It doesn't see the
> >     deleted or pending locks, so it just adds it
> > 
> >      - the first thread gets the context spinlock again, and adds the lock
> >     that replaced the original
> > 
> >      - BUG #2? So now there are *two* locks on the thing, and the next
> >     time you do an unlock (or when you close the file), it will only
> >     remove/replace the first one.
> > 
> > ...remove the "drop the spinlock" code in the middle of this function as
> > it has always been suspicious.
> 
> Looking back through a historical git repo, purely out of curiosity--the
> cond_resched was previously a yield, and then I *think* bug #2 was
> introduced in 2002 by a "[PATCH] fs/locks.c: more cleanup".  Before that
> it retried the first loop after the yield.
> 
> Before that the yield was in locks_wake_up_blocks, removed by:
> 
> 	commit 7962ad19e6300531784722c16849837864304d84
> 	Author: Matthew Wilcox <willy@debian.org>
> 	Date:   Sat Jun 8 02:09:24 2002 -0700
> 
> 	[PATCH] fs/locks.c: Only yield once for flocks
> 	    
> 	This patch removes the annoying and confusing `wait' argument
> 	from many places.  The only change in behaviour is that we now
> 	yield once when unblocking other BSD-style flocks instead of
> 	once for each lock.
> 			    
> 	This slightly improves the semantics for userspace.  Before,
> 	when we had two tasks waiting on a lock, the first one would
> 	receive the lock.  Now, the one with the highest priority
> 	receives the lock.
> 
> So this really was intended to guarantee other waiters the lock before
> allowing an upgrade.  Could that actually have worked?
> 
> The non-atomic behavior is documented in flock(2), which says it's
> "original BSD behavior".
> 
> A comment at the top of locks.c says this is to avoid deadlock.  Hm, so,
> are we introducing a deadlock?:
> 
> 	1. Two processes both get shared lock on different filps.
> 	2. Both request exclusive lock.
> 
> Now they're stuck, whereas previously they might have recovered?
> 
> --b.
> 


Yes, I see that now. It might be best to preserve the existing behavior
for that reason then. I'd rather not introduce any behavioral changes in this
set if we can help it, particularly if there are userland apps that
might rely on it.

It may be best then to keep this patch, but drop patch #3. That should
be enough to ensure that we don't end up with two different locks on
the same file description, which is clearly a bug.

It might also not hurt to follow HCH's earlier advice and make
locks_remove_flock just iterate over the list and unconditionally
delete any lock in the case where the ->flock operation isn't set.

In principle we shouldn't need that once we apply patch #2, but it would
probably be simpler than dealing with flock_lock_file in that case.

> > 
> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > ---
> >  fs/locks.c | 10 ----------
> >  1 file changed, 10 deletions(-)
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index 7998f670812c..00c105f499a2 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -901,16 +901,6 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
> >  		goto out;
> >  	}
> >  
> > -	/*
> > -	 * If a higher-priority process was blocked on the old file lock,
> > -	 * give it the opportunity to lock the file.
> > -	 */
> > -	if (found) {
> > -		spin_unlock(&ctx->flc_lock);
> > -		cond_resched();
> > -		spin_lock(&ctx->flc_lock);
> > -	}
> > -
> >  find_conflict:
> >  	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
> >  		if (!flock_locks_conflict(request, fl))
> > -- 
> > 2.1.0


-- 
Jeff Layton <jeff.layton@primarydata.com>


* Re: [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 17:56     ` Jeff Layton
@ 2015-02-17 19:11       ` J. Bruce Fields
  2015-02-17 22:21         ` J. Bruce Fields
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2015-02-17 19:11 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	Christoph Hellwig, Dave Chinner, Sasha Levin

On Tue, Feb 17, 2015 at 12:56:49PM -0500, Jeff Layton wrote:
> On Tue, 17 Feb 2015 12:10:17 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Tue, Feb 17, 2015 at 07:46:28AM -0500, Jeff Layton wrote:
> > > As Linus pointed out:
> > > 
> > >     Say we have an existing flock, and now do a new one that conflicts. I
> > >     see what looks like three separate bugs.
> > > 
> > >      - We go through the first loop, find a lock of another type, and
> > >     delete it in preparation for replacing it
> > > 
> > >      - we *drop* the lock context spinlock.
> > > 
> > >      - BUG #1? So now there is no lock at all, and somebody can come in
> > >     and see that unlocked state. Is that really valid?
> > > 
> > >      - another thread comes in while the first thread dropped the lock
> > >     context lock, and wants to add its own lock. It doesn't see the
> > >     deleted or pending locks, so it just adds it
> > > 
> > >      - the first thread gets the context spinlock again, and adds the lock
> > >     that replaced the original
> > > 
> > >      - BUG #2? So now there are *two* locks on the thing, and the next
> > >     time you do an unlock (or when you close the file), it will only
> > >     remove/replace the first one.
> > > 
> > > ...remove the "drop the spinlock" code in the middle of this function as
> > > it has always been suspicious.
> > 
> > Looking back through a historical git repo, purely out of curiosity--the
> > cond_resched was previously a yield, and then I *think* bug #2 was
> > introduced in 2002 by a "[PATCH] fs/locks.c: more cleanup".  Before that
> > it retried the first loop after the yield.
> > 
> > Before that the yield was in locks_wake_up_blocks, removed by:
> > 
> > 	commit 7962ad19e6300531784722c16849837864304d84
> > 	Author: Matthew Wilcox <willy@debian.org>
> > 	Date:   Sat Jun 8 02:09:24 2002 -0700
> > 
> > 	[PATCH] fs/locks.c: Only yield once for flocks
> > 	    
> > 	This patch removes the annoying and confusing `wait' argument
> > 	from many places.  The only change in behaviour is that we now
> > 	yield once when unblocking other BSD-style flocks instead of
> > 	once for each lock.
> > 			    
> > 	This slightly improves the semantics for userspace.  Before,
> > 	when we had two tasks waiting on a lock, the first one would
> > 	receive the lock.  Now, the one with the highest priority
> > 	receives the lock.
> > 
> > So this really was intended to guarantee other waiters the lock before
> > allowing an upgrade.  Could that actually have worked?
> > 
> > The non-atomic behavior is documented in flock(2), which says it's
> > "original BSD behavior".
> > 
> > A comment at the top of locks.c says this is to avoid deadlock.  Hm, so,
> > are we introducing a deadlock?:
> > 
> > 	1. Two processes both get shared lock on different filps.
> > 	2. Both request exclusive lock.
> > 
> > Now they're stuck, whereas previously they might have recovered?
> > 
> > --b.
> > 
> 
> 
> Yes, I see that now. It might be best to preserve the existing behavior
> for that reason then. I'd rather not introduce any behavioral changes in this
> set if we can help it, particularly if there are userland apps that
> might rely on it.
> 
> It may be best then to keep this patch, but drop patch #3.

Unfortunately it's this patch that I'm worried about.

Also these patches are introducing some failure in my locking tests
(probably unrelated, I doubt I ever wrote a test for this case).  I'll
investigate.

--b.

> That should
> be enough to ensure that we don't end up with two different locks on
> the same file description, which is clearly a bug.
> 
> It might also not hurt to follow HCH's earlier advice and make
> locks_remove_flock just iterate over the list and unconditionally
> delete any lock in the case where the ->flock operation isn't set.
> 
> In principle we shouldn't need that once we apply patch #2, but it would
> probably be simpler than dealing with flock_lock_file in that case.
> 
> > > 
> > > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > > ---
> > >  fs/locks.c | 10 ----------
> > >  1 file changed, 10 deletions(-)
> > > 
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index 7998f670812c..00c105f499a2 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -901,16 +901,6 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
> > >  		goto out;
> > >  	}
> > >  
> > > -	/*
> > > -	 * If a higher-priority process was blocked on the old file lock,
> > > -	 * give it the opportunity to lock the file.
> > > -	 */
> > > -	if (found) {
> > > -		spin_unlock(&ctx->flc_lock);
> > > -		cond_resched();
> > > -		spin_lock(&ctx->flc_lock);
> > > -	}
> > > -
> > >  find_conflict:
> > >  	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
> > >  		if (!flock_locks_conflict(request, fl))
> > > -- 
> > > 2.1.0
> 
> 
> -- 
> Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters
  2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
                   ` (3 preceding siblings ...)
  2015-02-17 12:46 ` [PATCH 4/4] locks: only remove leases associated with the file being closed Jeff Layton
@ 2015-02-17 19:55 ` Linus Torvalds
  2015-02-17 20:20   ` Linus Torvalds
  2015-02-17 20:20   ` Al Viro
  4 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2015-02-17 19:55 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, Linux Kernel Mailing List, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

Ok, so having gone over these, I think 1, 2 and 4 are fine.

Let's just drop 3. The upgrade clearly does need to drop the old lock
when returning FILE_LOCK_DEFERRED, because otherwise two upgraders
will deadlock waiting for each other.

Oh, and in #1, you might want to remove the "FIXME: add counters to
struct file_lock_context so we don't need to do this?" although that
obviously makes it not strictly a 100% revert.  I do believe that we
should add a "list_count()" function, so that we could write

   *flock_count = list_count(&ctx->flc_flock);

instead of that horribly ugly

    list_for_each_entry(lock, &ctx->flc_flock, fl_list)
         ++(*flock_count);

thing. But that's a separate cleanup.

Can we get that truncated series tested with some flock test suite? I
assume there is *some* filesystem tester that tests some basic flock
stuff, even if it clearly didn't catch the race due to the unlock in
the middle..

                      Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters
  2015-02-17 19:55 ` [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Linus Torvalds
@ 2015-02-17 20:20   ` Linus Torvalds
  2015-02-17 20:20   ` Al Viro
  1 sibling, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2015-02-17 20:20 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, Linux Kernel Mailing List, Kirill A. Shutemov,
	J. Bruce Fields, Christoph Hellwig, Dave Chinner, Sasha Levin

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

On Tue, Feb 17, 2015 at 11:55 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I do believe that we should add a "list_count()" function, so that we could write
>
>    *flock_count = list_count(&ctx->flc_flock);
>
> instead of that horribly ugly
>
>     list_for_each_entry(lock, &ctx->flc_flock, fl_list)
>          ++(*flock_count);
>
> thing. But that's a separate cleanup.

The helper function would possibly be something like this.

Untested. It may miscount, or it might do unspeakable acts on your
pets. No guarantees.

                        Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 1178 bytes --]

 include/linux/list.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index feb773c76ee0..495566be02e1 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -190,6 +190,20 @@ static inline int list_empty(const struct list_head *head)
 }
 
 /**
+ * list_count - count number of entries on list
+ * @head: the list to count.
+ */
+static inline int list_count(const struct list_head *head)
+{
+	int len = 0;
+	const struct list_head *p = head;
+
+	while ((p = p->next) != head)
+		len++;
+	return len;
+}
+
+/**
  * list_empty_careful - tests whether a list is empty and not being modified
  * @head: the list to test
  *
@@ -611,6 +625,20 @@ static inline int hlist_empty(const struct hlist_head *h)
 	return !h->first;
 }
 
+/**
+ * hlist_count - count number of entries on hlist
+ * @head: the list to count.
+ */
+static inline int hlist_count(const struct hlist_head *head)
+{
+	int len = 0;
+	const struct hlist_node *p;
+
+	for (p = head->first; p; p = p->next)
+		len++;
+	return len;
+}
+
 static inline void __hlist_del(struct hlist_node *n)
 {
 	struct hlist_node *next = n->next;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters
  2015-02-17 19:55 ` [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Linus Torvalds
  2015-02-17 20:20   ` Linus Torvalds
@ 2015-02-17 20:20   ` Al Viro
  2015-02-17 21:10     ` Jeff Layton
  1 sibling, 1 reply; 14+ messages in thread
From: Al Viro @ 2015-02-17 20:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Layton, linux-fsdevel, Linux Kernel Mailing List,
	Kirill A. Shutemov, J. Bruce Fields, Christoph Hellwig,
	Dave Chinner, Sasha Levin

On Tue, Feb 17, 2015 at 11:55:01AM -0800, Linus Torvalds wrote:

> Can we get that truncated series tested with some flock test suite? I
> assume there is *some* filesystem tester that tests some basic flock
> stuff, even if it clearly didn't catch the race due to the unlock in
> the middle..

LTP runltp -f syscalls does cover some of that; BTW, looks like fcntl11,
fcntl11_64, fcntl21 and fcntl21_64 in there got broken by something already
merged.  Hadn't done a bisect yet, but seeing
 * NAME
 *      fcntl11.c
 *
 * DESCRIPTION
 *      Testcase to check locking of regions of a file
 *
 * ALGORITHM  
 *      Test changing lock sections around a write lock
and
 * NAME
 *      fcntl21.c
 *
 * DESCRIPTION
 *      Check locking of regions of a file
 *
 * ALGORITHM
 *      Test changing lock sections around a read lock

I'd say that file locking merge is most likely suspect...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters
  2015-02-17 20:20   ` Al Viro
@ 2015-02-17 21:10     ` Jeff Layton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 21:10 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, linux-fsdevel, Linux Kernel Mailing List,
	Kirill A. Shutemov, J. Bruce Fields, Christoph Hellwig,
	Dave Chinner, Sasha Levin

On Tue, 17 Feb 2015 20:20:49 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Tue, Feb 17, 2015 at 11:55:01AM -0800, Linus Torvalds wrote:
> 
> > Can we get that truncated series tested with some flock test suite? I
> > assume there is *some* filesystem tester that tests some basic flock
> > stuff, even if it clearly didn't catch the race due to the unlock in
> > the middle..
> 
> LTP runltp -f syscalls does cover some of that; BTW, looks like fcntl11,
> fcntl11_64, fcntl21 and fcntl21_64 in there got broken by something already
> merged.  Hadn't done a bisect yet, but seeing
>  * NAME
>  *      fcntl11.c
>  *
>  * DESCRIPTION
>  *      Testcase to check locking of regions of a file
>  *
>  * ALGORITHM  
>  *      Test changing lock sections around a write lock
> and
>  * NAME
>  *      fcntl21.c
>  *
>  * DESCRIPTION
>  *      Check locking of regions of a file
>  *
>  * ALGORITHM
>  *      Test changing lock sections around a read lock
> 
> I'd say that file locking merge is most likely suspect...

Almost certainly. I'll take a look.

Thanks,
-- 
Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 19:11       ` J. Bruce Fields
@ 2015-02-17 22:21         ` J. Bruce Fields
  2015-02-17 22:29           ` Jeff Layton
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2015-02-17 22:21 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	Christoph Hellwig, Dave Chinner, Sasha Levin

On Tue, Feb 17, 2015 at 02:11:36PM -0500, J. Bruce Fields wrote:
> On Tue, Feb 17, 2015 at 12:56:49PM -0500, Jeff Layton wrote:
> > On Tue, 17 Feb 2015 12:10:17 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Tue, Feb 17, 2015 at 07:46:28AM -0500, Jeff Layton wrote:
> > > > As Linus pointed out:
> > > > 
> > > >     Say we have an existing flock, and now do a new one that conflicts. I
> > > >     see what looks like three separate bugs.
> > > > 
> > > >      - We go through the first loop, find a lock of another type, and
> > > >     delete it in preparation for replacing it
> > > > 
> > > >      - we *drop* the lock context spinlock.
> > > > 
> > > >      - BUG #1? So now there is no lock at all, and somebody can come in
> > > >     and see that unlocked state. Is that really valid?
> > > > 
> > > >      - another thread comes in while the first thread dropped the lock
> > > >     context lock, and wants to add its own lock. It doesn't see the
> > > >     deleted or pending locks, so it just adds it
> > > > 
> > > >      - the first thread gets the context spinlock again, and adds the lock
> > > >     that replaced the original
> > > > 
> > > >      - BUG #2? So now there are *two* locks on the thing, and the next
> > > >     time you do an unlock (or when you close the file), it will only
> > > >     remove/replace the first one.
> > > > 
> > > > ...remove the "drop the spinlock" code in the middle of this function as
> > > > it has always been suspicious.
> > > 
> > > Looking back through a historical git repo, purely out of curiosity--the
> > > cond_resched was previously a yield, and then I *think* bug #2 was
> > > introduced in 2002 by a "[PATCH] fs/locks.c: more cleanup".  Before that
> > > it retried the first loop after the yield.
> > > 
> > > Before that the yield was in locks_wake_up_blocks, removed by:
> > > 
> > > 	commit 7962ad19e6300531784722c16849837864304d84
> > > 	Author: Matthew Wilcox <willy@debian.org>
> > > 	Date:   Sat Jun 8 02:09:24 2002 -0700
> > > 
> > > 	[PATCH] fs/locks.c: Only yield once for flocks
> > > 	    
> > > 	This patch removes the annoying and confusing `wait' argument
> > > 	from many places.  The only change in behaviour is that we now
> > > 	yield once when unblocking other BSD-style flocks instead of
> > > 	once for each lock.
> > > 			    
> > > 	This slightly improves the semantics for userspace.  Before,
> > > 	when we had two tasks waiting on a lock, the first one would
> > > 	receive the lock.  Now, the one with the highest priority
> > > 	receives the lock.
> > > 
> > > So this really was intended to guarantee other waiters the lock before
> > > allowing an upgrade.  Could that actually have worked?
> > > 
> > > The non-atomic behavior is documented in flock(2), which says it's
> > > "original BSD behavior".
> > > 
> > > A comment at the top of locks.c says this is to avoid deadlock.  Hm, so,
> > > are we introducing a deadlock?:
> > > 
> > > 	1. Two processes both get shared lock on different filps.
> > > 	2. Both request exclusive lock.
> > > 
> > > Now they're stuck, whereas previously they might have recovered?
> > > 
> > > --b.
> > > 
> > 
> > 
> > Yes, I see that now. It might be best to preserve the existing behavior
> > for that reason then. I'd rather not introduce any behavioral changes in this
> > set if we can help it, particularly if there are userland apps that
> > might rely on it.
> > 
> > It may be best then to keep this patch, but drop patch #3.
> 
> Unfortunately it's this patch that I'm worried about.

(Urp, never mind, you and Linus were right here.)

> Also these patches are introducing some failure in my locking tests
> (probably unrelated, I doubt I ever wrote a test for this case).  I'll
> investigate.

I didn't try to figure out why, but after dropping patch 3 everything does
pass.

--b.

> 
> --b.
> 
> > That should
> > be enough to ensure that we don't end up with two different locks on
> > the same file description, which is clearly a bug.
> > 
> > It might also not hurt to follow HCH's earlier advice and make
> > locks_remove_flock just iterate over the list and unconditionally
> > delete any lock in the case where the ->flock operation isn't set.
> > 
> > In principle we shouldn't need that once we apply patch #2, but it would
> > probably be simpler than dealing with flock_lock_file in that case.
> > 
> > > > 
> > > > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > > > ---
> > > >  fs/locks.c | 10 ----------
> > > >  1 file changed, 10 deletions(-)
> > > > 
> > > > diff --git a/fs/locks.c b/fs/locks.c
> > > > index 7998f670812c..00c105f499a2 100644
> > > > --- a/fs/locks.c
> > > > +++ b/fs/locks.c
> > > > @@ -901,16 +901,6 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
> > > >  		goto out;
> > > >  	}
> > > >  
> > > > -	/*
> > > > -	 * If a higher-priority process was blocked on the old file lock,
> > > > -	 * give it the opportunity to lock the file.
> > > > -	 */
> > > > -	if (found) {
> > > > -		spin_unlock(&ctx->flc_lock);
> > > > -		cond_resched();
> > > > -		spin_lock(&ctx->flc_lock);
> > > > -	}
> > > > -
> > > >  find_conflict:
> > > >  	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
> > > >  		if (!flock_locks_conflict(request, fl))
> > > > -- 
> > > > 2.1.0
> > 
> > 
> > -- 
> > Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file
  2015-02-17 22:21         ` J. Bruce Fields
@ 2015-02-17 22:29           ` Jeff Layton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2015-02-17 22:29 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, linux-kernel, Linus Torvalds, Kirill A. Shutemov,
	Christoph Hellwig, Dave Chinner, Sasha Levin, Al Viro

On Tue, 17 Feb 2015 17:21:02 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Feb 17, 2015 at 02:11:36PM -0500, J. Bruce Fields wrote:
> > On Tue, Feb 17, 2015 at 12:56:49PM -0500, Jeff Layton wrote:
> > > On Tue, 17 Feb 2015 12:10:17 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > 
> > > > On Tue, Feb 17, 2015 at 07:46:28AM -0500, Jeff Layton wrote:
> > > > > As Linus pointed out:
> > > > > 
> > > > >     Say we have an existing flock, and now do a new one that conflicts. I
> > > > >     see what looks like three separate bugs.
> > > > > 
> > > > >      - We go through the first loop, find a lock of another type, and
> > > > >     delete it in preparation for replacing it
> > > > > 
> > > > >      - we *drop* the lock context spinlock.
> > > > > 
> > > > >      - BUG #1? So now there is no lock at all, and somebody can come in
> > > > >     and see that unlocked state. Is that really valid?
> > > > > 
> > > > >      - another thread comes in while the first thread dropped the lock
> > > > >     context lock, and wants to add its own lock. It doesn't see the
> > > > >     deleted or pending locks, so it just adds it
> > > > > 
> > > > >      - the first thread gets the context spinlock again, and adds the lock
> > > > >     that replaced the original
> > > > > 
> > > > >      - BUG #2? So now there are *two* locks on the thing, and the next
> > > > >     time you do an unlock (or when you close the file), it will only
> > > > >     remove/replace the first one.
> > > > > 
> > > > > ...remove the "drop the spinlock" code in the middle of this function as
> > > > > it has always been suspicious.
> > > > 
> > > > Looking back through a historical git repo, purely out of curiosity--the
> > > > cond_resched was previously a yield, and then I *think* bug #2 was
> > > > introduced in 2002 by a "[PATCH] fs/locks.c: more cleanup".  Before that
> > > > it retried the first loop after the yield.
> > > > 
> > > > Before that the yield was in locks_wake_up_blocks, removed by:
> > > > 
> > > > 	commit 7962ad19e6300531784722c16849837864304d84
> > > > 	Author: Matthew Wilcox <willy@debian.org>
> > > > 	Date:   Sat Jun 8 02:09:24 2002 -0700
> > > > 
> > > > 	[PATCH] fs/locks.c: Only yield once for flocks
> > > > 	    
> > > > 	This patch removes the annoying and confusing `wait' argument
> > > > 	from many places.  The only change in behaviour is that we now
> > > > 	yield once when unblocking other BSD-style flocks instead of
> > > > 	once for each lock.
> > > > 			    
> > > > 	This slightly improves the semantics for userspace.  Before,
> > > > 	when we had two tasks waiting on a lock, the first one would
> > > > 	receive the lock.  Now, the one with the highest priority
> > > > 	receives the lock.
> > > > 
> > > > So this really was intended to guarantee other waiters the lock before
> > > > allowing an upgrade.  Could that actually have worked?
> > > > 
> > > > The non-atomic behavior is documented in flock(2), which says it's
> > > > "original BSD behavior".
> > > > 
> > > > A comment at the top of locks.c says this is to avoid deadlock.  Hm, so,
> > > > are we introducing a deadlock?:
> > > > 
> > > > 	1. Two processes both get shared lock on different filps.
> > > > 	2. Both request exclusive lock.
> > > > 
> > > > Now they're stuck, whereas previously they might have recovered?
> > > > 
> > > > --b.
> > > > 
> > > 
> > > 
> > > Yes, I see that now. It might be best to preserve the existing behavior
> > > for that reason then. I'd rather not introduce any behavioral changes in this
> > > set if we can help it, particularly if there are userland apps that
> > > might rely on it.
> > > 
> > > It may be best then to keep this patch, but drop patch #3.
> > 
> > Unfortunately it's this patch that I'm worried about.
> 
> (Urp, never mind, you and Linus were right here.)
> 
> > Also these patches are introducing some failure in my locking tests
> > (probably unrelated, I doubt I ever wrote a test for this case).  I'll
> > investigate.
> 
> I didn't try to figure out why, but after dropping patch 3 everything does
> pass.
> 
> --b.
> 

Ok, thanks.

I've also figured out the problem that Al reported too. The "if (left
== right)" case in __posix_lock_file was inserting "left" into the
wrong spot. It needs to be inserted just before the new lock, instead
of just before "fl".
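
[The "left"/"right" split being discussed can be observed from userspace. A hedged Python sketch, purely illustrative (the `child_can_lock` helper is our own, not anything from the kernel or LTP): a parent write-locks a whole range, unlocks the middle, and a forked child probes which byte ranges still conflict.]

```python
# Illustrative sketch of POSIX byte-range lock splitting; temp file is throwaway.
import fcntl
import os
import tempfile

def child_can_lock(path, start, length):
    """Fork a child that tries a non-blocking write lock on [start, start+length)."""
    pid = os.fork()
    if pid == 0:                        # child: a separate process, so its
        fd = os.open(path, os.O_RDWR)   # POSIX locks actually conflict with ours
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start)
            os._exit(0)                 # acquired: no conflict
        except OSError:
            os._exit(1)                 # conflicts with the parent's lock
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0

tf = tempfile.NamedTemporaryFile()
tf.write(b"\0" * 100)
tf.flush()

fcntl.lockf(tf, fcntl.LOCK_EX, 100, 0)   # write-lock bytes [0, 100)
fcntl.lockf(tf, fcntl.LOCK_UN, 20, 40)   # unlock [40, 60): the kernel splits
                                         # the lock into "left" and "right" parts

hole_open = child_can_lock(tf.name, 45, 10)       # inside the unlocked hole
left_held = not child_can_lock(tf.name, 0, 10)    # "left" still covers [0, 40)
right_held = not child_can_lock(tf.name, 70, 10)  # "right" covers [60, 100)
print(hole_open, left_held, right_held)           # -> True True True
```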

I'll include the fix for that in the pull request once I send it. In
the meantime, I plan to sit down this evening with the LTP tests and
make sure there is no other breakage there.

-- 
Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-02-17 22:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-17 12:46 [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Jeff Layton
2015-02-17 12:46 ` [PATCH 1/4] Revert "locks: keep a count of locks on the flctx lists" Jeff Layton
2015-02-17 12:46 ` [PATCH 2/4] locks: remove conditional lock release in middle of flock_lock_file Jeff Layton
2015-02-17 17:10   ` J. Bruce Fields
2015-02-17 17:56     ` Jeff Layton
2015-02-17 19:11       ` J. Bruce Fields
2015-02-17 22:21         ` J. Bruce Fields
2015-02-17 22:29           ` Jeff Layton
2015-02-17 12:46 ` [PATCH 3/4] locks: when upgrading, don't remove old flock lock until replacing with new one Jeff Layton
2015-02-17 12:46 ` [PATCH 4/4] locks: only remove leases associated with the file being closed Jeff Layton
2015-02-17 19:55 ` [PATCH 0/4] locks: flock and lease related bugfixes, and remove i_flctx counters Linus Torvalds
2015-02-17 20:20   ` Linus Torvalds
2015-02-17 20:20   ` Al Viro
2015-02-17 21:10     ` Jeff Layton
