* [PATCH] mds: remove waiting lock before merging with neighbours
@ 2013-07-29 15:05 David Disseldorp
  2013-08-01 12:11 ` David Disseldorp
  2013-08-23 20:58 ` Gregory Farnum
  0 siblings, 2 replies; 5+ messages in thread
From: David Disseldorp @ 2013-07-29 15:05 UTC (permalink / raw)
  To: ceph-devel; +Cc: David Disseldorp

CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
when run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when
the waiting lock is merged with an existing lock, e.g.:

Initial MDS state (two clients, same file):
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
	      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Waiting lock entry 4116@1:1 fires:
handle_client_file_setlock: start: 1, length: 1,
			    client: 4116, pid: 7899, type: 2

MDS state after lock is obtained:
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
	      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the now
handled 4116@1:1 waiting_locks entry remains.

When handling a lock request, the MDS calls adjust_locks() to merge
the new lock with available neighbours. If the new lock is merged,
then the waiting_locks entry is not located in the subsequent
remove_waiting() call.
This fix ensures that the waiting_locks entry is removed prior to
modification during merge.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 src/mds/flock.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mds/flock.cc b/src/mds/flock.cc
index e83c5ee..5e329af 100644
--- a/src/mds/flock.cc
+++ b/src/mds/flock.cc
@@ -75,12 +75,14 @@ bool ceph_lock_state_t::add_lock(ceph_filelock& new_lock,
       } else {
         //yay, we can insert a shared lock
         dout(15) << "inserting shared lock" << dendl;
+        remove_waiting(new_lock);
         adjust_locks(self_overlapping_locks, new_lock, neighbor_locks);
         held_locks.insert(pair<uint64_t, ceph_filelock>(new_lock.start, new_lock));
         ret = true;
       }
     }
   } else { //no overlapping locks except our own
+    remove_waiting(new_lock);
     adjust_locks(self_overlapping_locks, new_lock, neighbor_locks);
     dout(15) << "no conflicts, inserting " << new_lock << dendl;
     held_locks.insert(pair<uint64_t, ceph_filelock>
@@ -89,7 +91,6 @@ bool ceph_lock_state_t::add_lock(ceph_filelock& new_lock,
   }
   if (ret) {
     ++client_held_lock_counts[(client_t)new_lock.client];
-    remove_waiting(new_lock);
   }
   else if (wait_on_fail && !replay)
     ++client_waiting_lock_counts[(client_t)new_lock.client];
@@ -306,7 +307,7 @@ void ceph_lock_state_t::adjust_locks(list<multimap<uint64_t, ceph_filelock>::ite
     old_lock = &(*iter)->second;
     old_lock_client = old_lock->client;
     dout(15) << "lock to coalesce: " << *old_lock << dendl;
-    /* because if it's a neibhoring lock there can't be any self-overlapping
+    /* because if it's a neighboring lock there can't be any self-overlapping
        locks that covered it */
     if (old_lock->type == new_lock.type) { //merge them
       if (0 == new_lock.length) {
-- 
1.8.1.4
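
The ordering problem described in the commit message can be reproduced in a small standalone model. What follows is only a sketch, not the code in src/mds/flock.cc: the Lock struct, merge_left_neighbour(), grant() and replay() are hypothetical names invented for illustration, and remove_waiting() here is a simplified stand-in that assumes the lookup matches on start, length, client and pid. The replay() helper sets up the "Initial MDS state" from the commit message and grants the waiting 4116@1:1 lock in either the pre-patch or the patched order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-in for ceph_filelock (not the real struct).
struct Lock {
  uint64_t start;
  uint64_t length;
  int64_t client;
  int64_t pid;
};

using LockMap = std::multimap<uint64_t, Lock>;

// Erase the waiting entry that exactly matches `l`.
// Returns true if a matching entry was found and removed.
bool remove_waiting(LockMap& waiting, const Lock& l) {
  auto range = waiting.equal_range(l.start);
  for (auto it = range.first; it != range.second; ++it) {
    const Lock& w = it->second;
    if (w.length == l.length && w.client == l.client && w.pid == l.pid) {
      waiting.erase(it);
      return true;
    }
  }
  return false;
}

// Merge `l` with a held lock of the same owner that ends exactly where `l`
// starts. As the thread describes for adjust_locks(), this mutates `l`.
void merge_left_neighbour(LockMap& held, Lock& l) {
  for (auto it = held.begin(); it != held.end(); ++it) {
    const Lock& old = it->second;
    if (old.client == l.client && old.pid == l.pid &&
        old.start + old.length == l.start) {
      l.length += old.length;
      l.start = old.start;
      held.erase(it);
      return;
    }
  }
}

// Grant `l` and return the number of waiting entries left afterwards.
std::size_t grant(LockMap& held, LockMap& waiting, Lock l, bool remove_first) {
  if (remove_first)
    remove_waiting(waiting, l);  // patched order: l still describes 1:1
  merge_left_neighbour(held, l);
  held.emplace(l.start, l);
  if (!remove_first)
    remove_waiting(waiting, l);  // pre-patch order: l is now 0:2, no match
  return waiting.size();
}

// Replay the state from the commit message: two held locks, one waiter.
std::size_t replay(bool remove_first) {
  LockMap held, waiting;
  held.emplace(0, Lock{0, 1, 4116, 7899});
  held.emplace(2, Lock{2, 1, 4110, 40767});
  waiting.emplace(1, Lock{1, 1, 4116, 7899});
  assert(waiting.size() == 1);  // sanity check on the initial state
  return grant(held, waiting, Lock{1, 1, 4116, 7899}, remove_first);
}
```

In this model, replay(false) leaves the stale 4116@1:1 waiter in place because the lookup runs against the merged 0:2 range, while replay(true) removes it before the merge rewrites the lock.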



* Re: [PATCH] mds: remove waiting lock before merging with neighbours
  2013-07-29 15:05 [PATCH] mds: remove waiting lock before merging with neighbours David Disseldorp
@ 2013-08-01 12:11 ` David Disseldorp
  2013-08-01 18:07   ` Sage Weil
  2013-08-23 20:58 ` Gregory Farnum
  1 sibling, 1 reply; 5+ messages in thread
From: David Disseldorp @ 2013-08-01 12:11 UTC (permalink / raw)
  To: ceph-devel

Hi,

Did anyone get a chance to look at this change?
Any comments/feedback/ridicule would be appreciated.

Cheers, David


* Re: [PATCH] mds: remove waiting lock before merging with neighbours
  2013-08-01 12:11 ` David Disseldorp
@ 2013-08-01 18:07   ` Sage Weil
  0 siblings, 0 replies; 5+ messages in thread
From: Sage Weil @ 2013-08-01 18:07 UTC (permalink / raw)
  To: David Disseldorp; +Cc: ceph-devel

On Thu, 1 Aug 2013, David Disseldorp wrote:
> Hi,
> 
> Did anyone get a chance to look at this change?
> Any comments/feedback/ridicule would be appreciated.

Sorry, not yet--and Greg just headed out for vacation yesterday.  It's on 
my list to look at when I have some time tonight or tomorrow, though. 
Thanks!  

I'm hopeful this will clear up some of the locking hangs we've seen with 
the samba and flock tests...

sage


> 
> Cheers, David


* Re: [PATCH] mds: remove waiting lock before merging with neighbours
  2013-07-29 15:05 [PATCH] mds: remove waiting lock before merging with neighbours David Disseldorp
  2013-08-01 12:11 ` David Disseldorp
@ 2013-08-23 20:58 ` Gregory Farnum
  2013-08-24 12:02   ` David Disseldorp
  1 sibling, 1 reply; 5+ messages in thread
From: Gregory Farnum @ 2013-08-23 20:58 UTC (permalink / raw)
  To: David Disseldorp; +Cc: ceph-devel

Hi David,
I'm really sorry it took us so long to get back to you on this. :(
However, I've reviewed the patch and, apart from going over the code
making me want to strangle myself for structuring it that way,
everything looks good. I changed the last paragraph in the commit
message very slightly to clarify the cause of the bug:

On Mon, Jul 29, 2013 at 8:05 AM, David Disseldorp <ddiss@suse.de> wrote:
> When handling a lock request, the MDS calls adjust_locks() to merge
> the new lock with available neighbours. If the new lock is merged,
> then the waiting_locks entry is not located in the subsequent
> remove_waiting() call.
> This fix ensures that the waiting_locks entry is removed prior to
> modification during merge.

    When handling a lock request, the MDS calls adjust_locks() to merge
    the new lock with available neighbours. If the new lock is merged,
    then the waiting_locks entry is not located in the subsequent
    remove_waiting() call because adjust_locks changed the new lock to
    include the old locks.
    This fix ensures that the waiting_locks entry is removed prior to
    modification during merge.

And it's now merged into master and backported to dumpling. Thank you very much!

If you feel like cleaning up the locking code a little more (or
anything else, for that matter) I can promise you faster reviews in
the future... ;)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


* Re: [PATCH] mds: remove waiting lock before merging with neighbours
  2013-08-23 20:58 ` Gregory Farnum
@ 2013-08-24 12:02   ` David Disseldorp
  0 siblings, 0 replies; 5+ messages in thread
From: David Disseldorp @ 2013-08-24 12:02 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On Fri, 23 Aug 2013 13:58:56 -0700
Gregory Farnum <greg@inktank.com> wrote:

>     When handling a lock request, the MDS calls adjust_locks() to merge
>     the new lock with available neighbours. If the new lock is merged,
>     then the waiting_locks entry is not located in the subsequent
>     remove_waiting() call because adjust_locks changed the new lock to
>     include the old locks.
>     This fix ensures that the waiting_locks entry is removed prior to
>     modification during merge.

Looks good.

> And it's now merged into master and backported to dumpling. Thank you very much!

Great, thanks Gregory.

Cheers, David


Thread overview: 5+ messages
2013-07-29 15:05 [PATCH] mds: remove waiting lock before merging with neighbours David Disseldorp
2013-08-01 12:11 ` David Disseldorp
2013-08-01 18:07   ` Sage Weil
2013-08-23 20:58 ` Gregory Farnum
2013-08-24 12:02   ` David Disseldorp
