From: piaojun <piaojun@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF, is cleared before dlm_deref_lockres_done_handler
Date: Sun, 10 Jul 2016 18:01:45 +0800 [thread overview]
Message-ID: <57821D09.1040306@huawei.com> (raw)
We found a BUG situation in which DLM_LOCK_RES_DROPPING_REF is cleared
unexpected that described below. To solve the bug, we disable the BUG_ON
and purge lockres in dlm_do_local_recovery_cleanup.
Node 1 Node 2(master)
dlm_purge_lockres
dlm_deref_lockres_handler
DLM_LOCK_RES_SETREF_INPROG is set
response DLM_DEREF_RESPONSE_INPROG
receive DLM_DEREF_RESPONSE_INPROG
stop puring in dlm_purge_lockres
and wait for DLM_DEREF_RESPONSE_DONE
dispatch dlm_deref_lockres_worker
response DLM_DEREF_RESPONSE_DONE
receive DLM_DEREF_RESPONSE_DONE and
prepare to purge lockres
Node 2 goes down
find Node2 down and do local
clean up for Node2:
dlm_do_local_recovery_cleanup
-> clear DLM_LOCK_RES_DROPPING_REF
when purging lockres, BUG_ON happens
because DLM_LOCK_RES_DROPPING_REF is clear:
dlm_deref_lockres_done_handler
->BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
Fixes: 60d663cb5273 ("ocfs2/dlm: add DEREF_DONE message")
Signed-off-by: Jun Piao <piaojun@huawei.com>
---
fs/ocfs2/dlm/dlmmaster.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 9aed6e2..f72e7ae 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -2416,7 +2416,16 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
}
spin_lock(&res->spinlock);
- BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
+ if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
+ spin_unlock(&res->spinlock);
+ spin_unlock(&dlm->spinlock);
+ mlog(ML_NOTICE, "%s:%.*s: node %u sends deref done "
+ "but it is already derefed!\n", dlm->name,
+ res->lockname.len, res->lockname.name, node);
+ dlm_lockres_put(res);
+ goto done;
+ }
+
if (!list_empty(&res->purge)) {
mlog(0, "%s: Removing res %.*s from purgelist\n",
dlm->name, res->lockname.len, res->lockname.name);
@@ -2455,6 +2464,8 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
spin_unlock(&dlm->spinlock);
+ ret = 0;
+
done:
dlm_put(dlm);
return ret;
--
1.8.4.3
next reply other threads:[~2016-07-10 10:01 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-10 10:01 piaojun [this message]
2016-07-11 1:55 ` [Ocfs2-devel] ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF, is cleared before dlm_deref_lockres_done_handler Joseph Qi
2016-07-11 2:17 ` piaojun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57821D09.1040306@huawei.com \
--to=piaojun@huawei.com \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.