* [folded-merged] mle-releases-issue.patch removed from -mm tree
@ 2016-11-23 0:36 akpm
From: akpm @ 2016-11-23 0:36 UTC (permalink / raw)
To: ge.changwei, jiangqi903, jlbec, junxiao.bi, mfasheh, mm-commits
The patch titled
Subject: ocfs: fix MLE release issue
has been removed from the -mm tree. Its filename was
mle-releases-issue.patch
This patch was dropped because it was folded into ocfs2-clean-up-unused-page-parameter-in-ocfs2_write_end_nolock.patch
------------------------------------------------------
From: Gechangwei <ge.changwei@h3c.com>
Subject: ocfs: fix MLE release issue
While testing OCFS2 under a storage failure, I hit a crash. The call
trace from the crash is shown below. It shows an MLE's reference count
about to go negative, which triggered a BUG_ON():
[143355.593258] Call Trace:
[143355.593268] [<ffffffffc0328447>] dlm_put_mle_inuse+0x47/0x70 [ocfs2_dlm]
[143355.593276] [<ffffffffc032bee5>] dlm_get_lock_resource+0xac5/0x10d0 [ocfs2_dlm]
[143355.593286] [<ffffffff81724a7a>] ? ip_queue_xmit+0x14a/0x3d0
[143355.593292] [<ffffffff811e50b4>] ? kmem_cache_alloc+0x1e4/0x220
[143355.593300] [<ffffffffc03215cc>] ? dlm_wait_for_recovery+0x6c/0x190 [ocfs2_dlm]
[143355.593311] [<ffffffffc0335c4d>] dlmlock+0x62d/0x16e0 [ocfs2_dlm]
[143355.593316] [<ffffffff816cfbab>] ? __alloc_skb+0x9b/0x2b0
[143355.593323] [<ffffffffc01f6000>] ? 0xffffffffc01f6000
I think I have probably found the root cause of this issue. Please see
the following sequence of events between Node 1 and Node 2:
Storage failure
An assert master message is sent to Node 1
Treat Node2 as down
Assert master handler
Decrease MLE reference count
Clean blocked MLE
Decrease MLE reference count
In the above scenario, both dlm_assert_master_handler and
dlm_clean_block_mle will decrease the MLE reference count. As a result,
in the subsequent dlm_get_lock_resource() path, the reference count is
about to go negative.
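The double decrement can be replayed in a small user-space sketch. This
is only a toy model, not the kernel code: struct toy_mle, toy_put() and
the starting count of two references (one for the waiter, one extra for
the pending assert) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DLM master list entry (not the kernel struct). */
struct toy_mle {
	int refs; /* simplified reference count */
};

/* Drop one reference; report failure where the kernel would hit BUG_ON(). */
static bool toy_put(struct toy_mle *mle)
{
	if (mle->refs <= 0)
		return false; /* the count would go negative */
	mle->refs--;
	return true;
}

/* Pre-patch behaviour as described above: each path drops the extra ref. */
static bool toy_assert_master_handler(struct toy_mle *mle)
{
	return toy_put(mle);
}

static bool toy_clean_block_mle(struct toy_mle *mle)
{
	return toy_put(mle);
}

/*
 * Replay the scenario: both paths run for the same MLE, then the waiter
 * drops its own reference. Returns true if that final put is safe.
 */
static bool toy_final_put_is_safe(void)
{
	struct toy_mle mle = { .refs = 2 }; /* assumed: waiter ref + extra ref */

	toy_assert_master_handler(&mle); /* 2 -> 1 */
	toy_clean_block_mle(&mle);       /* 1 -> 0, the unintended second drop */
	return toy_put(&mle);            /* waiter's put finds nothing left */
}
```

In this model toy_final_put_is_safe() reports false, which corresponds
to the point where dlm_put_mle_inuse appears in the call trace.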
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373220C9A5B@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: gechangwei <ge.changwei@h3c.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/ocfs2/dlm/dlmmaster.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff -puN fs/ocfs2/dlm/dlmmaster.c~mle-releases-issue fs/ocfs2/dlm/dlmmaster.c
--- a/fs/ocfs2/dlm/dlmmaster.c~mle-releases-issue
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -1935,7 +1935,7 @@ ok:
spin_lock(&mle->spinlock);
if (mle->type == DLM_MLE_BLOCK || mle->type == DLM_MLE_MIGRATION)
- extra_ref = 1;
+ extra_ref = test_bit(assert->node_idx, mle->maybe_map) ? 1 : 0;
else {
/* MASTER mle: if any bits set in the response map
* then the calling node needs to re-assert to clear
@@ -3338,12 +3338,17 @@ static void dlm_clean_block_mle(struct d
mlog(0, "mle found, but dead node %u would not have been "
"master\n", dead_node);
spin_unlock(&mle->spinlock);
+ } else if(mle->master != O2NM_MAX_NODES){
+ mlog(ML_NOTICE, "mle found, master assert received, master has "
+ "already set to %d.\n ", mle->master);
+ spin_unlock(&mle->spinlock);
} else {
/* Must drop the refcount by one since the assert_master will
* never arrive. This may result in the mle being unlinked and
* freed, but there may still be a process waiting in the
* dlmlock path which is fine. */
mlog(0, "node %u was expected master\n", dead_node);
+ clear_bit(bit, mle->maybe_map);
atomic_set(&mle->woken, 1);
spin_unlock(&mle->spinlock);
wake_up(&mle->wq);
_
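One way to read the two hunks above is that the extra reference is now
dropped by exactly one of the two paths, whichever runs first. The
sketch below models that reading in user space. The names guarded_mle
and TOY_NO_MASTER, the plain unsigned long bitmap and the starting
count of two references are assumptions for illustration, not the
kernel's data structures.

```c
#include <stdbool.h>

#define TOY_NO_MASTER (-1) /* stand-in for O2NM_MAX_NODES ("no master yet") */

/* Hypothetical, simplified model of an MLE. */
struct guarded_mle {
	int refs;                /* simplified reference count */
	int master;              /* TOY_NO_MASTER until an assert is handled */
	unsigned long maybe_map; /* bit n set: node n may still assert */
};

/* Models the first hunk: drop the extra ref only if the bit is still set. */
static void guarded_assert_master(struct guarded_mle *mle, int node)
{
	bool extra_ref = (mle->maybe_map >> node) & 1UL;

	mle->master = node;
	if (extra_ref)
		mle->refs--;
}

/* Models the second hunk: skip if an assert was handled, else clear and drop. */
static void guarded_clean_block(struct guarded_mle *mle, int dead_node)
{
	if (mle->master != TOY_NO_MASTER)
		return; /* assert already received, nothing to drop */
	mle->maybe_map &= ~(1UL << dead_node);
	mle->refs--;
}

/* Run both paths in the given order and return the remaining count. */
static int guarded_refs_after(bool assert_first)
{
	const int node = 2;
	struct guarded_mle mle = {
		.refs = 2, /* assumed: waiter ref + extra ref */
		.master = TOY_NO_MASTER,
		.maybe_map = 1UL << node,
	};

	if (assert_first) {
		guarded_assert_master(&mle, node);
		guarded_clean_block(&mle, node);
	} else {
		guarded_clean_block(&mle, node);
		guarded_assert_master(&mle, node);
	}
	return mle.refs;
}
```

In this model either ordering leaves one reference for the waiter, so
its final put no longer underflows.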
Patches currently in -mm which might be from ge.changwei@h3c.com are
ocfs2-clean-up-unused-page-parameter-in-ocfs2_write_end_nolock.patch