From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Changwei Ge <gechangwei@live.cn>,
	Joseph Qi <joseph.qi@linux.alibaba.com>,
	Mark Fasheh <mark@fasheh.com>, Joel Becker <jlbec@evilplan.org>,
	Junxiao Bi <junxiao.bi@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.4 21/36] ocfs2: wait for recovering done after direct unlock request
Date: Sun,  6 Oct 2019 19:19:03 +0200
Message-ID: <20191006171053.763416052@linuxfoundation.org>
In-Reply-To: <20191006171038.266461022@linuxfoundation.org>

From: Changwei Ge <gechangwei@live.cn>

[ Upstream commit 0a3775e4f883912944481cf2ef36eb6383a9cc74 ]

There is a scenario in which ocfs2 umount hangs when multiple hosts are
rebooting at the same time.

NODE1                           NODE2               NODE3
send unlock request to NODE2
                                dies
                                                    become recovery master
                                                    recover NODE2
find NODE2 dead
mark resource RECOVERING
directly remove lock from grant list
calculate usage but RECOVERING marked
**miss the window of purging
clear RECOVERING

To reproduce this issue, crash a host and then umount ocfs2
from another node.

To solve this, make the unlock path wait until recovery is done.

Link: http://lkml.kernel.org/r/1550124866-20367-1-git-send-email-gechangwei@live.cn
Signed-off-by: Changwei Ge <gechangwei@live.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ocfs2/dlm/dlmunlock.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)
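
For readers unfamiliar with the wait pattern used in the last hunk, here is
a rough userspace analogy, not the kernel implementation. It assumes a
pthread mutex and condition variable in place of res->spinlock and res->wq.
All sketch_* names and SKETCH_RES_RECOVERING are hypothetical and exist only
for illustration. The waiter holds the lock, sleeps while the flag is set,
and returns with the lock held once the recovery side clears the flag.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flag; stands in for DLM_LOCK_RES_RECOVERING. */
#define SKETCH_RES_RECOVERING 0x1u

/* Hypothetical, heavily reduced stand-in for a lock resource. */
struct sketch_lockres {
	pthread_mutex_t lock;	/* plays the role of res->spinlock */
	pthread_cond_t wq;	/* plays the role of res->wq */
	unsigned int state;
};

/*
 * Caller must hold res->lock.  Sleeps (dropping the lock while asleep)
 * until none of @flags is set, then returns with the lock held again.
 */
static void sketch_wait_on_flags(struct sketch_lockres *res,
				 unsigned int flags)
{
	while (res->state & flags)
		pthread_cond_wait(&res->wq, &res->lock);
}

/* Recovery side: clear RECOVERING under the lock and wake all waiters. */
static void *sketch_finish_recovery(void *arg)
{
	struct sketch_lockres *res = arg;

	pthread_mutex_lock(&res->lock);
	res->state &= ~SKETCH_RES_RECOVERING;
	pthread_cond_broadcast(&res->wq);
	pthread_mutex_unlock(&res->lock);
	return NULL;
}

/*
 * Unlock side: wait for recovery to finish before going on.
 * Returns 0 if RECOVERING is clear after the wait, 1 if it is still
 * set, and -1 if the helper thread could not be started.
 */
static int sketch_unlock_then_wait(void)
{
	struct sketch_lockres res;
	pthread_t recovery;
	int still_recovering;

	pthread_mutex_init(&res.lock, NULL);
	pthread_cond_init(&res.wq, NULL);
	res.state = SKETCH_RES_RECOVERING;

	if (pthread_create(&recovery, NULL, sketch_finish_recovery, &res))
		return -1;

	pthread_mutex_lock(&res.lock);
	sketch_wait_on_flags(&res, SKETCH_RES_RECOVERING);
	still_recovering = (res.state & SKETCH_RES_RECOVERING) != 0;
	pthread_mutex_unlock(&res.lock);

	pthread_join(recovery, NULL);
	pthread_cond_destroy(&res.wq);
	pthread_mutex_destroy(&res.lock);
	return still_recovering ? 1 : 0;
}
```

Calling sketch_unlock_then_wait() from a small main() and checking for a
zero return is enough to exercise the sketch. The flag is rechecked in a
loop under the lock, so a wakeup that arrives before the waiter sleeps is
not lost.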

diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
index 2e3c9dbab68c9..d137d4692b918 100644
--- a/fs/ocfs2/dlm/dlmunlock.c
+++ b/fs/ocfs2/dlm/dlmunlock.c
@@ -105,7 +105,8 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
 	enum dlm_status status;
 	int actions = 0;
 	int in_use;
-        u8 owner;
+	u8 owner;
+	int recovery_wait = 0;
 
 	mlog(0, "master_node = %d, valblk = %d\n", master_node,
 	     flags & LKM_VALBLK);
@@ -208,9 +209,12 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
 		}
 		if (flags & LKM_CANCEL)
 			lock->cancel_pending = 0;
-		else
-			lock->unlock_pending = 0;
-
+		else {
+			if (!lock->unlock_pending)
+				recovery_wait = 1;
+			else
+				lock->unlock_pending = 0;
+		}
 	}
 
 	/* get an extra ref on lock.  if we are just switching
@@ -244,6 +248,17 @@ leave:
 	spin_unlock(&res->spinlock);
 	wake_up(&res->wq);
 
+	if (recovery_wait) {
+		spin_lock(&res->spinlock);
+		/* Unlock request will directly succeed after owner dies,
+		 * and the lock is already removed from grant list. We have to
+		 * wait for RECOVERING done or we miss the chance to purge it
+		 * since the removement is much faster than RECOVERING proc.
+		 */
+		__dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_RECOVERING);
+		spin_unlock(&res->spinlock);
+	}
+
 	/* let the caller's final dlm_lock_put handle the actual kfree */
 	if (actions & DLM_UNLOCK_FREE_LOCK) {
 		/* this should always be coupled with list removal */
-- 
2.20.1



