From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Bob Peterson <rpeterso@redhat.com>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Sasha Levin <sashal@kernel.org>,
	cluster-devel@redhat.com
Subject: [PATCH AUTOSEL 5.12 21/49] gfs2: fix a deadlock on withdraw-during-mount
Date: Mon,  7 Jun 2021 12:11:47 -0400
Message-ID: <20210607161215.3583176-21-sashal@kernel.org>
In-Reply-To: <20210607161215.3583176-1-sashal@kernel.org>

From: Bob Peterson <rpeterso@redhat.com>

[ Upstream commit 865cc3e9cc0b1d4b81c10d53174bced76decf888 ]

Before this patch, gfs2 would deadlock because of the following
sequence during mount:

mount
   gfs2_fill_super
      gfs2_make_fs_rw <--- Detects IO error with glock
         kthread_stop(sdp->sd_quotad_process);
            <--- Blocked waiting for quotad to finish

logd
   Detects IO error and the need to withdraw
   calls gfs2_withdraw
      gfs2_make_fs_ro
         kthread_stop(sdp->sd_quotad_process);
            <--- Blocked waiting for quotad to finish

gfs2_quotad
   gfs2_statfs_sync
      gfs2_glock_wait <--- Blocked waiting for statfs glock to be granted

glock_work_func
   do_xmote <--- Detects IO error, can't release glock: blocked on withdraw
      glops->go_inval
      glock_blocked_by_withdraw
         requeue glock work & exit <--- work requeued, blocked by withdraw

This patch makes a special exception for the statfs system inode glock,
which allows the statfs glock UNLOCK to proceed normally. That allows the
quotad daemon to exit during the withdraw, which allows the logd daemon
to exit during the withdraw, which allows the mount to exit.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/gfs2/glock.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
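
Note (not part of the commit): the change to the requeue decision in
do_xmote() can be modelled in isolation. The sketch below is a
user-space illustration only, not kernel code. The struct xmote_model
type and the model_requeues_old()/model_requeues_new() helpers are
hypothetical names introduced here; each boolean stands in for the
kernel expression named in its comment. It assumes the only behavioural
difference of interest is whether the glock work is requeued; the
patched code additionally clears GLF_INVALIDATE_IN_PROGRESS when it
lets a system glock through, which the model does not represent.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state do_xmote() examines. */
struct xmote_model {
	bool blocked_by_withdraw; /* glock_blocked_by_withdraw(gl) */
	bool target_unlocked;     /* target == LM_ST_UNLOCKED */
	bool withdraw_recovery;   /* SDF_WITHDRAW_RECOVERY set in sd_flags */
	bool system_glock;        /* is_system_glock(gl): statfs inode glock */
};

/* Before the patch: any blocked glock is requeued unless this is a
 * plain unlock with no withdraw recovery in progress. */
static bool model_requeues_old(const struct xmote_model *m)
{
	return m->blocked_by_withdraw &&
	       (!m->target_unlocked || m->withdraw_recovery);
}

/* After the patch: same condition, but a system glock is allowed to
 * proceed instead of being requeued. */
static bool model_requeues_new(const struct xmote_model *m)
{
	if (m->blocked_by_withdraw &&
	    (!m->target_unlocked || m->withdraw_recovery))
		return !m->system_glock;
	return false;
}
```

With blocked_by_withdraw and system_glock both true and the target not
LM_ST_UNLOCKED, model_requeues_old() returns true (the requeue loop seen
in the glock_work_func trace above) while model_requeues_new() returns
false, so the request goes ahead and gfs2_quotad is no longer stuck in
gfs2_glock_wait. For non-system glocks the two helpers agree.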

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 9567520d79f7..142f746d7b33 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -583,6 +583,16 @@ static void finish_xmote(struct gfs2_glock *gl, unsigned int ret)
 	spin_unlock(&gl->gl_lockref.lock);
 }
 
+static bool is_system_glock(struct gfs2_glock *gl)
+{
+	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+	struct gfs2_inode *m_ip = GFS2_I(sdp->sd_statfs_inode);
+
+	if (gl == m_ip->i_gl)
+		return true;
+	return false;
+}
+
 /**
  * do_xmote - Calls the DLM to change the state of a lock
  * @gl: The lock state
@@ -672,17 +682,25 @@ __acquires(&gl->gl_lockref.lock)
 	 * to see sd_log_error and withdraw, and in the meantime, requeue the
 	 * work for later.
 	 *
+	 * We make a special exception for some system glocks, such as the
+	 * system statfs inode glock, which needs to be granted before the
+	 * gfs2_quotad daemon can exit, and that exit needs to finish before
+	 * we can unmount the withdrawn file system.
+	 *
 	 * However, if we're just unlocking the lock (say, for unmount, when
 	 * gfs2_gl_hash_clear calls clear_glock) and recovery is complete
 	 * then it's okay to tell dlm to unlock it.
 	 */
 	if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
 		gfs2_withdraw_delayed(sdp);
-	if (glock_blocked_by_withdraw(gl)) {
-		if (target != LM_ST_UNLOCKED ||
-		    test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags)) {
+	if (glock_blocked_by_withdraw(gl) &&
+	    (target != LM_ST_UNLOCKED ||
+	     test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags))) {
+		if (!is_system_glock(gl)) {
 			gfs2_glock_queue_work(gl, GL_GLOCK_DFT_HOLD);
 			goto out;
+		} else {
+			clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
 		}
 	}
 
-- 
2.30.2

