From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 32/91] vfs: remove lockdep bogosity in __sb_start_write
Date: Mon, 23 Nov 2020 13:21:52 +0100
Message-ID: <20201123121810.885615538@linuxfoundation.org>
In-Reply-To: <20201123121809.285416732@linuxfoundation.org>

From: Darrick J. Wong <darrick.wong@oracle.com>

[ Upstream commit 22843291efc986ce7722610073fcf85a39b4cb13 ]

__sb_start_write has some weird looking lockdep code that claims to
exist to handle nested freeze locking requests from xfs.  The code as
written seems broken -- if we think we hold a read lock on any of the
higher freeze levels (e.g. we hold SB_FREEZE_WRITE and are trying to
lock SB_FREEZE_PAGEFAULT), it converts a blocking lock attempt into a
trylock.

However, it's not correct to downgrade a blocking lock attempt to a
trylock unless the downgrading code or the callers are prepared to deal
with that situation.  Neither __sb_start_write nor its callers handle
this at all.  For example:

sb_start_pagefault ignores the return value completely, with the result
that if xfs_filemap_fault loses a race with a different thread trying to
fsfreeze, it will proceed without pagefault freeze protection (thereby
breaking locking rules) and then unlock the pagefault freeze lock that
it doesn't own on its way out (thereby corrupting the lock state), which
leads to a system hang shortly afterwards.
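
The failure mode described above can be sketched in userspace with a toy,
single-threaded model. This is only an illustration, not kernel code:
struct toy_rwsem and every toy_* function are hypothetical names invented
here, and a plain counter stands in for the per-cpu rwsem. It shows how a
caller that ignores a failed trylock and then unlocks unconditionally
leaves the reader count corrupted.

```c
#include <stdbool.h>

/* Toy stand-in for one freeze level; NOT the kernel's percpu_rw_semaphore. */
struct toy_rwsem {
	int readers;		/* read holds currently outstanding */
	bool write_locked;	/* true while a simulated freezer holds the write side */
};

/* Non-blocking read acquire: returns 1 on success, 0 if a writer holds the lock. */
static int toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->write_locked)
		return 0;
	sem->readers++;
	return 1;
}

/* Read release; assumes the caller really holds a read lock. */
static void toy_up_read(struct toy_rwsem *sem)
{
	sem->readers--;
}

/*
 * Buggy caller pattern: the trylock result is ignored and the unlock is
 * unconditional. Returns the reader count left behind.
 */
static int toy_fault_ignoring_result(struct toy_rwsem *sem)
{
	(void)toy_down_read_trylock(sem);
	/* ... the fault work would run here, possibly unprotected ... */
	toy_up_read(sem);
	return sem->readers;
}

/* Careful caller pattern: back off when the trylock fails. */
static int toy_fault_checking_result(struct toy_rwsem *sem)
{
	if (!toy_down_read_trylock(sem))
		return sem->readers;	/* lock not taken, so nothing to release */
	toy_up_read(sem);
	return sem->readers;
}
```

With write_locked set (a freeze in progress), the first helper leaves
readers at -1 while the second leaves it at 0. A real per-cpu rwsem fails
in more complicated ways, but the unbalanced release is the same.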

Normally, this won't happen because our ownership of a read lock on a
higher freeze protection level blocks fsfreeze from grabbing a write
lock on that higher level.  *However*, if lockdep is offline,
lock_is_held_type unconditionally returns 1, which means that
percpu_rwsem_is_held returns 1, which means that __sb_start_write
unconditionally converts blocking freeze lock attempts into trylocks,
even when we *don't* hold anything that would block a fsfreeze.
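
A minimal sketch of that chain of consequences, again as a userspace toy
rather than kernel code. struct toy_lockdep, toy_is_held and
toy_force_trylock are hypothetical names; the "offline means report held"
rule is modelled on the behaviour described above, and the loop bounds
follow the snippet this patch removes (level is 1-based, so
SB_FREEZE_WRITE is 1 and SB_FREEZE_PAGEFAULT is 2).

```c
#include <stdbool.h>

#define TOY_LEVELS 3	/* one slot per freeze level, in locking order */

/* Toy view of what lockdep knows about the current task. */
struct toy_lockdep {
	bool online;		/* false once lockdep has shut itself off */
	bool held[TOY_LEVELS];	/* read holds that lockdep has tracked */
};

/* When offline, the answer is unknown, so the toy reports "held". */
static int toy_is_held(const struct toy_lockdep *ld, int idx)
{
	if (!ld->online)
		return 1;
	return ld->held[idx];
}

/*
 * Mirrors the decision in the removed loop: a blocking request
 * (wait == true) is downgraded to a trylock if any higher freeze level
 * appears to be held.
 */
static bool toy_force_trylock(const struct toy_lockdep *ld, int level, bool wait)
{
	int i;

	if (!wait)
		return false;
	for (i = 0; i < level - 1; i++)
		if (toy_is_held(ld, i))
			return true;
	return false;
}
```

With lockdep online and nothing held, a blocking request at level 2 stays
blocking. With lockdep offline, the same request is downgraded even though
nothing is held. A request at level 1 is never downgraded because the loop
body does not run.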

Apparently this all held together until 5.10-rc1, when bugs in lockdep
caused lockdep to shut itself off early in an fstests run, and once
fstests gets to the "race writes with freezer" tests, kaboom.  This
might explain the long trail of vanishingly infrequent livelocks in
fstests after lockdep goes offline that I've never been able to
diagnose.

We could fix it by spinning on the trylock if wait==true, but AFAICT the
locking works fine if lockdep is not built at all (and I didn't see any
complaints running fstests overnight), so remove this snippet entirely.

NOTE: Commit f4b554af9931 in 2015 created the current weird logic (which
used to exist in a different form in commit 5accdf82ba25c from 2012) in
__sb_start_write.  XFS solved this whole problem in the late 2.6 era by
creating a variant of transactions (XFS_TRANS_NO_WRITECOUNT) that don't
grab intwrite freeze protection, thus making lockdep's solution
unnecessary.  The commit claims that Dave Chinner explained that the
trylock hack + comment could be removed, but nobody ever did.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/super.c | 33 ++++-----------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index f3a8c008e1643..9fb4553c46e63 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1360,36 +1360,11 @@ EXPORT_SYMBOL(__sb_end_write);
  */
 int __sb_start_write(struct super_block *sb, int level, bool wait)
 {
-	bool force_trylock = false;
-	int ret = 1;
+	if (!wait)
+		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
 
-#ifdef CONFIG_LOCKDEP
-	/*
-	 * We want lockdep to tell us about possible deadlocks with freezing
-	 * but it's it bit tricky to properly instrument it. Getting a freeze
-	 * protection works as getting a read lock but there are subtle
-	 * problems. XFS for example gets freeze protection on internal level
-	 * twice in some cases, which is OK only because we already hold a
-	 * freeze protection also on higher level. Due to these cases we have
-	 * to use wait == F (trylock mode) which must not fail.
-	 */
-	if (wait) {
-		int i;
-
-		for (i = 0; i < level - 1; i++)
-			if (percpu_rwsem_is_held(sb->s_writers.rw_sem + i)) {
-				force_trylock = true;
-				break;
-			}
-	}
-#endif
-	if (wait && !force_trylock)
-		percpu_down_read(sb->s_writers.rw_sem + level-1);
-	else
-		ret = percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
-
-	WARN_ON(force_trylock && !ret);
-	return ret;
+	percpu_down_read(sb->s_writers.rw_sem + level-1);
+	return 1;
 }
 EXPORT_SYMBOL(__sb_start_write);
 
-- 
2.27.0