All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Boris Burkov <boris@bur.io>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.9 14/61] btrfs: fix mount failure caused by race with umount
Date: Thu, 30 Jul 2020 10:04:32 +0200	[thread overview]
Message-ID: <20200730074421.512942134@linuxfoundation.org> (raw)
In-Reply-To: <20200730074420.811058810@linuxfoundation.org>

From: Boris Burkov <boris@bur.io>

[ Upstream commit 48cfa61b58a1fee0bc49eef04f8ccf31493b7cdd ]

It is possible to cause a btrfs mount to fail by racing it with a slow
umount. The crux of the sequence is generic_shutdown_super not yet
calling sop->put_super before btrfs_mount_root calls btrfs_open_devices.
If that occurs, btrfs_open_devices will decide the opened counter is
non-zero, increment it, and skip resetting fs_devices->total_rw_bytes to
0. From here, mount will call sget which will result in grab_super
trying to take the super block umount semaphore. That semaphore will be
held by the slow umount, so mount will block. Before up-ing the
semaphore, umount will delete the super block, resulting in mount's sget
reliably allocating a new one, which causes the mount path to dutifully
fill it out, and increment total_rw_bytes a second time, which causes
the mount to fail, as we see double the expected bytes.

Here is the sequence laid out in greater detail:

CPU0                                                    CPU1
down_write sb->s_umount
btrfs_kill_super
  kill_anon_super(sb)
    generic_shutdown_super(sb);
      shrink_dcache_for_umount(sb);
      sync_filesystem(sb);
      evict_inodes(sb); // SLOW

                                              btrfs_mount_root
                                                btrfs_scan_one_device
                                                fs_devices = device->fs_devices
                                                fs_info->fs_devices = fs_devices
                                                // fs_devices-opened makes this a no-op
                                                btrfs_open_devices(fs_devices, mode, fs_type)
                                                s = sget(fs_type, test, set, flags, fs_info);
                                                  find sb in s_instances
                                                  grab_super(sb);
                                                    down_write(&s->s_umount); // blocks

      sop->put_super(sb)
        // sb->fs_devices->opened == 2; no-op
      spin_lock(&sb_lock);
      hlist_del_init(&sb->s_instances);
      spin_unlock(&sb_lock);
      up_write(&sb->s_umount);
                                                    return 0;
                                                  retry lookup
                                                  don't find sb in s_instances (deleted by CPU0)
                                                  s = alloc_super
                                                  return s;
                                                btrfs_fill_super(s, fs_devices, data)
                                                  open_ctree // fs_devices total_rw_bytes improperly set!
                                                    btrfs_read_chunk_tree
                                                      read_one_dev // increment total_rw_bytes again!!
                                                      super_total_bytes < fs_devices->total_rw_bytes // ERROR!!!

To fix this, we clear total_rw_bytes from within btrfs_read_chunk_tree
before the calls to read_one_dev, while holding the sb umount semaphore
and the uuid mutex.

To reproduce, it is sufficient to dirty a decent number of inodes, then
quickly umount and mount.

  for i in $(seq 0 500)
  do
    dd if=/dev/zero of="/mnt/foo/$i" bs=1M count=1
  done
  umount /mnt/foo&
  mount /mnt/foo

does the trick for me.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/volumes.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 70aa22a8a9cce..bace03a546b2d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6854,6 +6854,14 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
 	mutex_lock(&uuid_mutex);
 	lock_chunks(root);
 
+	/*
+	 * It is possible for mount and umount to race in such a way that
+	 * we execute this code path, but open_fs_devices failed to clear
+	 * total_rw_bytes. We certainly want it cleared before reading the
+	 * device items, so clear it here.
+	 */
+	root->fs_info->fs_devices->total_rw_bytes = 0;
+
 	/*
 	 * Read all device items, and then all the chunk items. All
 	 * device items are found before any chunk item (their object id
-- 
2.25.1




  parent reply	other threads:[~2020-07-30  8:17 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-30  8:04 [PATCH 4.9 00/61] 4.9.232-rc1 review Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 01/61] pinctrl: amd: fix npins for uart0 in kerncz_groups Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 02/61] mac80211: allow rx of mesh eapol frames with default rx key Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 03/61] scsi: scsi_transport_spi: Fix function pointer check Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 04/61] xtensa: fix __sync_fetch_and_{and,or}_4 declarations Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 05/61] xtensa: update *pos in cpuinfo_op.next Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 06/61] drivers/net/wan/lapbether: Fixed the value of hard_header_len Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 07/61] net: sky2: initialize return of gm_phy_read Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 08/61] drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeout Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 09/61] SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion") Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 10/61] uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 11/61] ALSA: info: Drop WARN_ON() from buffer NULL sanity check Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 12/61] ASoC: rt5670: Correct RT5670_LDO_SEL_MASK Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 13/61] btrfs: fix double free on ulist after backref resolution failure Greg Kroah-Hartman
2020-07-30  8:04 ` Greg Kroah-Hartman [this message]
2020-07-30  8:04 ` [PATCH 4.9 15/61] bnxt_en: Fix race when modifying pause settings Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 16/61] hippi: Fix a size used in a pci_free_consistent() in an error handling path Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 17/61] ax88172a: fix ax88172a_unbind() failures Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 18/61] net: dp83640: fix SIOCSHWTSTAMP to update the struct with actual configuration Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 19/61] net: smc91x: Fix possible memory leak in smc_drv_probe() Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 20/61] scripts/decode_stacktrace: strip basepath from all paths Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 21/61] HID: i2c-hid: add Mediacom FlexBook edge13 to descriptor override Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 22/61] HID: apple: Disable Fn-key key-re-mapping on clone keyboards Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 23/61] dmaengine: tegra210-adma: Fix runtime PM imbalance on error Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 24/61] regmap: dev_get_regmap_match(): fix string comparison Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 25/61] dmaengine: ioat setting ioat timeout as module parameter Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 26/61] usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init() Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 27/61] arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 28/61] x86: math-emu: Fix up cmp insn for clang ias Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 29/61] usb: xhci-mtk: fix the failure of bandwidth allocation Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 30/61] usb: xhci: Fix ASM2142/ASM3142 DMA addressing Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 31/61] Revert "cifs: Fix the target file was deleted when rename failed." Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 32/61] staging: wlan-ng: properly check endpoint types Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 33/61] staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 34/61] staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 35/61] staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 36/61] staging: comedi: addi_apci_1564: " Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 37/61] serial: 8250: fix null-ptr-deref in serial8250_start_tx() Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 38/61] serial: 8250_mtk: Fix high-speed baud rates clamping Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 39/61] vt: Reject zero-sized screen buffer size Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 40/61] Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation Greg Kroah-Hartman
2020-07-30  8:04 ` [PATCH 4.9 41/61] mm/memcg: fix refcount error while moving and swapping Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 42/61] io-mapping: indicate mapping failure Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 43/61] parisc: Add atomic64_set_release() define to avoid CPU soft lockups Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 44/61] ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 45/61] ath9k: Fix regression with Atheros 9271 Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 46/61] AX.25: Fix out-of-bounds read in ax25_connect() Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 47/61] AX.25: Prevent out-of-bounds read in ax25_sendmsg() Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 48/61] dev: Defer free of skbs in flush_backlog Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 49/61] net-sysfs: add a newline when printing tx_timeout by sysfs Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 50/61] net: udp: Fix wrong clean up for IS_UDPLITE macro Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 51/61] rxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATA Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 52/61] AX.25: Prevent integer overflows in connect and sendmsg Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 53/61] tcp: allow at most one TLP probe per flight Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 54/61] ip6_gre: fix null-ptr-deref in ip6gre_init_net() Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 55/61] drivers/net/wan/x25_asy: Fix to make it work Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 56/61] regmap: debugfs: check count when read regmap file Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 57/61] xfs: set format back to extents if xfs_bmap_extents_to_btree Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 58/61] perf probe: Fix to check blacklist address correctly Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 59/61] perf annotate: Use asprintf when formatting objdump command line Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 60/61] perf tools: Fix snprint warnings for gcc 8 Greg Kroah-Hartman
2020-07-30  8:05 ` [PATCH 4.9 61/61] perf: Make perf able to build with latest libbfd Greg Kroah-Hartman
2020-07-30 16:46 ` [PATCH 4.9 00/61] 4.9.232-rc1 review Guenter Roeck
2020-07-31 12:41 ` Jon Hunter
2020-07-31 12:43 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200730074421.512942134@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=boris@bur.io \
    --cc=dsterba@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.