All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>
Subject: [PATCH 5.4 16/61] btrfs: fix possible free space tree corruption with online conversion
Date: Tue,  2 Feb 2021 14:37:54 +0100	[thread overview]
Message-ID: <20210202132947.155342102@linuxfoundation.org> (raw)
In-Reply-To: <20210202132946.480479453@linuxfoundation.org>

From: Josef Bacik <josef@toxicpanda.com>

commit 2f96e40212d435b328459ba6b3956395eed8fa9f upstream.

While running btrfs/011 in a loop I would often ASSERT() while trying to
add a new free space entry that already existed, or get an EEXIST while
adding a new block to the extent tree, which is another indication of
double allocation.

This occurs because when we do the free space tree population, we create
the new root and then populate the tree and commit the transaction.
The problem is when you create a new root, the root node and commit root
node are the same.  During this initial transaction commit we will run
all of the delayed refs that were paused during the free space tree
generation, and thus begin to cache block groups.  While caching block
groups the caching thread will be reading from the main root for the
free space tree, so as we make allocations we'll be changing the free
space tree, which can cause us to add the same range twice which results
in either the ASSERT(ret != -EEXIST); in __btrfs_add_free_space, or in a
variety of different errors when running delayed refs because of a
double allocation.

Fix this by marking the fs_info as unsafe to load the free space tree,
and fall back on the old slow method.  We could be smarter than this,
for example caching the block group while we're populating the free
space tree, but since this is a serious problem I've opted for the
simplest solution.

CC: stable@vger.kernel.org # 4.9+
Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/block-group.c     |   10 +++++++++-
 fs/btrfs/ctree.h           |    3 +++
 fs/btrfs/free-space-tree.c |   10 +++++++++-
 3 files changed, 21 insertions(+), 2 deletions(-)

--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -640,7 +640,15 @@ static noinline void caching_thread(stru
 	mutex_lock(&caching_ctl->mutex);
 	down_read(&fs_info->commit_root_sem);
 
-	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE))
+	/*
+	 * If we are in the transaction that populated the free space tree we
+	 * can't actually cache from the free space tree as our commit root and
+	 * real root are the same, so we could change the contents of the blocks
+	 * while caching.  Instead do the slow caching in this case, and after
+	 * the transaction has committed we will be safe.
+	 */
+	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
+	    !(test_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags)))
 		ret = load_free_space_tree(caching_ctl);
 	else
 		ret = load_extent_tree_free(caching_ctl);
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -136,6 +136,9 @@ enum {
 	BTRFS_FS_STATE_DEV_REPLACING,
 	/* The btrfs_fs_info created for self-tests */
 	BTRFS_FS_STATE_DUMMY_FS_INFO,
+
+	/* Indicate that we can't trust the free space tree for caching yet */
+	BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
 };
 
 #define BTRFS_BACKREF_REV_MAX		256
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1149,6 +1149,7 @@ int btrfs_create_free_space_tree(struct
 		return PTR_ERR(trans);
 
 	set_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+	set_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
 	free_space_root = btrfs_create_tree(trans,
 					    BTRFS_FREE_SPACE_TREE_OBJECTID);
 	if (IS_ERR(free_space_root)) {
@@ -1170,11 +1171,18 @@ int btrfs_create_free_space_tree(struct
 	btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE);
 	btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID);
 	clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+	ret = btrfs_commit_transaction(trans);
 
-	return btrfs_commit_transaction(trans);
+	/*
+	 * Now that we've committed the transaction any reading of our commit
+	 * root will be safe, so we can cache from the free space tree now.
+	 */
+	clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
+	return ret;
 
 abort:
 	clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
+	clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
 	btrfs_abort_transaction(trans, ret);
 	btrfs_end_transaction(trans);
 	return ret;



  parent reply	other threads:[~2021-02-02 19:22 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02 13:37 [PATCH 5.4 00/61] 5.4.95-rc1 review Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 01/61] ICMPv6: Add ICMPv6 Parameter Problem, code 3 definition Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 02/61] IPv6: reply ICMP error if the first fragment dont include all headers Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 03/61] nbd: freeze the queue while were adding connections Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 04/61] ACPI: sysfs: Prefer "compatible" modalias Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 05/61] kernel: kexec: remove the lock operation of system_transition_mutex Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 06/61] ALSA: hda/realtek: Enable headset of ASUS B1400CEPE with ALC256 Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 07/61] ALSA: hda/via: Apply the workaround generically for Clevo machines Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 08/61] media: rc: ensure that uevent can be read directly after rc device register Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 09/61] ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 10/61] wext: fix NULL-ptr-dereference with cfg80211s lack of commit() Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 11/61] net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 12/61] s390/vfio-ap: No need to disable IRQ after queue reset Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 13/61] PM: hibernate: flush swap writer after marking Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 14/61] drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 15/61] drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[] Greg Kroah-Hartman
2021-02-02 13:37 ` Greg Kroah-Hartman [this message]
2021-02-02 13:37 ` [PATCH 5.4 17/61] KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[] Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 18/61] KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh() Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 19/61] KVM: nVMX: Sync unsyncd vmcs02 state to vmcs12 on migration Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 20/61] KVM: x86: get smi pending status correctly Greg Kroah-Hartman
2021-02-02 13:37 ` [PATCH 5.4 21/61] KVM: Forbid the use of tagged userspace addresses for memslots Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 22/61] xen: Fix XenStore initialisation for XS_LOCAL Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 23/61] leds: trigger: fix potential deadlock with libata Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 24/61] arm64: dts: broadcom: Fix USB DMA address translation for Stingray Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 25/61] mt7601u: fix kernel crash unplugging the device Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 26/61] mt7601u: fix rx buffer refcounting Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 27/61] drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 28/61] drm/i915: Check for all subplatform bits Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 29/61] tee: optee: replace might_sleep with cond_resched Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 30/61] xen-blkfront: allow discard-* nodes to be optional Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 31/61] ARM: imx: build suspend-imx6.S with arm instruction set Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 32/61] netfilter: nft_dynset: add timeout extension to template Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 33/61] xfrm: Fix oops in xfrm_replay_advance_bmp Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 34/61] xfrm: fix disable_xfrm sysctl when used on xfrm interfaces Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 35/61] selftests: xfrm: fix test return value override issue in xfrm_policy.sh Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 36/61] xfrm: Fix wraparound in xfrm_policy_addr_delta() Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 37/61] arm64: dts: ls1028a: fix the offset of the reset register Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 38/61] ARM: dts: imx6qdl-kontron-samx6i: fix i2c_lcd/cam default status Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 39/61] firmware: imx: select SOC_BUS to fix firmware build Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 40/61] RDMA/cxgb4: Fix the reported max_recv_sge value Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 41/61] ASoC: Intel: Skylake: skl-topology: Fix OOPs ib skl_tplg_complete Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 42/61] pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process() Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 43/61] iwlwifi: pcie: use jiffies for memory read spin time limit Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 44/61] iwlwifi: pcie: reschedule in long-running memory reads Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 45/61] mac80211: pause TX while changing interface type Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 46/61] i40e: acquire VSI pointer only after VF is initialized Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 47/61] igc: fix link speed advertising Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 48/61] net/mlx5: Fix memory leak on flow table creation error flow Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 49/61] net/mlx5e: E-switch, Fix rate calculation for overflow Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 50/61] net/mlx5e: Reduce tc unsupported key print level Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 51/61] can: dev: prevent potential information leak in can_fill_info() Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 52/61] nvme-multipath: Early exit if no path is available Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 53/61] selftests: forwarding: Specify interface when invoking mausezahn Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 54/61] iommu/vt-d: Gracefully handle DMAR units with no supported address widths Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 55/61] iommu/vt-d: Dont dereference iommu_device if IOMMU_API is not built Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 56/61] rxrpc: Fix memory leak in rxrpc_lookup_local Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 57/61] NFC: fix resource leak when target index is invalid Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 58/61] NFC: fix possible resource leak Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 59/61] ASoC: topology: Fix memory corruption in soc_tplg_denum_create_values() Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 60/61] team: protect features update by RCU to avoid deadlock Greg Kroah-Hartman
2021-02-02 13:38 ` [PATCH 5.4 61/61] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN Greg Kroah-Hartman
2021-02-02 20:21 ` [PATCH 5.4 00/61] 5.4.95-rc1 review Jon Hunter
2021-02-03  3:16 ` Naresh Kamboju
2021-02-03 15:38 ` Shuah Khan
2021-02-03 20:42 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210202132947.155342102@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.