All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 25/79] btrfs: fix processing of delayed data refs during backref walking
Date: Thu, 27 Oct 2022 18:55:35 +0200	[thread overview]
Message-ID: <20221027165055.240745273@linuxfoundation.org> (raw)
In-Reply-To: <20221027165054.270676357@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

[ Upstream commit 4fc7b57228243d09c0d878873bf24fa64a90fa01 ]

When processing delayed data references during backref walking and we are
using a share context (we are being called through fiemap), whenever we
find a delayed data reference for an inode different from the one we are
interested in, then we immediately exit and consider the data extent as
shared. This is wrong, because:

1) This might be a DROP reference that will cancel out a reference in the
   extent tree;

2) Even if it's an ADD reference, it may be followed by a DROP reference
   that cancels it out.

In either case we should not exit immediately.

Fix this by never exiting when we find a delayed data reference for
another inode - instead add the reference and if it does not cancel out
other delayed reference, we will exit early when we call
extent_is_shared() after processing all delayed references. If we find
a drop reference, then signal the code that processes references from
the extent tree (add_inline_refs() and add_keyed_refs()) to not exit
immediately if it finds there a reference for another inode, since we
have delayed drop references that may cancel it out. In this later case
we exit once we don't have references in the rb trees that cancel out
each other and have two references for different inodes.

Example reproducer for case 1):

   $ cat test-1.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   xfs_io -f -c "pwrite 0 64K" $MNT/foo
   cp --reflink=always $MNT/foo $MNT/bar

   echo
   echo "fiemap after cloning:"
   xfs_io -c "fiemap -v" $MNT/foo

   rm -f $MNT/bar
   echo
   echo "fiemap after removing file bar:"
   xfs_io -c "fiemap -v" $MNT/foo

   umount $MNT

Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:

   $ ./test-1.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

Example reproducer for case 2):

   $ cat test-2.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   xfs_io -f -c "pwrite 0 64K" $MNT/foo
   cp --reflink=always $MNT/foo $MNT/bar

   # Flush delayed references to the extent tree and commit current
   # transaction.
   sync

   echo
   echo "fiemap after cloning:"
   xfs_io -c "fiemap -v" $MNT/foo

   rm -f $MNT/bar
   echo
   echo "fiemap after removing file bar:"
   xfs_io -c "fiemap -v" $MNT/foo

   umount $MNT

Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:

   $ ./test-2.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

After this patch, after deleting bar in both tests, the extent is not
reported with the 0x2000 flag anymore, it gets only the flag 0x1
(which is FIEMAP_EXTENT_LAST):

   $ ./test-1.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128   0x1

   $ ./test-2.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128   0x1

These tests will later be converted to a test case for fstests.

Fixes: dc046b10c8b7d4 ("Btrfs: make fiemap not blow when you have lots of snapshots")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/backref.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index baff31a147e7..7e8fac12f3f8 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -137,6 +137,7 @@ struct share_check {
 	u64 root_objectid;
 	u64 inum;
 	int share_count;
+	bool have_delayed_delete_refs;
 };
 
 static inline int extent_is_shared(struct share_check *sc)
@@ -881,13 +882,22 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
 			key.offset = ref->offset;
 
 			/*
-			 * Found a inum that doesn't match our known inum, we
-			 * know it's shared.
+			 * If we have a share check context and a reference for
+			 * another inode, we can't exit immediately. This is
+			 * because even if this is a BTRFS_ADD_DELAYED_REF
+			 * reference we may find next a BTRFS_DROP_DELAYED_REF
+			 * which cancels out this ADD reference.
+			 *
+			 * If this is a DROP reference and there was no previous
+			 * ADD reference, then we need to signal that when we
+			 * process references from the extent tree (through
+			 * add_inline_refs() and add_keyed_refs()), we should
+			 * not exit early if we find a reference for another
+			 * inode, because one of the delayed DROP references
+			 * may cancel that reference in the extent tree.
 			 */
-			if (sc && sc->inum && ref->objectid != sc->inum) {
-				ret = BACKREF_FOUND_SHARED;
-				goto out;
-			}
+			if (sc && count < 0)
+				sc->have_delayed_delete_refs = true;
 
 			ret = add_indirect_ref(fs_info, preftrees, ref->root,
 					       &key, 0, node->bytenr, count, sc,
@@ -917,7 +927,7 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
 	}
 	if (!ret)
 		ret = extent_is_shared(sc);
-out:
+
 	spin_unlock(&head->lock);
 	return ret;
 }
@@ -1020,7 +1030,8 @@ static int add_inline_refs(const struct btrfs_fs_info *fs_info,
 			key.type = BTRFS_EXTENT_DATA_KEY;
 			key.offset = btrfs_extent_data_ref_offset(leaf, dref);
 
-			if (sc && sc->inum && key.objectid != sc->inum) {
+			if (sc && sc->inum && key.objectid != sc->inum &&
+			    !sc->have_delayed_delete_refs) {
 				ret = BACKREF_FOUND_SHARED;
 				break;
 			}
@@ -1030,6 +1041,7 @@ static int add_inline_refs(const struct btrfs_fs_info *fs_info,
 			ret = add_indirect_ref(fs_info, preftrees, root,
 					       &key, 0, bytenr, count,
 					       sc, GFP_NOFS);
+
 			break;
 		}
 		default:
@@ -1119,7 +1131,8 @@ static int add_keyed_refs(struct btrfs_fs_info *fs_info,
 			key.type = BTRFS_EXTENT_DATA_KEY;
 			key.offset = btrfs_extent_data_ref_offset(leaf, dref);
 
-			if (sc && sc->inum && key.objectid != sc->inum) {
+			if (sc && sc->inum && key.objectid != sc->inum &&
+			    !sc->have_delayed_delete_refs) {
 				ret = BACKREF_FOUND_SHARED;
 				break;
 			}
@@ -1542,6 +1555,7 @@ int btrfs_check_shared(struct btrfs_root *root, u64 inum, u64 bytenr,
 		.root_objectid = root->root_key.objectid,
 		.inum = inum,
 		.share_count = 0,
+		.have_delayed_delete_refs = false,
 	};
 
 	ulist_init(roots);
@@ -1576,6 +1590,7 @@ int btrfs_check_shared(struct btrfs_root *root, u64 inum, u64 bytenr,
 			break;
 		bytenr = node->val;
 		shared.share_count = 0;
+		shared.have_delayed_delete_refs = false;
 		cond_resched();
 	}
 
-- 
2.35.1




  parent reply	other threads:[~2022-10-27 17:06 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-27 16:55 [PATCH 5.10 00/79] 5.10.151-rc1 review Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 01/79] ocfs2: clear dinode links count in case of error Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 02/79] ocfs2: fix BUG when iput after ocfs2_mknod fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 03/79] selinux: enable use of both GFP_KERNEL and GFP_ATOMIC in convert_context() Greg Kroah-Hartman
2022-10-27 16:55   ` Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 04/79] cpufreq: qcom: fix writes in read-only memory region Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 05/79] i2c: qcom-cci: Fix ordering of pm_runtime_xx and i2c_add_adapter Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 06/79] cpufreq: tegra194: Fix module loading Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 07/79] x86/microcode/AMD: Apply the patch early on every logical thread Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 08/79] hwmon/coretemp: Handle large core ID value Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 09/79] ata: ahci-imx: Fix MODULE_ALIAS Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 10/79] ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 11/79] cpufreq: qcom: fix memory leak in error path Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 12/79] kvm: Add support for arch compat vm ioctls Greg Kroah-Hartman
2022-10-30  9:54   ` Pavel Machek
2022-10-27 16:55 ` [PATCH 5.10 13/79] KVM: arm64: vgic: Fix exit condition in scan_its_table() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 14/79] media: mceusb: set timeout to at least timeout provided Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 15/79] media: venus: dec: Handle the case where find_format fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 16/79] bpf: Generate BTF_KIND_FLOAT when linking vmlinux Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 17/79] kbuild: Quote OBJCOPY var to avoid a pahole call break the build Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 18/79] kbuild: skip per-CPU BTF generation for pahole v1.18-v1.21 Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 19/79] kbuild: Unify options for BTF generation for vmlinux and modules Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 20/79] kbuild: Add skip_encoding_btf_enum64 option to pahole Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 21/79] block: wbt: Remove unnecessary invoking of wbt_update_limits in wbt_init Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 22/79] blk-wbt: call rq_qos_add() after wb_normal is initialized Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 23/79] arm64: errata: Remove AES hwcap for COMPAT tasks Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 24/79] r8152: add PID for the Lenovo OneLink+ Dock Greg Kroah-Hartman
2022-10-27 16:55 ` Greg Kroah-Hartman [this message]
2022-10-27 16:55 ` [PATCH 5.10 26/79] btrfs: fix processing of delayed tree block refs during backref walking Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 27/79] ACPI: extlog: Handle multiple records Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 28/79] tipc: Fix recognition of trial period Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 29/79] tipc: fix an information leak in tipc_topsrv_kern_subscr Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 30/79] i40e: Fix DMA mappings leak Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 31/79] HID: magicmouse: Do not set BTN_MOUSE on double report Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 32/79] sfc: Change VF mac via PF as first preference if available Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 33/79] net/atm: fix proc_mpc_write incorrect return value Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 34/79] net: phy: dp83867: Extend RX strap quirk for SGMII mode Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 35/79] tcp: Add num_closed_socks to struct sock_reuseport Greg Kroah-Hartman
2022-10-27 19:53   ` Kuniyuki Iwashima
2022-10-28  6:17     ` Greg KH
2022-10-28 17:05       ` Kuniyuki Iwashima
2022-10-29  6:27         ` Greg KH
2022-10-27 16:55 ` [PATCH 5.10 36/79] udp: Update reuse->has_conns under reuseport_lock Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 37/79] cifs: Fix xid leak in cifs_copy_file_range() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 38/79] cifs: Fix xid leak in cifs_flock() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 39/79] cifs: Fix xid leak in cifs_ses_add_channel() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 40/79] net: hsr: avoid possible NULL deref in skb_clone() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 41/79] ionic: catch NULL pointer issue on reconfig Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 42/79] nvme-hwmon: rework to avoid devm allocation Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 43/79] nvme-hwmon: Return error code when registration fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 44/79] nvme-hwmon: consistently ignore errors from nvme_hwmon_init Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 45/79] nvme-hwmon: kmalloc the NVME SMART log buffer Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 46/79] net: sched: cake: fix null pointer access issue when cake_init() fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 47/79] net: sched: delete duplicate cleanup of backlog and qlen Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 48/79] net: sched: sfb: fix null pointer access issue when sfb_init() fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.10 49/79] sfc: include vport_id in filter spec hash and equal() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 50/79] net: hns: fix possible memory leak in hnae_ae_register() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 51/79] net: sched: fix race condition in qdisc_graft() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 52/79] net: phy: dp83822: disable MDI crossover status change interrupt Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 53/79] iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 54/79] iommu/vt-d: Clean up si_domain in the init_dmars() error path Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 55/79] drm/virtio: Use appropriate atomic state in virtio_gpu_plane_cleanup_fb() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 56/79] dmaengine: mxs-dma: Remove the unused .id_table Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 57/79] dmaengine: mxs: use platform_driver_register Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 58/79] tracing: Simplify conditional compilation code in tracing_set_tracer() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 59/79] tracing: Do not free snapshot if tracer is on cmdline Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 60/79] xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guests Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 61/79] xen/gntdev: Accommodate VMA splitting Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 62/79] mmc: sdhci-tegra: Use actual clock rate for SW tuning correction Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 63/79] riscv: Add machine name to kernel boot log and stack dump output Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 64/79] riscv: always honor the CONFIG_CMDLINE_FORCE when parsing dtb Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 65/79] perf pmu: Validate raw event with sysfs exported format bits Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 66/79] perf: Skip and warn on unknown format configN attrs Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 67/79] fcntl: make F_GETOWN(EX) return 0 on dead owner task Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 68/79] fcntl: fix potential deadlocks for &fown_struct.lock Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 69/79] arm64: dts: qcom: sc7180-trogdor: Fixup modem memory region Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 70/79] arm64: topology: move store_cpu_topology() to shared code Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 71/79] riscv: topology: fix default topology reporting Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 72/79] perf/x86/intel/pt: Relax address filter validation Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 73/79] hv_netvsc: Fix race between VF offering and VF association message from host Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 74/79] [PATCH v3] ACPI: video: Force backlight native for more TongFang devices Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 75/79] x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 76/79] Makefile.debug: re-enable debug info for .S files Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 77/79] mmc: core: Add SD card quirk for broken discard Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 78/79] blk-wbt: fix that rwb->wc is always set to 1 in wbt_init() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.10 79/79] mm: /proc/pid/smaps_rollup: fix no vmas null-deref Greg Kroah-Hartman
2022-10-27 18:10 ` [PATCH 5.10 00/79] 5.10.151-rc1 review Guenter Roeck
2022-10-27 19:25   ` Greg Kroah-Hartman
2022-10-27 19:27     ` Pavel Machek
2022-10-27 19:39       ` Guenter Roeck
2022-10-27 19:54         ` Florian Fainelli
2022-10-27 19:49       ` Linus Torvalds
2022-10-28 11:01         ` Greg Kroah-Hartman
2022-10-28 10:47 ` Sudip Mukherjee (Codethink)
2022-10-28 10:58   ` Greg Kroah-Hartman
2022-10-28 11:58 ` Jon Hunter
2022-10-28 12:21 ` Pavel Machek
2022-10-28 13:59 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221027165055.240745273@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.