* [GIT PULL] Btrfs updates for 6.9
From: David Sterba @ 2024-03-11 19:18 UTC (permalink / raw)
To: torvalds; +Cc: David Sterba, linux-btrfs, linux-kernel
Hi,
there are mostly stabilization, refactoring and cleanup changes. The rest are
minor performance optimizations due to caching or lock contention reduction and
a few notable fixes.
Please pull, thanks.
Performance improvements:
- minor speedup in logging when repeatedly allocated structure is preallocated
only once, improves latency and decreases lock contention
- minor throughput increase (+6%), reduced lock contention after clearing
delayed allocation bits, applies to several common workload types
- skip full quota rescan if a new relation is added in the same transaction
Fixes:
- zstd fix for inline compressed file in subpage mode, updated version from the
6.8 time
- proper qgroup inheritance ioctl parameter validation
- more fiemap followup fixes after reduced locking done in 6.8
- fix race when detecting delalloc ranges
Core changes:
- more debugging code
- added assertions for a very rare crash in raid56 calculation
- tree-checker dumps page state to give more insights into possible reference
counting issues
- add checksum calculation offloading sysfs knob, for now enabled under DEBUG
only to determine a good heuristic for deciding between offloaded and
synchronous calculation; the choice depends on various factors (block group
profile, device speed) and is not as clear as initially thought (checksum type)
- error handling improvements, added assertions
- more page to folio conversion (defrag, truncate), cached size and shift
- preparation for more fine grained locking of sectors in subpage mode
- cleanups and refactoring
- include cleanups, forward declarations
- pointer-to-structure helpers
- redundant argument removals
- removed unused code
- slab cache updates, last use of SLAB_MEM_SPREAD removed
----------------------------------------------------------------
The following changes since commit 90d35da658da8cff0d4ecbb5113f5fac9d00eb72:
Linux 6.8-rc7 (2024-03-03 13:02:52 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git tags/for-6.9-tag
for you to fetch changes up to 1cab1375ba6d5337a25acb346996106c12bb2dd0:
btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations (2024-03-05 18:14:19 +0100)
----------------------------------------------------------------
for-6.9-tag
----------------------------------------------------------------
Anand Jain (1):
btrfs: include device major and minor numbers in the device scan notice
Chengming Zhou (1):
btrfs: remove SLAB_MEM_SPREAD flag use
Colin Ian King (1):
btrfs: zlib: Fix spelling mistake "infalte" -> "inflate"
David Sterba (62):
btrfs: replace sb::s_blocksize by fs_info::sectorsize
btrfs: replace i_blocksize by fs_info::sectorsize
btrfs: remove unused included headers
btrfs: handle errors returned from unpin_extent_cache()
btrfs: return errors from unpin_extent_range()
btrfs: make btrfs_error_unpin_extent_range() return void
btrfs: handle directory and dentry mismatch in btrfs_may_delete()
btrfs: handle invalid range and start in merge_extent_mapping()
btrfs: handle block group lookup error when it's being removed
btrfs: handle root deletion lookup error in btrfs_del_root()
btrfs: handle invalid root reference found in btrfs_find_root()
btrfs: handle invalid root reference found in btrfs_init_root_free_objectid()
btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
btrfs: handle invalid extent item reference found in check_committed_ref()
btrfs: export: handle invalid inode or root reference in btrfs_get_parent()
btrfs: delayed-inode: drop pointless BUG_ON in __btrfs_remove_delayed_item()
btrfs: change BUG_ON to assertion when checking for delayed_node root
btrfs: defrag: change BUG_ON to assertion in btrfs_defrag_leaves()
btrfs: change BUG_ON to assertion in btrfs_read_roots()
btrfs: change BUG_ON to assertion when verifying lockdep class setup
btrfs: change BUG_ON to assertion when verifying root in btrfs_alloc_reserved_file_extent()
btrfs: change BUG_ON to assertion in reset_balance_state()
btrfs: unify handling of return values of btrfs_insert_empty_items()
btrfs: move transaction abort to the error site in btrfs_delete_free_space_tree()
btrfs: move transaction abort to the error site in btrfs_create_free_space_tree()
btrfs: move transaction abort to the error site btrfs_rebuild_free_space_tree()
btrfs: tests: allocate dummy fs_info and root in test_find_delalloc()
btrfs: add helpers to get inode from page/folio pointers
btrfs: add helpers to get fs_info from page/folio pointers
btrfs: add helper to get fs_info from struct inode pointer
btrfs: hoist fs_info out of loops in end_bbio_data_write and end_bbio_data_read
btrfs: add forward declarations and headers, part 1
btrfs: add forward declarations and headers, part 2
btrfs: add forward declarations and headers, part 3
btrfs: push errors up from add_async_extent()
btrfs: update comment and drop assertion in extent item lookup in find_parent_nodes()
btrfs: handle invalid extent item reference found in extent_from_logical()
btrfs: handle invalid extent item reference found in find_first_extent_item()
btrfs: handle invalid root reference found in may_destroy_subvol()
btrfs: send: handle unexpected data in header buffer in begin_cmd()
btrfs: send: handle unexpected inode in header process_recorded_refs()
btrfs: send: handle path ref underflow in header iterate_inode_ref()
btrfs: change BUG_ON to assertion in tree_move_down()
btrfs: change BUG_ONs to assertions in btrfs_qgroup_trace_subtree()
btrfs: delete pointless BUG_ON check on quota root in btrfs_qgroup_account_extent()
btrfs: delete pointless BUG_ONs on extent item size
btrfs: delete BUG_ON in btrfs_init_locked_inode()
btrfs: factor out validation of btrfs_ioctl_vol_args::name
btrfs: factor out validation of btrfs_ioctl_vol_args_v2::name
btrfs: move balance args conversion helpers to volumes.c
btrfs: open code btrfs_backref_iter_free()
btrfs: open code btrfs_backref_get_eb()
btrfs: uninline some static inline helpers from backref.h
btrfs: uninline btrfs_init_delayed_root()
btrfs: drop static inline specifiers from tree-mod-log.c
btrfs: uninline some static inline helpers from tree-log.h
btrfs: open code trivial btrfs_lru_cache_size()
btrfs: uninline some static inline helpers from delayed-ref.h
btrfs: handle transaction commit errors in flush_reservations()
btrfs: pass btrfs_device to btrfs_scratch_superblocks()
btrfs: merge btrfs_del_delalloc_inode() helpers
btrfs: pass a valid extent map cache pointer to __get_extent_map()
Filipe Manana (19):
btrfs: remove extent_map_tree forward declaration at extent_io.h
btrfs: document what the spinlock unused_bgs_lock protects
btrfs: add comment about list_is_singular() use at btrfs_delete_unused_bgs()
btrfs: preallocate temporary extent buffer for inode logging when needed
btrfs: stop passing root argument to btrfs_add_delalloc_inodes()
btrfs: stop passing root argument to __btrfs_del_delalloc_inode()
btrfs: assert root delalloc lock is held at __btrfs_del_delalloc_inode()
btrfs: rename btrfs_add_delalloc_inodes() to singular form
btrfs: reduce inode lock critical section when setting and clearing delalloc
btrfs: add lockdep assertion to remaining delalloc callbacks
btrfs: use assertion instead of BUG_ON when adding/removing to delalloc list
btrfs: remove do_list variable at btrfs_set_delalloc_extent()
btrfs: remove do_list variable at btrfs_clear_delalloc_extent()
btrfs: remove no longer used btrfs_transaction_in_commit()
btrfs: send: avoid duplicated search for last extent when sending hole
btrfs: avoid unnecessary ref initialization when freeing log tree block
btrfs: fix off-by-one chunk length calculation at contains_pending_extent()
btrfs: fix race when detecting delalloc ranges during fiemap
btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations
Goldwyn Rodrigues (1):
btrfs: page to folio conversion in btrfs_truncate_block()
Johannes Thumshirn (1):
btrfs: remove duplicate recording of physical address
Josef Bacik (1):
btrfs: WARN_ON_ONCE() in our leak detection code
Kunwu Chan (6):
btrfs: use KMEM_CACHE() to create btrfs_delayed_node cache
btrfs: use KMEM_CACHE() to create btrfs_ordered_extent cache
btrfs: use KMEM_CACHE() to create btrfs_trans_handle cache
btrfs: use KMEM_CACHE() to create btrfs_path cache
btrfs: use KMEM_CACHE() to create delayed ref caches
btrfs: use KMEM_CACHE() to create btrfs_free_space cache
Lijuan Li (2):
btrfs: mark __btrfs_add_free_space static
btrfs: mark btrfs_put_caching_control() static
Matthew Wilcox (Oracle) (3):
btrfs: add set_folio_extent_mapped() helper
btrfs: convert defrag_prepare_one_page() to use a folio
btrfs: use a folio array throughout the defrag process
Naohiro Aota (2):
btrfs: use READ/WRITE_ONCE for fs_devices->read_policy
btrfs: introduce offload_csum_mode to tweak checksum offloading behavior
Neal Gompa (1):
btrfs: sysfs: drop unnecessary double logical negation in acl_show()
Qu Wenruo (13):
btrfs: remove the pg_offset parameter from btrfs_get_extent()
btrfs: remove unused variable bio_offset from end_bbio_data_read()
btrfs: cache folio size and shift in extent_buffer
btrfs: zstd: fix and simplify the inline extent decompression (v2)
btrfs: raid56: extra debugging for raid6 syndrome generation
btrfs: unexport btrfs_subpage_start_writer() and btrfs_subpage_end_and_test_writer()
btrfs: subpage: make reader lock utilize bitmap
btrfs: subpage: make writer lock utilize bitmap
btrfs: compression: remove dead comments in btrfs_compress_heuristic()
btrfs: tree-checker: dump the page status if hit something wrong
btrfs: qgroup: always free reserved space for extent records
btrfs: qgroup: validate btrfs_qgroup_inherit parameter
btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent
fs/btrfs/accessors.c | 15 +-
fs/btrfs/accessors.h | 50 +----
fs/btrfs/acl.c | 1 -
fs/btrfs/acl.h | 11 ++
fs/btrfs/async-thread.c | 1 -
fs/btrfs/async-thread.h | 3 +
fs/btrfs/backref.c | 119 ++++++++++--
fs/btrfs/backref.h | 136 +++-----------
fs/btrfs/bio.c | 17 +-
fs/btrfs/bio.h | 2 +
fs/btrfs/block-group.c | 15 +-
fs/btrfs/block-group.h | 14 +-
fs/btrfs/block-rsv.c | 1 -
fs/btrfs/block-rsv.h | 7 +
fs/btrfs/btrfs_inode.h | 25 ++-
fs/btrfs/compression.c | 18 +-
fs/btrfs/compression.h | 12 +-
fs/btrfs/ctree.c | 10 +-
fs/btrfs/ctree.h | 28 ++-
fs/btrfs/defrag.c | 104 +++++------
fs/btrfs/defrag.h | 10 +
fs/btrfs/delalloc-space.c | 2 -
fs/btrfs/delalloc-space.h | 4 +
fs/btrfs/delayed-inode.c | 21 ++-
fs/btrfs/delayed-inode.h | 21 +--
fs/btrfs/delayed-ref.c | 85 +++++++--
fs/btrfs/delayed-ref.h | 82 ++-------
fs/btrfs/dev-replace.c | 5 +-
fs/btrfs/dev-replace.h | 4 +
fs/btrfs/dir-item.h | 6 +
fs/btrfs/disk-io.c | 30 ++-
fs/btrfs/disk-io.h | 20 +-
fs/btrfs/export.c | 12 +-
fs/btrfs/export.h | 4 +
fs/btrfs/extent-io-tree.c | 6 +-
fs/btrfs/extent-io-tree.h | 7 +
fs/btrfs/extent-tree.c | 51 ++++--
fs/btrfs/extent-tree.h | 10 +
fs/btrfs/extent_io.c | 387 +++++++++++++++++++++++++--------------
fs/btrfs/extent_io.h | 44 ++++-
fs/btrfs/extent_map.c | 23 ++-
fs/btrfs/extent_map.h | 8 +
fs/btrfs/file-item.c | 6 -
fs/btrfs/file-item.h | 13 ++
fs/btrfs/file.c | 43 +++--
fs/btrfs/file.h | 15 ++
fs/btrfs/free-space-cache.c | 12 +-
fs/btrfs/free-space-cache.h | 15 +-
fs/btrfs/free-space-tree.c | 56 +++---
fs/btrfs/free-space-tree.h | 6 +
fs/btrfs/fs.h | 59 +++++-
fs/btrfs/inode-item.c | 1 -
fs/btrfs/inode-item.h | 5 +-
fs/btrfs/inode.c | 238 +++++++++++++-----------
fs/btrfs/ioctl.c | 120 +++++++-----
fs/btrfs/ioctl.h | 9 +
fs/btrfs/locking.c | 3 +-
fs/btrfs/locking.h | 8 +-
fs/btrfs/lru_cache.h | 7 +-
fs/btrfs/lzo.c | 4 +-
fs/btrfs/messages.c | 2 -
fs/btrfs/misc.h | 2 +
fs/btrfs/ordered-data.c | 6 +-
fs/btrfs/ordered-data.h | 15 ++
fs/btrfs/orphan.c | 1 -
fs/btrfs/orphan.h | 5 +
fs/btrfs/print-tree.h | 3 +
fs/btrfs/props.c | 3 +-
fs/btrfs/props.h | 7 +-
fs/btrfs/qgroup.c | 148 +++++++++++++--
fs/btrfs/qgroup.h | 20 +-
fs/btrfs/raid-stripe-tree.c | 1 -
fs/btrfs/raid-stripe-tree.h | 5 +
fs/btrfs/raid56.c | 31 +++-
fs/btrfs/raid56.h | 9 +
fs/btrfs/rcu-string.h | 6 +
fs/btrfs/ref-verify.h | 9 +
fs/btrfs/reflink.c | 12 +-
fs/btrfs/reflink.h | 4 +-
fs/btrfs/relocation.c | 5 +-
fs/btrfs/relocation.h | 9 +
fs/btrfs/root-tree.c | 17 +-
fs/btrfs/root-tree.h | 10 +
fs/btrfs/scrub.c | 9 +-
fs/btrfs/scrub.h | 6 +
fs/btrfs/send.c | 64 ++++---
fs/btrfs/send.h | 8 +-
fs/btrfs/space-info.c | 1 -
fs/btrfs/space-info.h | 9 +
fs/btrfs/subpage.c | 74 ++++++--
fs/btrfs/subpage.h | 21 ++-
fs/btrfs/super.c | 9 +-
fs/btrfs/super.h | 7 +
fs/btrfs/sysfs.c | 53 +++++-
fs/btrfs/sysfs.h | 9 +
fs/btrfs/tests/extent-io-tests.c | 28 ++-
fs/btrfs/tests/inode-tests.c | 40 ++--
fs/btrfs/transaction.c | 19 +-
fs/btrfs/transaction.h | 18 +-
fs/btrfs/tree-checker.c | 8 +-
fs/btrfs/tree-checker.h | 2 +
fs/btrfs/tree-log.c | 141 ++++++++++----
fs/btrfs/tree-log.h | 49 ++---
fs/btrfs/tree-mod-log.c | 13 +-
fs/btrfs/tree-mod-log.h | 8 +-
fs/btrfs/ulist.c | 1 -
fs/btrfs/ulist.h | 1 +
fs/btrfs/uuid-tree.c | 3 +-
fs/btrfs/uuid-tree.h | 5 +
fs/btrfs/verity.c | 1 -
fs/btrfs/verity.h | 7 +
fs/btrfs/volumes.c | 98 +++++++---
fs/btrfs/volumes.h | 53 +++++-
fs/btrfs/xattr.h | 6 +-
fs/btrfs/zlib.c | 2 +-
fs/btrfs/zoned.c | 2 -
fs/btrfs/zoned.h | 15 ++
fs/btrfs/zstd.c | 75 +++-----
include/uapi/linux/btrfs.h | 1 +
119 files changed, 2131 insertions(+), 1116 deletions(-)
* Re: [GIT PULL] Btrfs updates for 6.9
From: pr-tracker-bot @ 2024-03-12 22:24 UTC (permalink / raw)
To: David Sterba; +Cc: torvalds, David Sterba, linux-btrfs, linux-kernel
The pull request you sent on Mon, 11 Mar 2024 20:18:45 +0100:
> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git tags/for-6.9-tag
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/43a7548e28a6df12a6170421d9d016c576010baa
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* [PATCH v5] btrfs: do not skip re-registration for the mounted device
From: Anand Jain @ 2024-03-18 4:43 UTC (permalink / raw)
To: linux-btrfs
Cc: Anand Jain, stable, Alex Romosan, CHECK_1234543212345, David Sterba
There are reports that since version 6.7 update-grub fails to find the
root device on systems without an initrd and with a single-device filesystem.
This looks like the device name changed in the output of
/proc/self/mountinfo:
6.5-rc5 working
18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...
6.7 not working:
17 1 0:15 / / rw,noatime - btrfs /dev/root ...
and "update-grub" shows this error:
/usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)
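To compare the two mountinfo lines above programmatically, the mount source can be pulled out of a line as the field that follows the filesystem type. The following is only a minimal userspace sketch: the helper name mountinfo_source() is hypothetical and is not part of grub-probe or the kernel. It relies on the documented mountinfo layout, where a " - " separator precedes the filesystem type and the mount source.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the mount source of one /proc/self/mountinfo
 * line into out. The source is the second field after the " - " separator
 * (the first one is the filesystem type).
 * Returns 0 on success, -1 if the line is malformed or out is too small.
 */
static int mountinfo_source(const char *line, char *out, size_t outlen)
{
	const char *sep = strstr(line, " - ");
	const char *src;
	size_t len;

	if (!sep || outlen == 0)
		return -1;

	/* Skip the separator, then skip the filesystem type field. */
	src = strchr(sep + 3, ' ');
	if (!src)
		return -1;
	src++;

	/* The source ends at the next space or at the end of the line. */
	len = strcspn(src, " \n");
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, src, len);
	out[len] = '\0';
	return 0;
}
```

Fed the two lines quoted above, this helper would yield /dev/sda8 for the 6.5-rc5 case and /dev/root for the 6.7 case, which is the difference the report is about.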
This looks like it's related to the device name, but grub-probe
recognizes the "/dev/root" path and tries to find the underlying device.
However there's a special case for some filesystems, for btrfs in
particular.
The generic root device detection heuristic is not done and it all
relies on reading the device info through a btrfs-specific ioctl. This ioctl
returns the device name as it was saved at the time of device scan (in
this case it's /dev/root).
The change in 6.7 for temp_fsid, which allows several single-device
filesystems to exist with the same fsid (and transparently generates a new
UUID at mount time), was to skip caching/registering such devices.
This also skipped mounted devices. One step of scanning is to check whether
the device name has changed and, if so, to update the cached value.
This broke grub-probe, as it always read the device /dev/root and
couldn't find it in the system. A temporary workaround is to create a
symlink, but this does not survive a reboot.
The right fix is to allow updating the device path of a mounted
filesystem even if it is a single-device one.
In the fix, check if the device's major:minor number matches that of the
cached device. If it does, then we can allow the scan to happen so that
device_list_add() can take care of updating the device path. The file
descriptor remains unchanged.
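The decision described above can be modelled outside the kernel. This is only an illustrative sketch under stated assumptions: struct cached_dev and skip_registration() are hypothetical userspace stand-ins, not the kernel's structures or the function added by the patch, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one cached device entry. */
struct cached_dev {
	unsigned int major;
	unsigned int minor;
	const char *path;	/* path recorded at scan time */
	bool mounted;		/* the owning filesystem is opened */
};

/*
 * Return true when the scanned device should not be registered.
 * single_non_seed models "num_devices == 1 and not a seed device".
 */
static bool skip_registration(const struct cached_dev *cache, size_t n,
			      unsigned int major, unsigned int minor,
			      const char *path, bool single_non_seed,
			      bool mount_arg_dev)
{
	for (size_t i = 0; i < n; i++) {
		const struct cached_dev *d = &cache[i];

		if (!d->mounted)
			continue;

		/* Same device number, different path: let the scan update it. */
		if (d->major == major && d->minor == minor &&
		    strcmp(d->path, path) != 0)
			return false;
	}

	/* Otherwise keep skipping unmounted single non-seed devices. */
	return !mount_arg_dev && single_non_seed;
}
```

With a cache holding a mounted 259:3 entry named /dev/root, scanning 259:3 as /dev/nvme0n1p3 is not skipped, so the path can be updated, while scanning an unrelated single-device filesystem is still skipped.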
This does not affect the temp_fsid feature; the UUID of the mounted
filesystem remains the same and the matching is based on device major:minor
which is unique per mounted filesystem.
This covers the case where the name of a device (which exists for all
mounted devices) changes, updating /dev/root to /dev/sdx. Any other
single-device filesystem that is not mounted is still skipped.
Note that if a system is booted and the initial mount is done on the
/dev/root device, this will be the cached name of the device. It changes
only after the command "btrfs device scan" is run, as that triggers the
rename.
The fix was verified by users whose systems were affected.
CC: stable@vger.kernel.org # 6.7+
Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
Tested-by: Alex Romosan <aromosan@gmail.com>
Tested-by: CHECK_1234543212345@protonmail.com
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v5:
Fix the linux-next build failure reported here:
https://lore.kernel.org/all/20240318091755.1d0f696f@canb.auug.org.au/
As the linux-next branch no longer has this commit,
I've sent out the entire patch again.
v4: (based on mainline master)
I removed CC: stable@vger.kernel.org # 6.7+ as this is still in the RFC stage.
I need this patch verified by the bug filer.
Use devt from bdev->bd_dev
Rebased on mainline kernel.org master branch
v3:
https://lore.kernel.org/linux-btrfs/e2add8d54fbbd813305ba014c11d21d297ad87d0.1709782041.git.anand.jain@oracle.com/T/#u
fs/btrfs/volumes.c | 58 +++++++++++++++++++++++++++++++++++++---------
1 file changed, 47 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a2d07fa3cfdf..813c1c66b2db 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1303,6 +1303,47 @@ int btrfs_forget_devices(dev_t devt)
 	return ret;
 }
 
+static bool btrfs_skip_registration(struct btrfs_super_block *disk_super,
+				    const char *path, dev_t devt,
+				    bool mount_arg_dev)
+{
+	struct btrfs_fs_devices *fs_devices;
+
+	/*
+	 * Do not skip device registration for mounted devices with matching
+	 * maj:min but different paths. Booting without initrd relies on
+	 * /dev/root initially, later replaced with the actual root device.
+	 * A successful scan ensures update-grub selects the correct device.
+	 */
+	list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+		struct btrfs_device *device;
+
+		mutex_lock(&fs_devices->device_list_mutex);
+
+		if (!fs_devices->opened) {
+			mutex_unlock(&fs_devices->device_list_mutex);
+			continue;
+		}
+
+		list_for_each_entry(device, &fs_devices->devices, dev_list) {
+			if ((device->devt == devt) &&
+			    strcmp(device->name->str, path)) {
+				mutex_unlock(&fs_devices->device_list_mutex);
+
+				/* Do not skip registration */
+				return false;
+			}
+		}
+		mutex_unlock(&fs_devices->device_list_mutex);
+	}
+
+	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
+	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING))
+		return true;
+
+	return false;
+}
+
 /*
  * Look for a btrfs signature on a device. This may be called out of the mount path
  * and we are not allowed to call set_blocksize during the scan. The superblock
@@ -1320,6 +1361,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	struct btrfs_device *device = NULL;
 	struct file *bdev_file;
 	u64 bytenr, bytenr_orig;
+	dev_t devt;
 	int ret;
 
 	lockdep_assert_held(&uuid_mutex);
@@ -1359,19 +1401,13 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 		goto error_bdev_put;
 	}
 
-	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
-	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) {
-		dev_t devt;
+	devt = file_bdev(bdev_file)->bd_dev;
+	if (btrfs_skip_registration(disk_super, path, devt, mount_arg_dev)) {
+		pr_debug("BTRFS: skip registering single non-seed device %s (%d:%d)\n",
+			  path, MAJOR(devt), MINOR(devt));
 
-		ret = lookup_bdev(path, &devt);
-		if (ret)
-			btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
-				   path, ret);
-		else
-			btrfs_free_stale_devices(devt, NULL);
+		btrfs_free_stale_devices(devt, NULL);
 
-	pr_debug("BTRFS: skip registering single non-seed device %s (%d:%d)\n",
-		path, MAJOR(devt), MINOR(devt));
 		device = NULL;
 		goto free_disk_super;
 	}
--
2.38.1
* Re: [PATCH v5] btrfs: do not skip re-registration for the mounted device
From: Alex Romosan @ 2024-03-18 10:53 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, stable, CHECK_1234543212345, David Sterba
confirming that update-grub works with v5 of the patch (applied
against current Linus tree HEAD). these are the relevant entries in
the log:
Btrfs loaded, debug=on, zoned=no, fsverity=no
BTRFS: device fsid 695aa7ac-862a-4de3-ae59-c96f784600a0 devid 1
transid 2037166 /dev/root (259:3) scanned by swapper/0 (1)
BTRFS info (device nvme0n1p3): first mount of filesystem
695aa7ac-862a-4de3-ae59-c96f784600a0
BTRFS info (device nvme0n1p3): using crc32c (crc32c-generic) checksum algorithm
BTRFS info (device nvme0n1p3): using free-space-tree
VFS: Mounted root (btrfs filesystem) readonly on device 0:20.
BTRFS info: devid 1 device path /dev/root changed to /dev/nvme0n1p3
scanned by (udev-worker) (278)
* Re: [PATCH v5] btrfs: do not skip re-registration for the mounted device
From: David Sterba @ 2024-03-18 18:09 UTC (permalink / raw)
To: Anand Jain
Cc: linux-btrfs, stable, Alex Romosan, CHECK_1234543212345, David Sterba
On Mon, Mar 18, 2024 at 10:13:13AM +0530, Anand Jain wrote:
> v5:
> Fix the linux-next build failure reported here:
> https://lore.kernel.org/all/20240318091755.1d0f696f@canb.auug.org.au/
> As the Linux-next branch no longer has the this commit,
> I've sent out the entire patch again.
Thanks, but this won't work. The code in v5 is against current master
with the bdev_handle -> file_bdev change, but the patch gets merged from
a branch that does not have it.
For backport to stable we'll need the v4 version while for merge to
master (in the next pull request) it's going to be v5.
I'll do a separate pull request with this change on the correct base and
we'll have to let the stable team know which patch to pick.