* [GIT PULL] Btrfs updates for 6.9
@ 2024-03-11 19:18 David Sterba
  2024-03-12 22:24 ` pr-tracker-bot
  0 siblings, 1 reply; 5+ messages in thread
From: David Sterba @ 2024-03-11 19:18 UTC (permalink / raw)
  To: torvalds; +Cc: David Sterba, linux-btrfs, linux-kernel

Hi,

there are mostly stabilization, refactoring and cleanup changes. The rest are
minor performance optimizations due to caching or lock contention reduction and
a few notable fixes.

Please pull, thanks.

Performance improvements:

- minor speedup in logging when repeatedly allocated structure is preallocated
  only once, improves latency and decreases lock contention

- minor throughput increase (+6%), reduced lock contention after clearing
  delayed allocation bits, applies to several common workload types

- skip full quota rescan if a new relation is added in the same transaction

Fixes:

- zstd fix for inline compressed file in subpage mode, an updated version of
  the fix from the 6.8 cycle

- proper qgroup inheritance ioctl parameter validation

- more fiemap followup fixes after reduced locking done in 6.8
  - fix race when detecting delalloc ranges

Core changes:

- more debugging code
  - added assertions for a very rare crash in raid56 calculation
  - tree-checker dumps page state to give more insights into possible reference
    counting issues

- add checksum calculation offloading sysfs knob, for now enabled under DEBUG
  only, to help determine a good heuristic for choosing between offloaded and
  synchronous checksumming; the choice depends on various factors (block group
  profile, device speed) and is not as clear as initially thought (checksum
  type)

- error handling improvements, added assertions

- more page to folio conversion (defrag, truncate), cached size and shift

- preparation for more fine grained locking of sectors in subpage mode

- cleanups and refactoring
  - include cleanups, forward declarations
  - pointer-to-structure helpers
  - redundant argument removals
  - removed unused code
  - slab cache updates, last use of SLAB_MEM_SPREAD removed

----------------------------------------------------------------
The following changes since commit 90d35da658da8cff0d4ecbb5113f5fac9d00eb72:

  Linux 6.8-rc7 (2024-03-03 13:02:52 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git tags/for-6.9-tag

for you to fetch changes up to 1cab1375ba6d5337a25acb346996106c12bb2dd0:

  btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations (2024-03-05 18:14:19 +0100)

----------------------------------------------------------------
for-6.9-tag

----------------------------------------------------------------
Anand Jain (1):
      btrfs: include device major and minor numbers in the device scan notice

Chengming Zhou (1):
      btrfs: remove SLAB_MEM_SPREAD flag use

Colin Ian King (1):
      btrfs: zlib: Fix spelling mistake "infalte" -> "inflate"

David Sterba (62):
      btrfs: replace sb::s_blocksize by fs_info::sectorsize
      btrfs: replace i_blocksize by fs_info::sectorsize
      btrfs: remove unused included headers
      btrfs: handle errors returned from unpin_extent_cache()
      btrfs: return errors from unpin_extent_range()
      btrfs: make btrfs_error_unpin_extent_range() return void
      btrfs: handle directory and dentry mismatch in btrfs_may_delete()
      btrfs: handle invalid range and start in merge_extent_mapping()
      btrfs: handle block group lookup error when it's being removed
      btrfs: handle root deletion lookup error in btrfs_del_root()
      btrfs: handle invalid root reference found in btrfs_find_root()
      btrfs: handle invalid root reference found in btrfs_init_root_free_objectid()
      btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
      btrfs: handle invalid extent item reference found in check_committed_ref()
      btrfs: export: handle invalid inode or root reference in btrfs_get_parent()
      btrfs: delayed-inode: drop pointless BUG_ON in __btrfs_remove_delayed_item()
      btrfs: change BUG_ON to assertion when checking for delayed_node root
      btrfs: defrag: change BUG_ON to assertion in btrfs_defrag_leaves()
      btrfs: change BUG_ON to assertion in btrfs_read_roots()
      btrfs: change BUG_ON to assertion when verifying lockdep class setup
      btrfs: change BUG_ON to assertion when verifying root in btrfs_alloc_reserved_file_extent()
      btrfs: change BUG_ON to assertion in reset_balance_state()
      btrfs: unify handling of return values of btrfs_insert_empty_items()
      btrfs: move transaction abort to the error site in btrfs_delete_free_space_tree()
      btrfs: move transaction abort to the error site in btrfs_create_free_space_tree()
      btrfs: move transaction abort to the error site btrfs_rebuild_free_space_tree()
      btrfs: tests: allocate dummy fs_info and root in test_find_delalloc()
      btrfs: add helpers to get inode from page/folio pointers
      btrfs: add helpers to get fs_info from page/folio pointers
      btrfs: add helper to get fs_info from struct inode pointer
      btrfs: hoist fs_info out of loops in end_bbio_data_write and end_bbio_data_read
      btrfs: add forward declarations and headers, part 1
      btrfs: add forward declarations and headers, part 2
      btrfs: add forward declarations and headers, part 3
      btrfs: push errors up from add_async_extent()
      btrfs: update comment and drop assertion in extent item lookup in find_parent_nodes()
      btrfs: handle invalid extent item reference found in extent_from_logical()
      btrfs: handle invalid extent item reference found in find_first_extent_item()
      btrfs: handle invalid root reference found in may_destroy_subvol()
      btrfs: send: handle unexpected data in header buffer in begin_cmd()
      btrfs: send: handle unexpected inode in header process_recorded_refs()
      btrfs: send: handle path ref underflow in header iterate_inode_ref()
      btrfs: change BUG_ON to assertion in tree_move_down()
      btrfs: change BUG_ONs to assertions in btrfs_qgroup_trace_subtree()
      btrfs: delete pointless BUG_ON check on quota root in btrfs_qgroup_account_extent()
      btrfs: delete pointless BUG_ONs on extent item size
      btrfs: delete BUG_ON in btrfs_init_locked_inode()
      btrfs: factor out validation of btrfs_ioctl_vol_args::name
      btrfs: factor out validation of btrfs_ioctl_vol_args_v2::name
      btrfs: move balance args conversion helpers to volumes.c
      btrfs: open code btrfs_backref_iter_free()
      btrfs: open code btrfs_backref_get_eb()
      btrfs: uninline some static inline helpers from backref.h
      btrfs: uninline btrfs_init_delayed_root()
      btrfs: drop static inline specifiers from tree-mod-log.c
      btrfs: uninline some static inline helpers from tree-log.h
      btrfs: open code trivial btrfs_lru_cache_size()
      btrfs: uninline some static inline helpers from delayed-ref.h
      btrfs: handle transaction commit errors in flush_reservations()
      btrfs: pass btrfs_device to btrfs_scratch_superblocks()
      btrfs: merge btrfs_del_delalloc_inode() helpers
      btrfs: pass a valid extent map cache pointer to __get_extent_map()

Filipe Manana (19):
      btrfs: remove extent_map_tree forward declaration at extent_io.h
      btrfs: document what the spinlock unused_bgs_lock protects
      btrfs: add comment about list_is_singular() use at btrfs_delete_unused_bgs()
      btrfs: preallocate temporary extent buffer for inode logging when needed
      btrfs: stop passing root argument to btrfs_add_delalloc_inodes()
      btrfs: stop passing root argument to __btrfs_del_delalloc_inode()
      btrfs: assert root delalloc lock is held at __btrfs_del_delalloc_inode()
      btrfs: rename btrfs_add_delalloc_inodes() to singular form
      btrfs: reduce inode lock critical section when setting and clearing delalloc
      btrfs: add lockdep assertion to remaining delalloc callbacks
      btrfs: use assertion instead of BUG_ON when adding/removing to delalloc list
      btrfs: remove do_list variable at btrfs_set_delalloc_extent()
      btrfs: remove do_list variable at btrfs_clear_delalloc_extent()
      btrfs: remove no longer used btrfs_transaction_in_commit()
      btrfs: send: avoid duplicated search for last extent when sending hole
      btrfs: avoid unnecessary ref initialization when freeing log tree block
      btrfs: fix off-by-one chunk length calculation at contains_pending_extent()
      btrfs: fix race when detecting delalloc ranges during fiemap
      btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations

Goldwyn Rodrigues (1):
      btrfs: page to folio conversion in btrfs_truncate_block()

Johannes Thumshirn (1):
      btrfs: remove duplicate recording of physical address

Josef Bacik (1):
      btrfs: WARN_ON_ONCE() in our leak detection code

Kunwu Chan (6):
      btrfs: use KMEM_CACHE() to create btrfs_delayed_node cache
      btrfs: use KMEM_CACHE() to create btrfs_ordered_extent cache
      btrfs: use KMEM_CACHE() to create btrfs_trans_handle cache
      btrfs: use KMEM_CACHE() to create btrfs_path cache
      btrfs: use KMEM_CACHE() to create delayed ref caches
      btrfs: use KMEM_CACHE() to create btrfs_free_space cache

Lijuan Li (2):
      btrfs: mark __btrfs_add_free_space static
      btrfs: mark btrfs_put_caching_control() static

Matthew Wilcox (Oracle) (3):
      btrfs: add set_folio_extent_mapped() helper
      btrfs: convert defrag_prepare_one_page() to use a folio
      btrfs: use a folio array throughout the defrag process

Naohiro Aota (2):
      btrfs: use READ/WRITE_ONCE for fs_devices->read_policy
      btrfs: introduce offload_csum_mode to tweak checksum offloading behavior

Neal Gompa (1):
      btrfs: sysfs: drop unnecessary double logical negation in acl_show()

Qu Wenruo (13):
      btrfs: remove the pg_offset parameter from btrfs_get_extent()
      btrfs: remove unused variable bio_offset from end_bbio_data_read()
      btrfs: cache folio size and shift in extent_buffer
      btrfs: zstd: fix and simplify the inline extent decompression (v2)
      btrfs: raid56: extra debugging for raid6 syndrome generation
      btrfs: unexport btrfs_subpage_start_writer() and btrfs_subpage_end_and_test_writer()
      btrfs: subpage: make reader lock utilize bitmap
      btrfs: subpage: make writer lock utilize bitmap
      btrfs: compression: remove dead comments in btrfs_compress_heuristic()
      btrfs: tree-checker: dump the page status if hit something wrong
      btrfs: qgroup: always free reserved space for extent records
      btrfs: qgroup: validate btrfs_qgroup_inherit parameter
      btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent

 fs/btrfs/accessors.c             |  15 +-
 fs/btrfs/accessors.h             |  50 +----
 fs/btrfs/acl.c                   |   1 -
 fs/btrfs/acl.h                   |  11 ++
 fs/btrfs/async-thread.c          |   1 -
 fs/btrfs/async-thread.h          |   3 +
 fs/btrfs/backref.c               | 119 ++++++++++--
 fs/btrfs/backref.h               | 136 +++-----------
 fs/btrfs/bio.c                   |  17 +-
 fs/btrfs/bio.h                   |   2 +
 fs/btrfs/block-group.c           |  15 +-
 fs/btrfs/block-group.h           |  14 +-
 fs/btrfs/block-rsv.c             |   1 -
 fs/btrfs/block-rsv.h             |   7 +
 fs/btrfs/btrfs_inode.h           |  25 ++-
 fs/btrfs/compression.c           |  18 +-
 fs/btrfs/compression.h           |  12 +-
 fs/btrfs/ctree.c                 |  10 +-
 fs/btrfs/ctree.h                 |  28 ++-
 fs/btrfs/defrag.c                | 104 +++++------
 fs/btrfs/defrag.h                |  10 +
 fs/btrfs/delalloc-space.c        |   2 -
 fs/btrfs/delalloc-space.h        |   4 +
 fs/btrfs/delayed-inode.c         |  21 ++-
 fs/btrfs/delayed-inode.h         |  21 +--
 fs/btrfs/delayed-ref.c           |  85 +++++++--
 fs/btrfs/delayed-ref.h           |  82 ++-------
 fs/btrfs/dev-replace.c           |   5 +-
 fs/btrfs/dev-replace.h           |   4 +
 fs/btrfs/dir-item.h              |   6 +
 fs/btrfs/disk-io.c               |  30 ++-
 fs/btrfs/disk-io.h               |  20 +-
 fs/btrfs/export.c                |  12 +-
 fs/btrfs/export.h                |   4 +
 fs/btrfs/extent-io-tree.c        |   6 +-
 fs/btrfs/extent-io-tree.h        |   7 +
 fs/btrfs/extent-tree.c           |  51 ++++--
 fs/btrfs/extent-tree.h           |  10 +
 fs/btrfs/extent_io.c             | 387 +++++++++++++++++++++++++--------------
 fs/btrfs/extent_io.h             |  44 ++++-
 fs/btrfs/extent_map.c            |  23 ++-
 fs/btrfs/extent_map.h            |   8 +
 fs/btrfs/file-item.c             |   6 -
 fs/btrfs/file-item.h             |  13 ++
 fs/btrfs/file.c                  |  43 +++--
 fs/btrfs/file.h                  |  15 ++
 fs/btrfs/free-space-cache.c      |  12 +-
 fs/btrfs/free-space-cache.h      |  15 +-
 fs/btrfs/free-space-tree.c       |  56 +++---
 fs/btrfs/free-space-tree.h       |   6 +
 fs/btrfs/fs.h                    |  59 +++++-
 fs/btrfs/inode-item.c            |   1 -
 fs/btrfs/inode-item.h            |   5 +-
 fs/btrfs/inode.c                 | 238 +++++++++++++-----------
 fs/btrfs/ioctl.c                 | 120 +++++++-----
 fs/btrfs/ioctl.h                 |   9 +
 fs/btrfs/locking.c               |   3 +-
 fs/btrfs/locking.h               |   8 +-
 fs/btrfs/lru_cache.h             |   7 +-
 fs/btrfs/lzo.c                   |   4 +-
 fs/btrfs/messages.c              |   2 -
 fs/btrfs/misc.h                  |   2 +
 fs/btrfs/ordered-data.c          |   6 +-
 fs/btrfs/ordered-data.h          |  15 ++
 fs/btrfs/orphan.c                |   1 -
 fs/btrfs/orphan.h                |   5 +
 fs/btrfs/print-tree.h            |   3 +
 fs/btrfs/props.c                 |   3 +-
 fs/btrfs/props.h                 |   7 +-
 fs/btrfs/qgroup.c                | 148 +++++++++++++--
 fs/btrfs/qgroup.h                |  20 +-
 fs/btrfs/raid-stripe-tree.c      |   1 -
 fs/btrfs/raid-stripe-tree.h      |   5 +
 fs/btrfs/raid56.c                |  31 +++-
 fs/btrfs/raid56.h                |   9 +
 fs/btrfs/rcu-string.h            |   6 +
 fs/btrfs/ref-verify.h            |   9 +
 fs/btrfs/reflink.c               |  12 +-
 fs/btrfs/reflink.h               |   4 +-
 fs/btrfs/relocation.c            |   5 +-
 fs/btrfs/relocation.h            |   9 +
 fs/btrfs/root-tree.c             |  17 +-
 fs/btrfs/root-tree.h             |  10 +
 fs/btrfs/scrub.c                 |   9 +-
 fs/btrfs/scrub.h                 |   6 +
 fs/btrfs/send.c                  |  64 ++++---
 fs/btrfs/send.h                  |   8 +-
 fs/btrfs/space-info.c            |   1 -
 fs/btrfs/space-info.h            |   9 +
 fs/btrfs/subpage.c               |  74 ++++++--
 fs/btrfs/subpage.h               |  21 ++-
 fs/btrfs/super.c                 |   9 +-
 fs/btrfs/super.h                 |   7 +
 fs/btrfs/sysfs.c                 |  53 +++++-
 fs/btrfs/sysfs.h                 |   9 +
 fs/btrfs/tests/extent-io-tests.c |  28 ++-
 fs/btrfs/tests/inode-tests.c     |  40 ++--
 fs/btrfs/transaction.c           |  19 +-
 fs/btrfs/transaction.h           |  18 +-
 fs/btrfs/tree-checker.c          |   8 +-
 fs/btrfs/tree-checker.h          |   2 +
 fs/btrfs/tree-log.c              | 141 ++++++++++----
 fs/btrfs/tree-log.h              |  49 ++---
 fs/btrfs/tree-mod-log.c          |  13 +-
 fs/btrfs/tree-mod-log.h          |   8 +-
 fs/btrfs/ulist.c                 |   1 -
 fs/btrfs/ulist.h                 |   1 +
 fs/btrfs/uuid-tree.c             |   3 +-
 fs/btrfs/uuid-tree.h             |   5 +
 fs/btrfs/verity.c                |   1 -
 fs/btrfs/verity.h                |   7 +
 fs/btrfs/volumes.c               |  98 +++++++---
 fs/btrfs/volumes.h               |  53 +++++-
 fs/btrfs/xattr.h                 |   6 +-
 fs/btrfs/zlib.c                  |   2 +-
 fs/btrfs/zoned.c                 |   2 -
 fs/btrfs/zoned.h                 |  15 ++
 fs/btrfs/zstd.c                  |  75 +++-----
 include/uapi/linux/btrfs.h       |   1 +
 119 files changed, 2131 insertions(+), 1116 deletions(-)


* Re: [GIT PULL] Btrfs updates for 6.9
  2024-03-11 19:18 [GIT PULL] Btrfs updates for 6.9 David Sterba
@ 2024-03-12 22:24 ` pr-tracker-bot
  2024-03-18  4:43   ` [PATCH v5] btrfs: do not skip re-registration for the mounted device Anand Jain
  0 siblings, 1 reply; 5+ messages in thread
From: pr-tracker-bot @ 2024-03-12 22:24 UTC (permalink / raw)
  To: David Sterba; +Cc: torvalds, David Sterba, linux-btrfs, linux-kernel

The pull request you sent on Mon, 11 Mar 2024 20:18:45 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git tags/for-6.9-tag

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/43a7548e28a6df12a6170421d9d016c576010baa

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* [PATCH v5] btrfs: do not skip re-registration for the mounted device
  2024-03-12 22:24 ` pr-tracker-bot
@ 2024-03-18  4:43   ` Anand Jain
  2024-03-18 10:53     ` Alex Romosan
  2024-03-18 18:09     ` David Sterba
  0 siblings, 2 replies; 5+ messages in thread
From: Anand Jain @ 2024-03-18  4:43 UTC (permalink / raw)
  To: linux-btrfs
  Cc: Anand Jain, stable, Alex Romosan, CHECK_1234543212345, David Sterba

There are reports that since version 6.7 update-grub fails to find the
device of the root on systems without initrd and on a single device.

This looks like the device name changed in the output of
/proc/self/mountinfo:

6.5-rc5 working

  18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...

6.7 not working:

  17 1 0:15 / / rw,noatime - btrfs /dev/root ...

and "update-grub" shows this error:

  /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)
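
For illustration only (not part of the patch): the device name in the
mountinfo lines above is the second field after the lone "-" separator. A
minimal userspace sketch in C, assuming the documented /proc/self/mountinfo
layout; mountinfo_source() and the two size constants are hypothetical names
chosen for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FSTYPE_MAX 64	/* hypothetical buffer size for the fs type */
#define SOURCE_MAX 256	/* hypothetical buffer size for the mount source */

/*
 * Extract the filesystem type and the mount source from one line of
 * /proc/self/mountinfo. The optional fields end at a lone "-"; the two
 * fields that follow are the filesystem type and the mount source.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int mountinfo_source(const char *line, char fstype[FSTYPE_MAX],
		     char source[SOURCE_MAX])
{
	const char *sep;

	assert(line && fstype && source);

	/* Spaces inside paths are escaped as \040, so " - " is unambiguous. */
	sep = strstr(line, " - ");
	if (!sep)
		return -1;

	/* Field widths are one less than the buffer sizes above. */
	if (sscanf(sep + 3, "%63s %255s", fstype, source) != 2)
		return -1;

	return 0;
}
```

Applied to the two lines quoted above, the sketch reports /dev/sda8 for the
working kernel and /dev/root for the failing one, which is the name that
grub-probe then cannot resolve.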

This looks like it's related to the device name, but grub-probe
recognizes the "/dev/root" path and tries to find the underlying device.
However there's a special case for some filesystems, for btrfs in
particular.

The generic root device detection heuristic is not done and it all
relies on reading the device infos by a btrfs specific ioctl. This ioctl
returns the device name as it was saved at the time of device scan (in
this case it's /dev/root).

The change in 6.7 for temp_fsid to allow several single device
filesystems to exist with the same fsid (and transparently generate a new
UUID at mount time) was to skip caching/registering such devices.

This also skipped mounted devices. One step of scanning is to check whether
the device name has changed, and if so to update the cached value.

This broke grub-probe as it always read the device /dev/root and
couldn't find it in the system. A temporary workaround is to create a
symlink but this does not survive reboot.

The right fix is to allow updating the device path of a mounted
filesystem even if this is a single device one.

In the fix, check if the device's major:minor number matches with the
cached device. If they do, then we can allow the scan to happen so that
device_list_add() can take care of updating the device path. The file
descriptor remains unchanged.

This does not affect the temp_fsid feature, the UUID of the mounted
filesystem remains the same and the matching is based on device major:minor
which is unique per mounted filesystem.

This covers the case where the name of a device (one exists for every
mounted filesystem) changes, updating /dev/root to /dev/sdx. Any other
single device filesystem that is not mounted is still skipped.
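
For illustration only: the decision described above can be modelled in
userspace as follows. This is a simplified sketch, not the kernel code;
struct cached_dev and skip_registration() are hypothetical stand-ins for the
kernel's device list and the helper added by the patch, and locking is left
out entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one cached device entry. */
struct cached_dev {
	unsigned int dev_major;
	unsigned int dev_minor;
	const char *path;	/* name recorded at scan time */
	bool mounted;		/* the owning filesystem is opened */
};

/*
 * Return true when a scanned device should NOT be registered.
 * A mounted device with the same major:minor but a different cached path
 * is never skipped, so that its path can be refreshed by the scan.
 */
bool skip_registration(const struct cached_dev *cache, size_t n,
		       unsigned int dev_major, unsigned int dev_minor,
		       const char *path, bool mount_arg_dev,
		       unsigned int num_devices, bool seeding)
{
	size_t i;

	assert(path != NULL);
	assert(cache != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!cache[i].mounted)
			continue;
		if (cache[i].dev_major == dev_major &&
		    cache[i].dev_minor == dev_minor &&
		    strcmp(cache[i].path, path) != 0)
			return false;	/* do not skip: path changed */
	}

	/* Otherwise keep skipping unmounted single non-seed devices. */
	return !mount_arg_dev && num_devices == 1 && !seeding;
}
```

With a cache holding a mounted 259:3 device named /dev/root, a scan of
/dev/nvme0n1p3 with the same 259:3 is not skipped in this model, while a scan
of an unrelated, unmounted single device still is.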

Note that if a system is booted and the initial mount is done on the
/dev/root device, this will be the cached name of the device. It changes
only after the command "btrfs device scan" is run, as that triggers the
rename.

The fix was verified by users whose systems were affected.

CC: stable@vger.kernel.org # 6.7+
Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
Tested-by: Alex Romosan <aromosan@gmail.com>
Tested-by: CHECK_1234543212345@protonmail.com
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v5:
Fix the linux-next build failure reported here:
  https://lore.kernel.org/all/20240318091755.1d0f696f@canb.auug.org.au/
As the Linux-next branch no longer has this commit,
I've sent out the entire patch again.

v4: (based on mainline master)
I removed CC: stable@vger.kernel.org # 6.7+ as this is still in the RFC stage.
I need this patch verified by the bug filer.
Use devt from bdev->bd_dev
Rebased on mainline kernel.org master branch

v3:
https://lore.kernel.org/linux-btrfs/e2add8d54fbbd813305ba014c11d21d297ad87d0.1709782041.git.anand.jain@oracle.com/T/#u

 fs/btrfs/volumes.c | 58 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a2d07fa3cfdf..813c1c66b2db 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1303,6 +1303,47 @@ int btrfs_forget_devices(dev_t devt)
 	return ret;
 }
 
+static bool btrfs_skip_registration(struct btrfs_super_block *disk_super,
+				    const char *path, dev_t devt,
+				    bool mount_arg_dev)
+{
+	struct btrfs_fs_devices *fs_devices;
+
+	/*
+	 * Do not skip device registration for mounted devices with matching
+	 * maj:min but different paths. Booting without initrd relies on
+	 * /dev/root initially, later replaced with the actual root device.
+	 * A successful scan ensures update-grub selects the correct device.
+	 */
+	list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+		struct btrfs_device *device;
+
+		mutex_lock(&fs_devices->device_list_mutex);
+
+		if (!fs_devices->opened) {
+			mutex_unlock(&fs_devices->device_list_mutex);
+			continue;
+		}
+
+		list_for_each_entry(device, &fs_devices->devices, dev_list) {
+			if ((device->devt == devt) &&
+			    strcmp(device->name->str, path)) {
+				mutex_unlock(&fs_devices->device_list_mutex);
+
+				/* Do not skip registration */
+				return false;
+			}
+		}
+		mutex_unlock(&fs_devices->device_list_mutex);
+	}
+
+	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
+	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING))
+		return true;
+
+	return false;
+}
+
 /*
  * Look for a btrfs signature on a device. This may be called out of the mount path
  * and we are not allowed to call set_blocksize during the scan. The superblock
@@ -1320,6 +1361,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	struct btrfs_device *device = NULL;
 	struct file *bdev_file;
 	u64 bytenr, bytenr_orig;
+	dev_t devt;
 	int ret;
 
 	lockdep_assert_held(&uuid_mutex);
@@ -1359,19 +1401,13 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 		goto error_bdev_put;
 	}
 
-	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
-	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) {
-		dev_t devt;
+	devt = file_bdev(bdev_file)->bd_dev;
+	if (btrfs_skip_registration(disk_super, path, devt, mount_arg_dev)) {
+	pr_debug("BTRFS: skip registering single non-seed device %s (%d:%d)\n",
+			  path, MAJOR(devt), MINOR(devt));
 
-		ret = lookup_bdev(path, &devt);
-		if (ret)
-			btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
-				   path, ret);
-		else
-			btrfs_free_stale_devices(devt, NULL);
+		btrfs_free_stale_devices(devt, NULL);
 
-	pr_debug("BTRFS: skip registering single non-seed device %s (%d:%d)\n",
-			path, MAJOR(devt), MINOR(devt));
 		device = NULL;
 		goto free_disk_super;
 	}
-- 
2.38.1



* Re: [PATCH v5] btrfs: do not skip re-registration for the mounted device
  2024-03-18  4:43   ` [PATCH v5] btrfs: do not skip re-registration for the mounted device Anand Jain
@ 2024-03-18 10:53     ` Alex Romosan
  2024-03-18 18:09     ` David Sterba
  1 sibling, 0 replies; 5+ messages in thread
From: Alex Romosan @ 2024-03-18 10:53 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, stable, CHECK_1234543212345, David Sterba

confirming that update-grub works with v5 of the patch (applied
against current Linus tree HEAD). these are the relevant entries in
the log:

Btrfs loaded, debug=on, zoned=no, fsverity=no
BTRFS: device fsid 695aa7ac-862a-4de3-ae59-c96f784600a0 devid 1 transid 2037166 /dev/root (259:3) scanned by swapper/0 (1)
BTRFS info (device nvme0n1p3): first mount of filesystem 695aa7ac-862a-4de3-ae59-c96f784600a0
BTRFS info (device nvme0n1p3): using crc32c (crc32c-generic) checksum algorithm
BTRFS info (device nvme0n1p3): using free-space-tree
VFS: Mounted root (btrfs filesystem) readonly on device 0:20.
BTRFS info: devid 1 device path /dev/root changed to /dev/nvme0n1p3 scanned by (udev-worker) (278)




* Re: [PATCH v5] btrfs: do not skip re-registration for the mounted device
  2024-03-18  4:43   ` [PATCH v5] btrfs: do not skip re-registration for the mounted device Anand Jain
  2024-03-18 10:53     ` Alex Romosan
@ 2024-03-18 18:09     ` David Sterba
  1 sibling, 0 replies; 5+ messages in thread
From: David Sterba @ 2024-03-18 18:09 UTC (permalink / raw)
  To: Anand Jain
  Cc: linux-btrfs, stable, Alex Romosan, CHECK_1234543212345, David Sterba

On Mon, Mar 18, 2024 at 10:13:13AM +0530, Anand Jain wrote:
> There are reports that since version 6.7 update-grub fails to find the
> device of the root on systems without initrd and on a single device.
> 
> This looks like the device name changed in the output of
> /proc/self/mountinfo:
> 
> 6.5-rc5 working
> 
>   18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...
> 
> 6.7 not working:
> 
>   17 1 0:15 / / rw,noatime - btrfs /dev/root ...
> 
> and "update-grub" shows this error:
> 
>   /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)
> 
> This looks like it's related to the device name, but grub-probe
> recognizes the "/dev/root" path and tries to find the underlying device.
> However there's a special case for some filesystems, for btrfs in
> particular.
> 
> The generic root device detection heuristic is not done and it all
> relies on reading the device infos by a btrfs specific ioctl. This ioctl
> returns the device name as it was saved at the time of device scan (in
> this case it's /dev/root).
> 
> The change in 6.7 for temp_fsid to allow several single device
> filesystems to exist with the same fsid (and transparently generate a new
> UUID at mount time) was to skip caching/registering such devices.
> 
> This also skipped the mounted device. One step of scanning is to check if
> the device name has changed, and if so update the cached value.
> 
> This broke the grub-probe as it always read the device /dev/root and
> couldn't find it in the system. A temporary workaround is to create a
> symlink but this does not survive reboot.
> 
> The right fix is to allow updating the device path of a mounted
> filesystem even if this is a single device one.
> 
> In the fix, check if the device's major:minor number matches with the
> cached device. If they do, then we can allow the scan to happen so that
> device_list_add() can take care of updating the device path. The file
> descriptor remains unchanged.
> 
> This does not affect the temp_fsid feature, the UUID of the mounted
> filesystem remains the same and the matching is based on device major:minor
> which is unique per mounted filesystem.
> 
> This covers the path when the name of the device (that exists for all
> mounted devices) changes, updating /dev/root to /dev/sdx. Any other single
> device filesystem that is not mounted is still skipped.
> 
> Note that if a system is booted and the initial mount is done on the
> /dev/root device, this will be the cached name of the device. It will
> change only after the command "btrfs device scan" is run, as that
> triggers the rename.
> 
> The fix was verified by users whose systems were affected.
> 
> CC: stable@vger.kernel.org # 6.7+
> Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem")
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
> Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
> Tested-by: Alex Romosan <aromosan@gmail.com>
> Tested-by: CHECK_1234543212345@protonmail.com
> Reviewed-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v5:
> Fix the linux-next build failure reported here:
>   https://lore.kernel.org/all/20240318091755.1d0f696f@canb.auug.org.au/
> As the linux-next branch no longer has this commit,
> I've sent out the entire patch again.

Thanks but this won't work. The code in v5 is against current master
with the bdev_handle -> file_bdev change but the patch gets merged from
a branch that does not have that.

For backport to stable we'll need the v4 version while for merge to
master (in the next pull request) it's going to be v5.

I'll do a separate pull request with this change on the correct base and
we'll have to let the stable team know which patch to pick.
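
For readers following the quoted commit message, the decision it describes
(do not skip the scan when the major:minor matches a cached device whose
name differs, otherwise skip single non-seed devices that are not the mount
argument) can be modelled in plain userspace C. This is only a sketch under
stated assumptions: struct cached_dev and skip_registration() are
hypothetical stand-ins, not the kernel's structures or the patch's
btrfs_skip_registration(), and the locking and the per-filesystem check
that precedes the loop in the patch are left out.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one cached device entry (not struct btrfs_device). */
struct cached_dev {
    unsigned int major;
    unsigned int minor;
    const char *name;   /* path recorded at the time of the last scan */
};

/*
 * Return true if registration of the scanned device should be skipped.
 *
 * Assumption: the cache is a flat array; the kernel walks per-filesystem
 * device lists under device_list_mutex instead.
 */
static bool skip_registration(const struct cached_dev *cache, size_t n,
                              unsigned int major, unsigned int minor,
                              const char *path, bool mount_arg_dev,
                              unsigned long long num_devices, bool seeding)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Same major:minor but a different name: let the scan update the path. */
        if (cache[i].major == major && cache[i].minor == minor &&
            strcmp(cache[i].name, path) != 0)
            return false;
    }

    /* Single non-seed device that is not the mount argument: skip it. */
    if (!mount_arg_dev && num_devices == 1 && !seeding)
        return true;

    return false;
}
```

With a cache holding 8:8 as "/dev/root", a scan of "/dev/sda8" at 8:8 is not
skipped, so the cached name can be updated, while an unrelated unmounted
single device is still skipped.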

