* [PATCH] btrfs: fix mkfs/mount/check failures due to race with systemd-udevd scan
From: Anand Jain @ 2023-03-23  7:56 UTC
  To: linux-btrfs; +Cc: Anand Jain, Sherry Yang, kernel test robot

During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.

Two reports were received:

 . One with the btrfs/179 testcase, where the fsck command failed with
   the -EBUSY error; and

 . Another with the LTP pwritev03 testcase, where mkfs.vfs failed with
   the -EBUSY error when it tried to overwrite an old btrfs filesystem
   on the device.

In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.

Reproducing the problem has been difficult because there is only a very
small window during which these userspace threads can race to acquire
the exclusive device open. Even on the system where the problem was
observed, it took anywhere between 10 and 400 iterations to occur, and
the chances of reproducing it lessen with debug printk()s.

However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during the scan.
Furthermore, during the mount process, the superblock is re-read in the
following call chain:

  btrfs_mount_root
   btrfs_open_devices
    open_fs_devices
     btrfs_open_one_device
       btrfs_get_bdev_and_sb

So, to fix this issue, this patch removes the FMODE_EXCL flag from the scan
operation, and adds a comment.
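
For illustration only, the effect of the change can be sketched with a
small userspace model. This is not kernel code: struct toy_bdev,
toy_open(), toy_close() and toy_mkfs_during_scan() are hypothetical
names, and the model assumes only the behaviour described above, namely
that an exclusive open fails with -EBUSY while a different holder has
the device claimed, whereas a non-exclusive open never takes a claim.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block device's exclusive-claim state (hypothetical). */
struct toy_bdev {
	const void *holder;	/* current exclusive holder, NULL if unclaimed */
};

/*
 * Model of an open: an exclusive open fails with -EBUSY if a different
 * holder already claimed the device; a non-exclusive open always succeeds
 * and does not claim anything.
 */
static int toy_open(struct toy_bdev *bdev, bool excl, const void *holder)
{
	if (!excl)
		return 0;
	if (bdev->holder != NULL && bdev->holder != holder)
		return -EBUSY;
	bdev->holder = holder;
	return 0;
}

/* Model of a close: drop the claim only if this holder owns it. */
static void toy_close(struct toy_bdev *bdev, bool excl, const void *holder)
{
	if (excl && bdev->holder == holder)
		bdev->holder = NULL;
}

/*
 * Replay the losing interleaving: the scan opens the device first, then
 * mkfs attempts its exclusive open before the scan has closed the device.
 * Returns the result of the mkfs open (0 or -EBUSY).
 */
static int toy_mkfs_during_scan(bool scan_excl)
{
	static const char scan_holder[] = "scan";
	static const char mkfs_holder[] = "mkfs";
	struct toy_bdev bdev = { .holder = NULL };
	int ret;

	if (toy_open(&bdev, scan_excl, scan_holder))
		return -EBUSY;	/* cannot happen on a fresh device */

	ret = toy_open(&bdev, true, mkfs_holder);
	if (ret == 0)
		toy_close(&bdev, true, mkfs_holder);

	toy_close(&bdev, scan_excl, scan_holder);
	return ret;
}
```

In this model, toy_mkfs_during_scan(true) corresponds to the behaviour
before the patch and returns -EBUSY, while toy_mkfs_during_scan(false)
corresponds to the scan without FMODE_EXCL and returns 0.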

Reported-by: Sherry Yang <sherry.yang@oracle.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---

 This patch should be cc-ed to stable-5.15.y and stable-6.1.y. As for
 stable-5.10.y and stable-5.4.y, a conflict fix is necessary, which I
 will send separately.

 fs/btrfs/volumes.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 93bc45001e68..cc1871767c8c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
 	 * So, we need to add a special mount option to scan for
 	 * later supers, using BTRFS_SUPER_MIRROR_MAX instead
 	 */
-	flags |= FMODE_EXCL;
 
+	/*
+	 * Avoid using flags |= FMODE_EXCL here, as systemd-udevd may
+	 * initiate a device scan that races with the user's mount
+	 * or mkfs command, resulting in failure.
+	 * Since the device scan is solely for reading purposes, there is
+	 * no need for FMODE_EXCL. Additionally, the devices are read again
+	 * during the mount process. It is ok to get some inconsistent
+	 * values temporarily, as the device paths of the fsid are the only
+	 * required information for assembling the volume.
+	 */
 	bdev = blkdev_get_by_path(path, flags, holder);
 	if (IS_ERR(bdev))
 		return ERR_CAST(bdev);
-- 
2.38.1

