linux-kernel.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Stanislav Brabec <sbrabec@suse.cz>,
	linux-kernel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: loop subsystem corrupted after mounting multiple btrfs sub-volumes
Date: Fri, 26 Feb 2016 07:33:52 -0500
Message-ID: <56D04630.1020809@gmail.com>
In-Reply-To: <56CF5490.7040102@suse.cz>

I've added linux-btrfs to CC, since this should be documented there as a
known issue until it gets fixed (although I have no idea which side is at
fault).
On 2016-02-25 14:22, Stanislav Brabec wrote:
> While writing a test suite for util-linux[1], I experienced strange
> behavior of the loop device:
>
> When two loop devices refer to the same file, and two btrfs mounts are
> called on them, the second mount changes the loop device of the first,
> already mounted sub-volume. (Note that the current implementation of
> util-linux mount -oloop works exactly this way: it allocates a new
> loop device for each mount command, so this bug can be easily
> reproduced without losetup, just using "mount -oloop" or fstab.)
I'm not 100% certain, but I think this is an interaction between how
BTRFS handles multiple mounts of the same filesystem on a given system
and how mount(8) handles loop mounts.  AFAIUI, every mount of a given
BTRFS filesystem on a system is internally equivalent to a bind mount of
a single hidden mount of that filesystem.  This is what allows both
manual mounting of sub-volumes and mounting the FS multiple times in
general.
>
> /proc/self/mountinfo after first btrfs loop mount:
>
> 107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop0 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
>
> This line changes after the second btrfs loop mount to:
>
> 107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
>
> See the change of /dev/loop0 to /dev/loop1!
>
> Apparently it is not only a change in the proc file; it also corrupts
> the loop device subsystem, as I later observed severe problems on the
> affected system:
>
> - mount(2) returning 0 but doing nothing.
>
> - mount(8) entering an infinite loop while searching for a free loop
> device.
It seems odd that this would cause such a degree of inconsistency in the
kernel itself.  My guess, though, is that mount(8) sees that you're
trying to mount a file and unconditionally binds it to a new loop device
without checking whether any in-use loop device is already bound to that
file.  Then, when it calls mount(2), this somehow confuses the BTRFS
driver, probably because you've now mounted two filesystems with
effectively identical super-blocks.  BTRFS already has issues if
multiple filesystems have the same UUID, and I have no idea how it might
react to filesystems that appear identical but are on separate devices.
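
For what it's worth, the check I have in mind could look roughly like
the sketch below.  It is only an illustration under my own assumptions:
the function names are invented, it relies on "losetup -j FILE" printing
lines of the form "/dev/loopN: ... (FILE)", and it is not how mount(8)
currently behaves.  There is also an obvious race between the lookup and
the attach, which a real implementation would have to deal with.

```shell
#!/bin/sh
# Hypothetical helper: extract the first loop device name from
# "losetup -j FILE"-style output on stdin, e.g.
#   /dev/loop0: [0035]:1234 (/btrfs.img)
first_loop_device() {
    sed -n 's|^\(/dev/loop[0-9]*\):.*|\1|p' | head -n 1
}

# Hypothetical wrapper: reuse a loop device already attached to the
# file given as $1, otherwise attach a new one.  Needs root.
loop_for_file() {
    dev=$(losetup -j "$1" | first_loop_device)
    if [ -z "$dev" ]; then
        # -f picks the first free device, --show prints its name.
        dev=$(losetup -f --show "$1") || return 1
    fi
    printf '%s\n' "$dev"
}

# Usage (assumption: /btrfs.img from the reproducer below):
#   mount "$(loop_for_file /btrfs.img)" /mnt/1
```

Whether reusing the existing device is actually the right fix is a
separate question; the sketch just shows that the information needed for
the check is available from userspace.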
>
>
> Here is a main reproducer:
>
> =====================
> #!/bin/sh
> # Prepare the environment:
> /btrfs.sh
> mkdir -p /mnt/1 /mnt/2
> losetup /dev/loop0 /btrfs.img
> # Verify that nothing is mounted:
> cat /proc/self/mountinfo | grep /mnt
> mount /dev/loop0 /mnt/1
> echo "One file system should be mounted now."
> cat /proc/self/mountinfo | grep /mnt
> # Create another loop.
> losetup /dev/loop1 /btrfs.img
> echo "Going to mount second one."
> mount -osubvol=/ /dev/loop1 /mnt/2 2>&1
> echo "Two file system should be mounted now."
> cat /proc/self/mountinfo | grep /mnt
> echo "Strange. First mount changed its loop device!"
> umount /mnt/2
> echo "And now check, whether it remains changed after umount."
> cat /proc/self/mountinfo | grep /mnt
> umount /mnt/1
> losetup -d /dev/loop1
> losetup -d /dev/loop0
> rmdir /mnt/1 /mnt/2
> =====================
>
> And here is its output:
>
> One file system should be mounted now.
> 107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop0 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
> Going to mount second one.
> Two file system should be mounted now.
> 107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
> 108 59 0:59 / /mnt/2 rw,relatime shared:47 - btrfs /dev/loop1 rw,space_cache,subvolid=5,subvol=/
> Strange. First mount changed its loop device!
> And now check, whether it remains changed after umount.
> 107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
>
> It was actually reproduced on linux-4.4.1 on openSUSE Tumbleweed.
>
>
> Test image creator:
>
> ===== /btrfs.sh =====
> #!/bin/bash
> # bash is needed here: the script uses pushd/popd and {1..5} expansion.
> truncate -s 42M /btrfs.img
> mkfs.btrfs -f -d single -m single /btrfs.img >/dev/null
> mount -o loop /btrfs.img /mnt
> pushd . >/dev/null
> cd /mnt
> mkdir -p d0/dd0/ddd0
> cd ./d0/dd0/ddd0
> touch file{1..5}
> btrfs subvol create s1 >/dev/null
> cd ./s1
> touch file{1..5}
> mkdir bind-point
> mkdir -p d1/dd1/ddd1
> cd ./d1/dd1/ddd1
> btrfs subvol create s2 >/dev/null
> DEFAULT_SUBVOLID=$(btrfs inspect rootid s2)
> btrfs subvol set-default $DEFAULT_SUBVOLID . >/dev/null
> NON_DEFAULT_SUBVOLID=$(btrfs subvol list /mnt |
> while read dummy id rest ; do if test $id = $DEFAULT_SUBVOLID ; then
> continue ; fi ; echo $id ; done)
> cd ../../../..
> mkdir -p d2/dd2/ddd2
> cd ./d2/dd2/ddd2
> btrfs subvol create s3 >/dev/null
> mkdir -p s3/bind-mnt
> popd >/dev/null
> NON_DEFAULT_SUBVOL=d0/dd0/ddd0/d2/dd2/ddd2/s3
> umount /mnt
> =====================
>
> [1] http://marc.info/?l=util-linux-ng&m=145590643206663&w=2
>
