Date: Mon, 22 Oct 2012 18:18:09 +0100
From: Hugo Mills
To: Chris Murphy
Cc: "linux-btrfs@vger.kernel.org"
Subject: Re: device delete, error removing device
Message-ID: <20121022171809.GA25498@carfax.org.uk>

On Mon, Oct 22, 2012 at 10:42:18AM -0600, Chris Murphy wrote:
> Thanks for the response, Hugo.
>
> On Oct 22, 2012, at 3:19 AM, Hugo Mills wrote:
>
> >    I'm not entirely sure what's going on here(*), but it looks like
> > an awkward interaction between the unequal sizes of the devices, the
> > fact that three of them are very small, and the RAID-0/RAID-1 on
> > data/metadata respectively.
>
> I'm fine accepting that the devices are very small and that the
> original filesystem was packed completely full -- to the point that
> this is effectively sabotage.
>
> The idea was merely to test how a full volume (I was aiming for 90%,
> not 97% -- oops) handles being migrated to a replacement disk, which
> for a typical user would be larger, not the same size, knowing in
> advance that not all of the space on the new disk is usable. And I was
> doing it at a one-order-of-magnitude reduced scale for space
> considerations.
> >    You can't relocate any of the data chunks, because RAID-0
> > requires at least two chunks, and all your data chunks are more than
> > 50% full, so it can't put one 0.55 GiB chunk on the big disk and one
> > 0.55 GiB chunk on the remaining space on the small disk, which is
> > the only way it could proceed.
>
> Interesting. So the way "device delete" moves extents is not at all
> similar to how LVM pvmove moves extents, which is unidirectional (away
> from the device being demoted). My (seemingly flawed) expectation was
> that "device delete" would cause extents on the deleted device to be
> moved to the newly added disk.

   It's more like a balance, which moves everything that has some (part
of its) existence on the removed device. So when you have RAID-0 or
RAID-1 data, all of the related chunks on other disks get moved too (so
in RAID-1, it's the mirror chunk as well as the chunk on the removed
disk that gets rewritten).

> If I add yet another 12 GB virtual disk, sdf, and then attempt a
> delete, it works, with no errors. Result:
>
> [root@f18v ~]# btrfs device delete /dev/sdb /mnt
> [root@f18v ~]# btrfs fi show
> failed to read /dev/sr0
> Label: none  uuid: 6e96a96e-3357-4f23-b064-0f0713366d45
>         Total devices 5 FS bytes used 7.52GB
>         devid    5 size 12.00GB used 4.17GB path /dev/sdf
>         devid    4 size 12.00GB used 4.62GB path /dev/sde
>         devid    3 size 3.00GB used 2.68GB path /dev/sdd
>         devid    2 size 3.00GB used 2.68GB path /dev/sdc
>         *** Some devices missing
>
> However, I think that last line is a bug. When I run
>
> [root@f18v ~]# btrfs device delete missing /mnt
>
> I get
>
> [ 2152.257163] btrfs: no missing devices found to remove
>
> So they're missing but not missing?

   If you run sync, or wait for about 30 seconds, you'll find that fi
show gives the correct information again -- btrfs fi show reads the
superblocks directly from disk, and if you run it immediately after the
dev del, the updated superblocks haven't been flushed back yet.
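   The sequence above can be sketched as a few commands -- a sketch
only, reusing the device and mount-point names from Chris's transcript
(they will differ on another system), and needing root on a mounted
multi-device btrfs filesystem:

```shell
# Remove a device from the mounted filesystem; this triggers a
# balance-like relocation of every chunk that touches /dev/sdb.
btrfs device delete /dev/sdb /mnt

# Run immediately afterwards, "btrfs filesystem show" may still report
# "Some devices missing": it reads the superblocks directly from disk,
# and they haven't been rewritten yet.
btrfs filesystem show

# Flush the updated superblocks (or wait ~30 s for the next transaction
# commit), then re-read them:
sync
btrfs filesystem show
```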
> >    btrfs balance start -dconvert=single /mountpoint
>
> Yeah, that's perhaps a better starting point for many regular-Joe
> users setting up a multiple-device btrfs volume, in particular where
> different-sized disks can be anticipated.

   I think we should probably default to single, not RAID-0, on
multi-device filesystems, as this kind of problem bites a lot of
people, particularly when trying to drop the second disk in a pair.

   In a similar vein, I'd suggest that an automatic downgrade from
RAID-1 to DUP metadata on removing one device from a two-device array
should also be done, but I suspect there are some good reasons, which
I've not thought of, for not doing that. This has also bitten a lot of
people in the past.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- "There's more than one way to do it" is not a commandment. It ---
                              is a dire warning.