Subject: Re: Several questions regarding btrfs
From: "Austin S. Hemmelgarn"
To: ST, linux-btrfs@vger.kernel.org
Date: Tue, 31 Oct 2017 13:45:34 -0400

On 2017-10-31 12:23, ST wrote:
> Hello,
>
> I've recently learned about btrfs and am considering using it for my
> needs. I have several questions in this regard:
>
> I manage a dedicated server remotely and have a script of sorts that
> installs an OS from several images. There I can define partitions and
> their filesystems.
>
> 1. By default the script provides a small separate partition for /boot
> with ext3. Does it have any advantages, or can I simply have /boot
> within /, all on btrfs? (Note: the OS is Debian 9)

It depends on the boot loader. I think Debian 9's version of GRUB has no
issue with BTRFS, but see the response below to your question on
subvolumes for the one caveat.

> 2. As for /, I get roughly the following written to /etc/fstab:
>    UUID=blah_blah /dev/sda3 / btrfs ...
> So the top-level volume is populated after the initial installation
> with the main filesystem directory structure (/bin, /usr, /home,
> etc.). As per the btrfs wiki I would like the top-level volume to
> contain only subvolumes (at least the one mounted as /) and snapshots.
> I can make a snapshot of the top-level volume with the / structure,
> but how can I get rid of all the directories within the top-level
> volume and keep only the subvolume containing / (and later snapshots),
> unmount it, and then mount the snapshot that I took? rm -rf / is not a
> good idea...

There are three approaches to doing this: from a live environment, from
single user mode running with init=/bin/bash, or from systemd emergency
mode. Doing it from a live environment is much safer overall, even if it
does take a bit longer. I'm listing the last two methods here only for
completeness, and I very much suggest that you use the first (do it from
a live environment).

Regardless of which method you use, if you don't have a separate boot
partition, you will have to create a symlink called /boot outside the
subvolume, pointing at the boot directory inside the subvolume, or
change the boot loader to look at the new location for /boot.

From a live environment, it's pretty simple overall, though it's much
easier if your live environment matches your distribution:

1. Create the snapshot of the root, naming it whatever you want the
   subvolume to be called (I usually just call it root, SUSE and Ubuntu
   call it @, others may have different conventions).
2. Delete everything except the snapshot you just created. The safest
   way to do this is to explicitly list each individual top-level
   directory to delete.
3. Use `btrfs subvolume list` to figure out the subvolume ID of the
   subvolume you just created, and then set that as the default
   subvolume with `btrfs subvolume set-default SUBVOLID /path`. Once you
   do this, you will need to specify subvolid=5 in the mount options to
   get at the real top-level subvolume.
4. Reboot.
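As a rough sketch of those four steps (assuming the filesystem is on
/dev/sda3, the live environment mounts it at /mnt, and you go with the @
naming; adjust the directory list to whatever is actually in your top
level):

    # mount the real top-level subvolume of the filesystem
    mount -o subvolid=5 /dev/sda3 /mnt

    # step 1: snapshot the current top level into the new subvolume
    btrfs subvolume snapshot /mnt /mnt/@

    # step 2: delete the old top-level directories, listing them
    # explicitly (use whatever is actually there, and leave @ alone)
    cd /mnt
    rm -rf -- bin etc home lib lib64 media opt root run sbin srv tmp usr var
    rm -rf -- boot
    # no separate boot partition? point the boot loader at the new
    # location, for example with a top-level symlink:
    ln -s @/boot boot

    # step 3: find the subvolume ID of @ and make it the default
    # (replace <ID> with the ID shown for @ by the list command)
    btrfs subvolume list /mnt
    btrfs subvolume set-default <ID> /mnt

    # step 4
    reboot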
For single user mode (check further down for what to do with systemd;
also note that this may brick your system if you get it wrong):

1. When booting the system, stop the boot loader and add
   'init=/bin/bash' to the kernel command line before booting.
2. When you get a shell prompt, create the snapshot, just like above.
3. Run the following:
   'cd /path ; mkdir old_root ; pivot_root . old_root ; chroot . /bin/bash'
4. You're now running inside the new subvolume, and the old root
   filesystem is mounted at /old_root. From here, just follow steps 2 to
   4 from the live environment method.

For doing it from emergency mode, things are a bit more complicated:

1. Create the snapshot of the root, just like above.
2. Make sure the only services running are udev and systemd-journald.
3. Run `systemctl switch-root` with the path to the subvolume you just
   created.
4. You're now running inside the new root; systemd _may_ try to go all
   the way to a full boot now.
5. Mount the root filesystem somewhere, and follow steps 2 through 4 of
   the live environment method.

> 3. In my current ext4-based setup I have two servers, where one syncs
> the files of a certain dir to the other using lsyncd (which launches
> rsync on inotify events). As far as I have understood, it is more
> efficient to use btrfs send/receive (over ssh) than rsync (over ssh)
> to sync two boxes. Do you think it would be possible to make lsyncd
> use btrfs for syncing instead of rsync? I.e. can btrfs work with
> inotify events? Did somebody try it already?

BTRFS send/receive needs a read-only snapshot to send from. This means
that triggering it on inotify events is liable to cause performance
issues and possibly lose changes (contrary to popular belief, snapshot
creation is neither atomic nor free). It also means that if you want to
match rsync's performance in terms of network usage, you're going to
have to keep the previous snapshot around so you can do an incremental
send (which is also less efficient than rsync's file comparison, unless
rsync is checksumming files). Because of this, it would be pretty
complicated right now to get reliable lsyncd integration.

> Otherwise I can sync using btrfs send/receive from within cron every
> 10-15 minutes, but it seems less elegant.

When it comes to stuff like this, it's usually best to go for the
simplest solution that meets your requirements. Unless you need
real-time synchronization, inotify is overkill, and unless you need to
copy reflinks (you probably don't, as almost nothing uses them yet, and
absolutely nothing I know of depends on them), send/receive is overkill.

As a pretty simple example, we've got a couple of systems that have
near-line active backups set up. The data is stored on BTRFS, but we
just use a handful of parallel rsync invocations every 15 minutes to
keep the backup system in sync (because of what we do, we can afford to
lose 15 minutes of data). It's not 'elegant', but it's immediately
obvious to any seasoned sysadmin what it's doing, and it gets the job
done, syncing the data in question in at most a few minutes. Back when I
switched to using BTRFS, I considered using send/receive, but even
incremental send/receive still performed worse than rsync.
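For reference, here is roughly what an incremental send over ssh looks
like when run from cron. The paths, snapshot naming, and host name are
just placeholders, /data is assumed to be a subvolume, and the very
first run has to be a full send (no -p, since there is no parent yet):

    # take a new read-only snapshot (send requires read-only snapshots)
    prev=$(ls /data/.snapshots | tail -n 1)   # most recent existing snapshot
    now=$(date +%Y%m%d-%H%M%S)
    btrfs subvolume snapshot -r /data "/data/.snapshots/$now"

    # send only the difference against the previous snapshot and apply
    # it on the backup box
    btrfs send -p "/data/.snapshots/$prev" "/data/.snapshots/$now" \
        | ssh backuphost btrfs receive /backup/.snapshots

    # keep only the newest snapshot locally as the parent for the next
    # run (old snapshots on the backup box need pruning separately)
    btrfs subvolume delete "/data/.snapshots/$prev"

That is exactly the extra state I mentioned above: the previous snapshot
has to stick around on both ends, which is part of why a plain rsync
loop ends up being the simpler setup.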
> 4. In a case where compression is used, what is the quota based on:
> (a) the amount of GBs the data actually consumes on the hard drive in
> its compressed state, or (b) the amount of GBs the data naturally
> takes up in uncompressed form? I need to set quotas as in (b). Is it
> possible? If not, should I file a feature request?

I can't directly answer this as I don't know myself (I don't use
quotas), but I have two comments I would suggest you consider:

1. qgroups (the BTRFS quota implementation) cause scaling and
   performance issues. Unless you absolutely need quotas (unless you're
   a hosting company, or are dealing with users who don't listen and
   don't pay attention to disk usage, you usually do not need quotas),
   you're almost certainly better off disabling them for now, especially
   on a production system.
2. Compression and quotas cause issues no matter how they interact. In
   case (a), the user has no way of knowing whether a given file will
   fit under their quota until they try to create it. In case (b),
   actual disk usage (as reported by du) will not match up with what the
   quota says the user is using, which makes it harder for them to
   figure out what to delete to free up space. It's debatable which
   situation is less objectionable for users, though most people I know
   tend to think the issue with (a) doesn't matter, but the issue with
   (b) does.
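If you do decide to experiment with qgroups despite the above, this is
roughly how you would set a limit and compare the two accountings for
existing data. The paths and the 10G limit are made up, and compsize is
a separate tool, not part of btrfs-progs:

    # enable quotas on the filesystem and cap one subvolume at 10GiB
    btrfs quota enable /srv
    btrfs qgroup limit 10G /srv/home-user

    # what the quota accounting currently sees for each qgroup
    # (referenced and exclusive bytes)
    btrfs qgroup show /srv

    # on-disk (compressed) usage vs. uncompressed size of the same files
    compsize /srv/home-user

Comparing those two outputs will at least tell you which of the two
numbers the quota is actually tracking on your kernel version.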