From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f48.google.com ([74.125.82.48]:55886 "EHLO
	mail-wm0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751682AbdKAOF6 (ORCPT );
	Wed, 1 Nov 2017 10:05:58 -0400
Received: by mail-wm0-f48.google.com with SMTP id y83so5124782wmc.4
	for ; Wed, 01 Nov 2017 07:05:57 -0700 (PDT)
Message-ID: <1509545153.1662.105.camel@gmail.com>
Subject: Re: Several questions regarding btrfs
From: ST
To: "Austin S. Hemmelgarn"
Cc: linux-btrfs@vger.kernel.org
Date: Wed, 01 Nov 2017 16:05:53 +0200
In-Reply-To:
References: <1509467017.1662.37.camel@gmail.com> <1509480384.1662.84.camel@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

> >>> 3. in my current ext4-based setup I have two servers where one syncs
> >>> the files of a certain dir to the other using lsyncd (which launches
> >>> rsync on inotify events). As far as I have understood, it is more
> >>> efficient to use btrfs send/receive (over ssh) than rsync (over ssh)
> >>> to sync two boxes. Do you think it would be possible to make lsyncd
> >>> use btrfs for syncing instead of rsync? I.e. can btrfs work with
> >>> inotify events? Did somebody try it already?
> >> BTRFS send/receive needs a read-only snapshot to send from. This means
> >> that triggering it on inotify events is liable to cause performance
> >> issues and possibly lose changes.
> >
> > Actually, triggering doesn't happen on each and every inotify event.
> > lsyncd has an option to define a time interval within which all inotify
> > events are accumulated and only then rsync is launched. It could be 5-10
> > seconds or more, which is quasi real-time sync. Do you still hold that
> > it will not work with BTRFS send/receive (i.e. keeping the previous
> > snapshot around and creating a new one)?
> Okay, I actually didn't know that. Depending on how lsyncd invokes
> rsync though (does it call out rsync with the exact paths or just on the
> whole directory?), it may still be less efficient to use BTRFS send/receive.

I assume on the whole directory, but I'm not sure...

> >>> 4. In the case when compression is used - what is the quota based on:
> >>> (a) the amount of GBs the data actually consumes on the hard drive in
> >>> its compressed state, or (b) the amount of GBs the data takes up
> >>> naturally in uncompressed form? I need to set quotas as in (b). Is it
> >>> possible? If not - should I file a feature request?
> >> I can't directly answer this as I don't know myself (I don't use
> >> quotas), but have two comments I would suggest you consider:
> >>
> >> 1. qgroups (the BTRFS quota implementation) cause scaling and
> >> performance issues. Unless you absolutely need quotas (unless you're a
> >> hosting company, or are dealing with users who don't listen and don't
> >> pay attention to disk usage, you usually do not need quotas), you're
> >> almost certainly better off disabling them for now, especially for a
> >> production system.
> >
> > Ok. I'll use more standard approaches. Which of the following commands
> > will work with BTRFS:
> >
> > https://debian-handbook.info/browse/stable/sect.quotas.html
> None, qgroups are the only option right now with BTRFS, and it's pretty
> likely to stay that way since the internals of the filesystem don't fit
> well within the semantics of the regular VFS quota API. However,
> provided you're not using huge numbers of reflinks and subvolumes, you
> should be fine using qgroups.
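Good to know. Just to check that I understand the mechanics, I imagine the
setup would look roughly like this (an untested sketch on my side; the mount
point, the per-user subvolume layout and the 25G limit are only placeholders):

    # enable quota tracking on the filesystem (assumed mounted at /srv/data)
    btrfs quota enable /srv/data

    # if each user's home is its own subvolume, it gets its own qgroup,
    # which can then be capped, e.g. at 25G of referenced data:
    btrfs qgroup limit 25G /srv/data/home/alice

    # inspect per-qgroup usage and limits:
    btrfs qgroup show -r /srv/data
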
I want to have 7 daily (or 7+4) read-only snapshots per user, for ca. 100
users. I don't expect users to invoke cp --reflink or take snapshots.

>
> However, it's important to know that if your users have shell access,
> they can bypass qgroups. Normal users can create subvolumes, and new
> subvolumes aren't added to an existing qgroup by default (and unless I'm
> mistaken, aren't constrained by the qgroup set on the parent subvolume),
> so simple shell access is enough to bypass quotas.

I have never done it before, but shouldn't it be possible to just whitelist
the commands users are allowed to use in the SSH config (and so block the
creation of subvolumes/cp --reflink)? I actually would have restricted users
to sftp if I knew how to let them change their passwords whenever they wish
to. As far as I know it is not possible with OpenSSH...

> >>
> >> 2. Compression and quotas cause issues regardless of how they interact.
> >> In case (a), the user has no way of knowing if a given file will fit
> >> under their quota until they try to create it. In case (b), actual disk
> >> usage (as reported by du) will not match up with what the quota says the
> >> user is using, which makes it harder for them to figure out what to
> >> delete to free up space. It's debatable which is a less objectionable
> >> situation for users, though most people I know tend to think in a way
> >> that the issue with (a) doesn't matter, but the issue with (b) does.
> >
> > I think both (a) and (b) should be possible and it should be up to the
> > sysadmin to choose what he prefers. The concerns about the (b) scenario
> > could probably be dealt with by some sort of --real-size option to the
> > du command, while by default it could keep the (a) behavior (which might
> > be emphasized with --compressed-size).
> Reporting anything but the compressed size by default in du would mean
> it doesn't behave as existing software expects it to. It's supposed to
> report actual disk usage (in contrast to the sum of file sizes), which
> means for example that a 1G sparse file with only 64k of data is
> supposed to be reported as being 64k by du.

Yes, it shouldn't be the default behavior, but an optional one...

> > Two more questions came to my mind: as I've mentioned above - I have two
> > boxes, one syncing to the other. No RAID involved. I want to scrub (or
> > scan - I don't know yet what the difference is...) the whole filesystem
> > once a month to look for bitrot. Questions:
> >
> > 1. is it a stable setup for production? Let's say I'll sync with rsync -
> > either in cron or in lsyncd?
> Reasonably, though depending on how much data and other environmental
> constraints, you may want to scrub a bit more frequently.
> > 2. should any data corruption be discovered - is there any way to heal
> > it using the copy from the other box over SSH?
> Provided you know which file is affected, yes, you can fix it by just
> copying the file back from the other system.

Ok, but there is no automatic fixing in such a case, right?
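If so, the monthly check I have in mind would be roughly the following (just
a sketch of my understanding, not tested; the mount point, hostname and file
path are placeholders):

    # monthly integrity check on each box; -B keeps it in the foreground
    btrfs scrub start -B /srv/data
    btrfs scrub status /srv/data     # summary, incl. checksum errors found

    # if an unrecoverable error is reported, the kernel log should name the
    # affected file; with no RAID there is no local second copy, so I would
    # copy it back by hand from the other box:
    scp otherbox:/srv/data/home/alice/file.dat /srv/data/home/alice/file.dat
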
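P.S. Coming back to point 3: the interval-based sync with send/receive, as I
understand it, would be something like this once per lsyncd interval (again
only a sketch, not tested; the names and paths are placeholders):

    # $prev is the snapshot taken on the previous run, remembered by the script
    prev=20171101-1400
    now=$(date +%Y%m%d-%H%M)

    # read-only snapshot of the live subvolume
    btrfs subvolume snapshot -r /srv/data/live /srv/data/.snaps/$now

    # first run: full send; afterwards only the delta against $prev is sent
    btrfs send -p /srv/data/.snaps/$prev /srv/data/.snaps/$now | \
        ssh otherbox btrfs receive /srv/backup/.snaps

Older snapshots on both sides could then be pruned, as long as the latest
common one is kept as the parent for the next run.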