Subject: Re: btrfs space used issue
From: "Austin S. Hemmelgarn"
To: linux-btrfs@vger.kernel.org
Date: Wed, 28 Feb 2018 14:24:40 -0500
Message-ID: <2892a866-fdc3-b337-4cd4-2cd4a18b9f21@gmail.com>

On 2018-02-28 14:09, Duncan wrote:
> vinayak hegde posted on Tue, 27 Feb 2018 18:39:51 +0530 as excerpted:
>
>> I am using btrfs, but I am seeing du -sh and df -h showing a huge size
>> difference on ssd.
>>
>> mount:
>> /dev/drbd1 on /dc/fileunifier.datacache type btrfs
>> (rw,noatime,nodiratime,flushoncommit,discard,nospace_cache,recovery,commit=5,subvolid=5,subvol=/)
>>
>> du -sh /dc/fileunifier.datacache/  ->  331G
>>
>> df -h  /dev/drbd1  746G  346G  398G  47%  /dc/fileunifier.datacache
>>
>> btrfs fi usage /dc/fileunifier.datacache/
>> Overall:
>>     Device size:           745.19GiB
>>     Device allocated:      368.06GiB
>>     Device unallocated:    377.13GiB
>>     Device missing:            0.00B
>>     Used:                  346.73GiB
>>     Free (estimated):      396.36GiB  (min: 207.80GiB)
>>     Data ratio:                 1.00
>>     Metadata ratio:             2.00
>>     Global reserve:        176.00MiB  (used: 0.00B)
>>
>> Data,single: Size:365.00GiB, Used:345.76GiB
>>     /dev/drbd1  365.00GiB
>>
>> Metadata,DUP: Size:1.50GiB, Used:493.23MiB
>>     /dev/drbd1    3.00GiB
>>
>> System,DUP: Size:32.00MiB, Used:80.00KiB
>>     /dev/drbd1   64.00MiB
>>
>> Unallocated:
>>     /dev/drbd1  377.13GiB
>>
>> Even if we consider 6G metadata, it's 331+6 = 337.
>> Where is 9GB used?
>>
>> Please explain.
>
> Taking a somewhat higher level view than Austin's reply: on btrfs, plain
> df, and to a somewhat lesser extent du[1], are at best good /estimations/
> of usage, and for df, of space remaining. Due to btrfs' COW
> (copy-on-write) semantics and the features btrfs makes available, such
> as the various replication/raid schemes, snapshotting, etc., which df/du
> don't really understand (they simply don't have, and weren't /designed/
> to have, that level of filesystem-specific insight), they, particularly
> df with its whole-filesystem focus, aren't particularly accurate on
> btrfs. Consider their output more a "best estimate given the rough data
> we have available" sort of report.
>
> To get the real filesystem-focused picture, use btrfs filesystem usage,
> or btrfs filesystem show combined with btrfs filesystem df.
That's what
> you should trust, altho various utilities that check for available
> space before doing something often use the kernel-call equivalent of
> (plain) df to ensure they have the required space, so it's worthwhile
> to keep an eye on it as the filesystem fills, as well. If it gets too
> far out of sync with btrfs filesystem usage, or if btrfs filesystem
> usage shows unallocated dropping below, say, five gigs, or data or
> metadata size vs. used shows a spread of multiple gigs (your data shows
> a spread of ~20 gigs ATM, but with 377 gigs still unallocated it's no
> big deal; it would be a big deal if those were reversed, tho: only 20
> gigs unallocated and a spread of 300+ gigs in data size vs. used), then
> corrective action such as a filtered rebalance may be necessary.
>
> There are entries in the FAQ discussing free space issues that you
> should definitely read if you haven't, altho they obviously address the
> general case, so if you have more questions about an individual case
> after having read them, here is a good place to ask. =:^)
>
> Everything having to do with "space" (see both the 1/Important-questions
> and 4/Common-questions sections) here:
>
> https://btrfs.wiki.kernel.org/index.php/FAQ
>
> Meanwhile, it's worth noting that, not entirely intuitively, btrfs' COW
> implementation can "waste" space on larger files that are mostly, but
> not entirely, rewritten. An example is the best way to demonstrate.
> Consider each x a used block and each - an unused but still referenced
> block:
>
> Original file, written as a single extent (diagram works best with
> monospace, not arbitrarily rewrapped):
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> First rewrite of part of it (the six overwritten blocks now live in a
> new extent, but stay referenced in the old one):
>
> xxxxxxxxxxx------xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>            xxxxxx
>
> Nth rewrite, where some blocks of the original still remain as
> originally written:
>
> ------------------xxx------------------------------
>       xxx---
>          xxxx----xxx
>              xxxx
>        xxxxxxxxxxxxxxxxxxxxx---xxxxxx
>   xxx
>     xxx
>
> As you can see, that first really large extent remains fully
> referenced, altho only three blocks of it remain in actual use. All
> those - blocks won't be returned to free space until those last three
> blocks get rewritten as well, thus freeing the entire original extent.
>
> I believe this effect is what Austin was referencing when he suggested
> the defrag, tho defrag won't necessarily /entirely/ clear it up. One
> way to be /sure/ it's cleared up would be to rewrite the entire file,
> deleting the original, either by copying it to a different filesystem
> and back (with the off-filesystem copy guaranteeing that it can't use
> reflinks to the existing extents), or by using cp's --reflink=never
> option. (FWIW, I prefer the former, just to be sure, using temporary
> copies to a suitably sized tmpfs for speed where possible, tho
> obviously if the file is larger than your memory size that's not
> possible.)

Correct, this is why I recommended trying a defrag. I've actually never
seen things so bad that a simple defrag didn't fix them, however (though
I have seen a few cases where the target extent size had to be set
higher than the default of 20MB). Also, as counter-intuitive as it
might sound, autodefrag really doesn't help much with this, and can
actually make things worse.
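To put rough numbers on this for the report above (a back-of-the-envelope sketch only; the exact split is approximate, since xattrs, ACLs, and other overhead also contribute):

```python
# Figures from the original report, all in GiB.
du_reported = 331.00   # du -sh on the subvolume
data_used   = 345.76   # btrfs fi usage: Data,single "Used"
data_size   = 365.00   # btrfs fi usage: Data,single "Size"

# Space charged to data chunks that du cannot see: largely blocks in
# partially-overwritten extents that are still fully referenced, as in
# the diagram above.
hidden = data_used - du_reported
print(f"~{hidden:.1f} GiB referenced but invisible to du")    # ~14.8 GiB

# Duncan's "spread": room allocated to data chunks but not yet used.
spread = data_size - data_used
print(f"~{spread:.1f} GiB data-chunk spread (size vs. used)") # ~19.2 GiB
```

That ~15 GiB of extent waste, plus metadata, is where the "missing" 9GB (and then some) went.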
This is also one of the things I was referring to in item 6 of the list
of causes I gave. I lumped it in there partly because I couldn't come up
with a good way to explain it clearly (which I feel you did an excellent
job of above); the other big contributor there is the handling of xattrs
and ACLs, which get accounted by `df` but generally aren't by `du` (at
least, not reliably).
>
> Of course where applicable, snapshots and dedup keep reflink-references
> to the old extents, so they must be adjusted or deleted as well, to
> properly free that space.
>
> ---
> [1] du: Because its purpose is different. du's primary purpose is
> telling you in detail what space files take up, per-file and
> per-directory, without particular regard to usage on the filesystem
> itself. df's focus, by contrast, is on the filesystem as a whole. So
> where two files share the same extent due to reflinking, du should and
> does count that usage for each file, because that's what each file
> /uses/, even if they both use the same extents.
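Duncan's copy-off-and-back approach can be sketched roughly as follows (a hypothetical helper, not a tested tool: tempfile.mkdtemp() here merely stands in for a suitably sized tmpfs on another filesystem, and you'd only run this on files you can safely rewrite):

```python
import os
import shutil
import tempfile

def rewrite_in_full(path):
    """Fully rewrite a file so none of its old extents stay referenced.

    Stages a copy outside the filesystem (the off-filesystem copy
    guarantees no reflinks back to the existing extents), then copies it
    back and atomically replaces the original, freeing all its extents.
    """
    staging = tempfile.mkdtemp()  # stand-in for a tmpfs mount point
    try:
        tmp = os.path.join(staging, os.path.basename(path))
        shutil.copy2(path, tmp)            # copy off the filesystem
        shutil.copy2(tmp, path + ".new")   # write back as fresh extents
        os.replace(path + ".new", path)    # atomic swap; old file freed
    finally:
        shutil.rmtree(staging)
```

As noted above, snapshots and dedup that still reflink the old extents would have to be dropped as well before the space actually comes back.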