From: Marc Haber <mh+linux-btrfs@zugschlus.de>
To: linux-btrfs@vger.kernel.org
Subject: Re: Again, no space left on device while rebalancing and recipe doesnt work
Date: Sat, 5 Mar 2016 15:28:36 +0100
Message-ID: <20160305142836.GD1902@torres.zugschlus.de>
In-Reply-To: <pan$b2129$4febff94$eb9a65a0$72f9d0cf@cox.net>

Hi,

I have not seen this message come back to the mailing list. Was it
too long again?

I have pastebinned the log at http://paste.debian.net/412118/

On Tue, Mar 01, 2016 at 08:51:32PM +0000, Duncan wrote:
> There has been something bothering me about this thread that I wasn't 
> quite pinning down, but here it is.
> 
> If you look at the btrfs fi df/usage numbers, data chunk total vs. used 
> are very close to one another (113 GiB total, 112.77 GiB used, single 
> profile, assuming GiB data chunks, that's only a fraction of a single 
> data chunk unused), so balance would seem to be getting thru them just 
> fine.

Where do you see those numbers? This is what I have, pre-balance:

Mar  2 20:28:01 fan root: Data, single: total=77.00GiB, used=76.35GiB
Mar  2 20:28:01 fan root: System, DUP: total=32.00MiB, used=48.00KiB
Mar  2 20:28:01 fan root: Metadata, DUP: total=86.50GiB, used=2.11GiB
Mar  2 20:28:01 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
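
These are btrfs filesystem df lines captured to syslog; an invocation
along these lines would produce them (the mount point here is a
placeholder, not the actual one):

  # record the current chunk allocation in the syslog
  btrfs filesystem df /mnt/data | logger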

> But there's a /huge/ spread between total vs. used metadata (32 GiB 
> total, under 4 GiB used, clearly _many_ empty or nearly empty chunks), 
> implying that has not been successfully balanced in quite some time, if 
> ever.

This is possible, yes.

>   So I'd surmise the problem is in metadata, not in data.
> 
> Which would explain why balancing data works fine, but a whole-filesystem 
> balance doesn't, because it's getting stuck on the metadata, not the data.
> 
> Now the balance metadata filters include system as well, by default, and 
> the -mprofiles=dup and -sprofiles=dup balances finished, apparently 
> without error, which throws a wrench into my theory.

The balance also finishes without changing anything; post-balance:
Mar  2 21:55:37 fan root: Data, single: total=77.00GiB, used=76.36GiB
Mar  2 21:55:37 fan root: System, DUP: total=32.00MiB, used=80.00KiB
Mar  2 21:55:37 fan root: Metadata, DUP: total=99.00GiB, used=2.11GiB
Mar  2 21:55:37 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B

Wait, Metadata total actually _grew_???
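
For concreteness, the profile-filtered balances discussed here would
be invoked roughly as follows; the mount point is a placeholder, and
note that btrfs-progs requires -f for a balance that only touches
system chunks:

  # rewrite all metadata chunks currently in the dup profile
  btrfs balance start -mprofiles=dup /mnt/data
  # same for system chunks (-f is required for a system-only balance)
  btrfs balance start -f -sprofiles=dup /mnt/data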

> But while we have the btrfs fi df from before the attempt with the 
> profiles filters, we don't have the same output from after.
We now have everything; the new log is in the paste linked above.

> > I'd like to remove unused snapshots and keep the number of them to 4
> > digits, as a workaround.
> 
> I'll strongly second that recommendation.  Btrfs is known to have 
> snapshot scaling issues at 10K snapshots and above.  My strong 
> recommendation is to limit snapshots per filesystem to 3000 or less, with 
> a target of 2000 per filesystem or less if possible, and an ideal of 1000 
> per filesystem or less if it's practical to keep it to that, which it 
> should be with thinning, if you're only snapshotting 1-2 subvolumes, but 
> may not be if you're snapshotting more.

I'm snapshotting /home every 10 minutes; the filesystem that I have
been posting logs from has about 400 snapshots, and snapshot cleanup
works fine. The slow snapshot removal is on a different, much bigger
filesystem on the same host, which lives on a spinning-rust HDD.
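
A thinning scheme that keeps the count bounded can be as simple as
putting a sortable timestamp into each snapshot name and deleting all
but the newest N. A minimal sketch, with the paths and the limit of
400 being placeholders:

  # create a read-only snapshot named after the current time
  btrfs subvolume snapshot -r /home \
      /home/.snapshots/home-$(date +%Y%m%d-%H%M)

  # names sort chronologically; print all but the newest 400
  # (GNU head) and delete those, i.e. the oldest snapshots
  ls -d /home/.snapshots/home-* | head -n -400 |
      while read snap; do btrfs subvolume delete "$snap"; done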

> By 3000 snapshots per filesystem, you'll be beginning to notice slowdowns 
> in some btrfs maintenance commands if you're sensitive to it, tho it's 
> still at least practical to work with, and by 10K, it's generally 
> noticeable by all, at least once they thin down to 2K or so, as it's 
> suddenly faster again!  Above 100K, some btrfs maintenance commands slow 
> to a crawl and doing that sort of maintenance really becomes impractical 
> enough that it's generally easier to backup what you need to and blow 
> away the filesystem to start again with a new one, than it is to try to 
> recover the existing filesystem to a workable state, given that 
> maintenance can at that point take days to weeks.

Ouch. This should not be the case, or btrfs subvolume snapshot should
at least emit a warning. It is not good that it is so easy to get a
filesystem into such a bad state.
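
To check where a filesystem stands against those limits, the -s
switch of btrfs subvolume list restricts the output to snapshots, so
counting them is a one-liner (the mount point is a placeholder):

  # number of snapshots on the filesystem
  btrfs subvolume list -s /mnt/data | wc -l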

> So 5-digits of snapshots on a filesystem is definitely well outside of 
> the recommended range, to the point that in some cases, particularly 
> approaching 6-digits of snapshots, it'll be more practical to simply 
> ditch the filesystem and start over, than to try to work with it any 
> longer.  Just don't do it; setup your thinning schedule so your peak is 
> 3000 snapshots per filesystem or under, and you won't have that problem 
> to worry about. =:^)

That needs to be documented prominently. The ZFS fanbois will love it.

> Oh, and btrfs quota management exacerbates the scaling issues 
> dramatically.  If you're using btrfs quotas

Am not, thankfully.
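
For anyone unsure whether quotas are active: btrfs qgroup show errors
out when quotas are disabled, so it doubles as a quick check (the
mount point is a placeholder):

  # prints qgroup usage if quotas are enabled, fails otherwise
  btrfs qgroup show /mnt/data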

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
