From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Christoph Anton Mitterer <calestyo@scientia.org>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: ENOSPC while df shows 826.93GiB free
Date: Tue, 7 Dec 2021 02:21:28 -0500
Message-ID: <20211207072128.GL17148@hungrycats.org>
In-Reply-To: <3239b2307fae4c7f9e8be9f194d5f3ef470ddb8c.camel@scientia.org>

On Tue, Dec 07, 2021 at 04:44:13AM +0100, Christoph Anton Mitterer wrote:
> On Tue, 2021-12-07 at 11:29 +0800, Qu Wenruo wrote:
> > For other regular operations, you either get ENOSPC, just like on
> > any other fs that runs out of space, or they complete without
> > problems.
> > 
> > Furthermore, balance is not really the preferred way to free up
> > space in this case; actually freeing up data is the correct way to go.
> 
> Well, but to be honest... that makes btrfs kinda broken for that
> particular purpose.
> 
> 
> The software which runs on the storage and provides the data to the
> experiments does in fact make sure that the space isn't fully used (by
> default, it leaves a gap of 4 GB).
> 
> While this gap is configurable, it seems a bit odd if one had to set
> it to ~1 TB per fs... just to make sure that btrfs doesn't run out of
> space for metadata.
> 
> 
> And btrfs *does* show that plenty of space is left (always around
> 700-800 GB)... so the application thinks it can happily continue to
> write, while in fact it fails (and then cannot even start anymore, as
> it fails to create its lock files).
> 
> 
> My understanding was that when not using --mixed, btrfs has separate
> block groups for data and metadata.
> 
> And it seems here that the data block groups have several hundred GB
> still free, while - AFAIU you - the metadata block groups are already
> full.
> 
> 
> 
> I also wouldn't want to balance regularly (which doesn't really seem
> to help that much so far)... because it puts quite some IO load on the
> systems.

If you minimally balance data (so that you keep 2 GB unallocated at all
times) then it works much better: the last metadata chunk you need can
still be allocated when metadata has to expand, and it costs only a few
minutes of IO per day.  After a while you don't need to do this any
more, as a large buffer of allocated but unused metadata will form.
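
For example, a minimal daily data balance could look something like
this (the mount point /data and the thresholds are only illustrative,
adjust them to your filesystem):

	# check how much space is still unallocated
	btrfs filesystem usage /data

	# relocate at most one data block group that is no more than
	# about half full; run from cron once a day, or whenever the
	# unallocated space drops below ~2GB
	btrfs balance start -dusage=50 -dlimit=1 /data

-dusage selects only mostly-empty data block groups and -dlimit stops
the balance after one block group, so each run is a few minutes of IO
at most.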

If you need a drastic intervention, you can mount with metadata_ratio=1
for a short(!) time to allocate a lot of extra metadata block groups.
Combine it with a data balance limited to a few block groups
(e.g. -dlimit=9).
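
Very roughly, and assuming the filesystem is mounted at /data (example
path) and that metadata_ratio can be changed on remount (otherwise set
it on a fresh mount instead):

	# temporarily force one metadata chunk per allocated data chunk
	mount -o remount,metadata_ratio=1 /data

	# balance a handful of data block groups so new chunks get allocated
	btrfs balance start -dlimit=9 /data

	# then go back to the default behaviour (0 = let btrfs decide)
	mount -o remount,metadata_ratio=0 /data

Don't leave metadata_ratio=1 in place long term, or you will allocate
far more metadata block groups than you need.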

You need about (3 + number_of_disks) GB of allocated but unused metadata
block groups to handle the worst case (balance, scrub, and discard all
active at the same time, plus the required free metadata space).  Also
leave room for existing metadata to expand by about 50%, especially if
you have snapshots.
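
As a made-up example: a 4-disk filesystem would want roughly
3 + 4 = 7 GB of allocated-but-unused metadata, plus ~50% of its current
metadata usage as growth headroom.  The number to watch is the gap
between metadata "Size" (allocated) and "Used" reported by:

	btrfs filesystem usage /data

(again, /data is just an example mount point).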

Never balance metadata.  Balancing metadata throws away the existing
allocated-but-unused metadata block groups, which leads directly to
this situation.

Free space search time goes up as the filesystem fills up.  The last 1%
of the filesystem will fill up significantly more slowly than the other
99%.  You might need to reserve 3% of the filesystem to keep latencies
down (ironically, about the same amount that ext4 reserves).

There are some patches floating around to address these issues.

> So if csum data needs so much space... why can't it simply reserve e.g.
> 60 GB for metadata instead of just 17 GB?

It normally does.  Are you:

	- running metadata balances?  (Stop immediately.)

	- preallocating large files?  Checksums are allocated later, and
	naive usage of prealloc burns metadata space due to fragmentation.

	- modifying snapshots?	Metadata size increases with each
	modified snapshot.

	- replacing large files with a lot of very small ones?  Files
	below 2K are stored inline in metadata.  The max_inline=0 mount
	option disables this (see the sketch after this list).
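
A quick sketch of that last point, assuming /data as the mount point
and that the option can be applied on remount (otherwise put it in
fstab so it takes effect on the next mount):

	# stop storing small file data inline in the metadata trees
	mount -o remount,max_inline=0 /data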

> If I really had to reserve ~1 TB of storage to be unused (per 16 TB
> fs) just to get that working... I would need to move stuff back to
> ext4, because that's such a big loss that we couldn't justify it to
> our funding agencies.
> 
> 
> And we haven't had that issue with e.g. ext4 ... that seems to reserve
> just enough for metadata, so that we could basically fill up the fs
> nearly to the end.
> 
> 
> 
> Cheers,
> Chris.
