Questions about BTRFS balance and scrub on non-RAID setup

* Questions about BTRFS balance and scrub on non-RAID setup
@ 2021-08-30 13:20 Andrej Friesen
  2021-08-30 14:18 ` Lionel Bouton
  0 siblings, 1 reply; 5+ messages in thread
From: Andrej Friesen @ 2021-08-30 13:20 UTC (permalink / raw)
  To: linux-btrfs

Hey folks,

I have used btrfs now for a few years on my home server and have had a
good experience so far.

But now I need some advice because I and my team want to use BTRFs in
a product. And personal use is something really different than
enterprise :-)

Use case and context for my questions:

A file system as a service for our customers.
This will be offered to the customer as a network share via NFS. That
also means we do not have any control over the usage patterns.
No idea about how often, how much they write small or big files to
that file system.

Technically we only create one block device with several terabytes and
format this with btrfs. The actual block device which we format is
backed by a ceph cluster.
So the actual block device is already been on a distributed storage,
therefore we will not do any raid configuration.

The kernel will be a recent 5.10.

Scrub:

Do I need to regularly scrub?
If so, what would be a recommendation for my use case?

My conclusion after reading about the scrub. This checks for damaged
data and will recover the data if this filesystem has another copy of
that data.
Since we will run without raid in btrfs this is not needed in my opinion.
Am I right with my conclusion here?

Balance:

Do I need to regularly balance my filesystem?
If so, what would be a recommendation for my use case?

I am a little bit confused about this one.
The FAQ (https://btrfs.wiki.kernel.org/index.php/FAQ#Do_I_need_to_run_a_balance_regularly.3F)
says:

> In general usage, no. A full unfiltered balance typically takes a long time, and will rewrite huge amounts of data unnecessarily. You may wish to run a balance on metadata only (see Balance_Filters) if you find you have very large amounts of metadata space allocated but unused, but this should be a last resort. At some point, this kind of clean-up will be made an automatic background process.

Others on the wide internet however say it makes sense to regularly balance:

https://github.com/netdata/netdata/issues/3203#issuecomment-356026930

Something like this every day:
`btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4`

I also asked on IRC (username ajfriesen) about regular balance and
people seem to have different opinions on that topic as well.

What would a recommendation look like for my use case?
Would it make sense to update the FAQ in that regard?

PS: First-time mailing list user, please tell me if I did something wrong.

All the best
---
Andrej Friesen

https://www.ajfriesen.com/

^ permalink raw reply	[flat|nested] 5+ messages in thread