From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "Libor Klepáč" <libor.klepac@bcom.cz>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Question about metadata size
Date: Mon, 27 Jun 2022 18:54:28 +0800	[thread overview]
Message-ID: <068fdbd5-d704-c929-355b-9cf0f500807f@gmx.com> (raw)
In-Reply-To: <f9b7e2fea36afefaec844c385d34b280f766b10b.camel@bcom.cz>



On 2022/6/27 18:23, Libor Klepáč wrote:
> On Po, 2022-06-27 at 18:10 +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/6/27 17:02, Libor Klepáč wrote:
>>> Hi,
>>> we have filesystem like this
>>>
>>> Overall:
>>>       Device size:                  30.00TiB
>>>       Device allocated:             24.93TiB
>>>       Device unallocated:            5.07TiB
>>>       Device missing:                  0.00B
>>>       Used:                         24.92TiB
>>>       Free (estimated):              5.07TiB      (min: 2.54TiB)
>>>       Data ratio:                       1.00
>>>       Metadata ratio:                   1.00
>>>       Global reserve:              512.00MiB      (used: 0.00B)
>>>
>>> Data,single: Size:24.85TiB, Used:24.84TiB (99.98%)
>>>      /dev/sdc       24.85TiB
>>>
>>> Metadata,single: Size:88.00GiB, Used:81.54GiB (92.65%)
>>>      /dev/sdc       88.00GiB
>>>
>>> System,DUP: Size:32.00MiB, Used:3.25MiB (10.16%)
>>>      /dev/sdc       64.00MiB
>>>
>>> Unallocated:
>>>      /dev/sdc        5.07TiB
>>>
>>>
>>> Is it normal to have so much metadata? We have only 119 files with
>>> a size of 2048 bytes or less.
>>
>> That would only take around 50KiB, so no problem.
>>
>>> There are 885 files in total and 17 directories; we don't use
>>> snapshots.
>>
>> For the other files, it really depends.
>>
>> Do you use compression? If so, metadata usage will be greatly
>> increased.
>
>
> Yes, we use zstd compression - the filesystem is mounted with
> compress-force=zstd:9
>
>>
>> For non-compressed files, the max file extent size is 128M, while for
>> compressed files, the max file extent size is only 128K.
>>
>> This means that for a 3TiB file, if you have compression enabled, it
>> will take 24M file extents, and since each file extent takes at least
>> 53 bytes of
>
> That is a lot of extents ;)
>
>> metadata, one such 3TiB file can already take over 1 GiB for
>> metadata.
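
For reference, here is a quick back-of-the-envelope check of those
figures in Python, using only the numbers quoted above (since the 53
bytes is a lower bound per extent item, real metadata usage will be
somewhat higher):

  TIB = 1024 ** 4
  KIB = 1024

  file_size = 3 * TIB                # one 3TiB backup file
  max_compressed_extent = 128 * KIB  # max extent size with compression
  bytes_per_extent_item = 53         # minimum metadata per file extent

  extents = file_size // max_compressed_extent       # 25,165,824 (~24M)
  metadata_bytes = extents * bytes_per_extent_item
  print(f"{extents:,} extents, "
        f"{metadata_bytes / 1024**3:.2f} GiB metadata")  # ~1.24 GiB
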
>
> I guess there is no way to increase the extent size?

Currently it's hard-coded, so there is no way to change that yet.

But please keep in mind that btrfs compression has to make a trade-off
between write support and how much data must be decompressed on read.

E.g. suppose we had a 1MiB compressed extent, but 1020KiB of it got
overwritten so that only a single 4KiB block is still referenced; to
read that 4KiB we would need to decompress the whole 1MiB.
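
Just to put a number on that worst case (a trivial illustration of the
example above, not a measurement of btrfs itself):

  extent_size = 1024 * 1024    # whole compressed extent to decompress
  still_referenced = 4 * 1024  # data the reader actually wants
  print(f"{extent_size // still_referenced}x read amplification")  # 256x
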

So personally speaking, if the main purpose of those large files is
archival, not frequent writes, then user-space compression would make
more sense.

The btrfs defaults lean towards write support.
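
For the archival case, a minimal user-space compression sketch could
look like this (assuming the third-party python-zstandard package; the
function and file names are just placeholders):

  import zstandard as zstd

  def compress_file(src_path, dst_path, level=9):
      # Stream-compress one backup file in user space, instead of
      # letting the filesystem compress it per 128KiB extent.
      cctx = zstd.ZstdCompressor(level=level)
      with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
          cctx.copy_stream(src, dst)

  compress_file("backup.vmdk", "backup.vmdk.zst")  # hypothetical names
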

> We could use Nakivo's internal compression, but not without deleting
> all stored data and creating an empty repository.
> Also, we wanted to do the compression in btrfs; we hoped it would give
> beesd more to work with (for comparing data).

Then I guess there is not much we can help with right now, and that
many extents also slow down file deletion, just as you mentioned.

Thanks,
Qu

>
>>
>> Thanks,
>> Qu
>
> With regards, Libor
>
>>>
>>> Most of the files are multi-gigabyte, some of them around 3TB -
>>> all are snapshots from VMware stored using Nakivo.
>>>
>>> Working with the filesystem - mostly deleting files - seems to be
>>> very slow; it took several hours to delete the snapshot of one
>>> machine, which consisted of four or five of those 3TB files.
>>>
>>> We run beesd on that data, but I think there was this much metadata
>>> even before we started doing so.
>>>
>>> With regards,
>>> Libor


Thread overview: 13+ messages
2022-06-27  9:02 Question about metadata size Libor Klepáč
2022-06-27 10:10 ` Qu Wenruo
2022-06-27 10:23   ` Libor Klepáč
2022-06-27 10:54     ` Qu Wenruo [this message]
2022-06-27 12:17       ` Libor Klepáč
2022-06-27 12:39         ` Qu Wenruo
2022-06-27 14:34           ` Libor Klepáč
2022-06-27 14:47           ` Joshua Villwock
2022-06-27 15:13             ` Libor Klepáč
     [not found]             ` <cc28f1516a8fe92007994d1a3fcee93f@mailmag.net>
2022-06-27 15:29               ` Libor Klepáč
2022-06-28 14:18               ` Libor Klepáč
2022-06-27 18:12     ` Chris Murphy
2022-06-27 22:16       ` remi
