* migrating to space_cache=2 and btrfs userspace commands
@ 2021-07-13 15:38 DanglingPointer
  2021-07-14  4:59 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: DanglingPointer @ 2021-07-13 15:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: danglingpointerexception

We're currently considering switching to the "space_cache=v2" and noatime 
mount options for our lab server-workstations running RAID5.

  * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
    totalling 26TB.
  * Another has about 12TB data/metadata in uniformly sized 6TB disks
    totalling 24TB.
  * Both of the arrays are on individually luks encrypted disks with
    btrfs on top of the luks.
  * Both have "defaults,autodefrag" turned on in fstab.

We're starting to see large pauses during constant backups of millions 
of chunk files (using duplicacy backup) in the 24TB array.

Pauses sometimes last 20+ seconds and recur roughly every ~30secs after 
the end of the last pause.  The "btrfs-transacti" process consistently 
shows up as the blocking process/thread locking up filesystem IO.  IO 
gets into the RAID5 array via nfsd.  There are no disk or btrfs errors 
recorded.  A scrub finished successfully yesterday.

After doing some research around the internet, we've arrived at the 
approach described above.  Unfortunately, the official documentation 
isn't clear on the following.

Official documentation URL - 
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)

 1. How do we migrate from the default space_cache=v1 to space_cache=v2?
    The documentation only describes the reverse, from v2 to v1!
 2. If we use space_cache=v2, is it indeed still the case that the
    "btrfs" command will NOT work with the filesystem?  So will our
    "btrfs scrub start /mount/point/..." cron jobs FAIL?  I'm guessing
    the btrfs command comes from btrfs-progs which is currently v5.4.1-2
    amd64, is that correct?
 3. Any other ideas on how we can get rid of those annoying pauses with
    large backups into the array?

Thanks in advance!

DP


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-13 15:38 migrating to space_cache=2 and btrfs userspace commands DanglingPointer
@ 2021-07-14  4:59 ` Qu Wenruo
  2021-07-14  5:44   ` Chris Murphy
  2021-07-14  7:18   ` DanglingPointer
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-14  4:59 UTC (permalink / raw)
  To: DanglingPointer, linux-btrfs



On 2021/7/13 11:38 PM, DanglingPointer wrote:
> We're currently considering switching to "space_cache=v2" with noatime
> mount options for my lab server-workstations running RAID5.

Btrfs RAID5 is unsafe due to its write-hole problem.

>
>   * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>     totalling 26TB.
>   * Another has about 12TB data/metadata in uniformly sized 6TB disks
>     totalling 24TB.
>   * Both of the arrays are on individually luks encrypted disks with
>     btrfs on top of the luks.
>   * Both have "defaults,autodefrag" turned on in fstab.
>
> We're starting to see large pauses during constant backups of millions
> of chunk files (using duplicacy backup) in the 24TB array.
>
> Pauses sometimes take up to 20+ seconds in frequencies after every
> ~30secs of the end of the last pause.  "btrfs-transacti" process
> consistently shows up as the blocking process/thread locking up
> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no disk
> or btrfs errors recorded.  scrub last finished yesterday successfully.

Please provide the "echo l > /proc/sysrq-trigger" output when such a
pause happens.

If you're using qgroup (it may be enabled by things like snapper), it may
be the cause, as qgroup does its accounting when committing a transaction.

If one transaction is super large, it can cause such a problem.

You can test if qgroup is enabled by:

# btrfs qgroup show -prce <mnt>

>
> After doing some research around the internet, we've come to the
> consideration above as described.  Unfortunately the official
> documentation isn't clear on the following.
>
> Official documentation URL -
> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>
> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>     talks about the reverse, from v2 to v1!

Just mount with "space_cache=v2".
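
E.g., either at the next mount or on an already-mounted fs, something like:

# mount -o remount,space_cache=v2 /mount/point  # adjust to your mount point

The free space tree (v2 cache) gets built on that mount.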

> 2. If we use space_cache=v2, is it indeed still the case that the
>     "btrfs" command will NOT work with the filesystem?

Why would you think "btrfs" won't work on a btrfs?

Thanks,
Qu

>  So will our
>     "btrfs scrub start /mount/point/..." cron jobs FAIL?  I'm guessing
>     the btrfs command comes from btrfs-progs which is currently v5.4.1-2
>     amd64, is that correct?
> 3. Any other ideas on how we can get rid of those annoying pauses with
>     large backups into the array?
>
> Thanks in advance!
>
> DP
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  4:59 ` Qu Wenruo
@ 2021-07-14  5:44   ` Chris Murphy
  2021-07-14  6:05     ` Qu Wenruo
  2021-07-14  7:18   ` DanglingPointer
  1 sibling, 1 reply; 16+ messages in thread
From: Chris Murphy @ 2021-07-14  5:44 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: DanglingPointer, Btrfs BTRFS

On Tue, Jul 13, 2021 at 10:59 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2021/7/13 下午11:38, DanglingPointer wrote:

> > 2. If we use space_cache=v2, is it indeed still the case that the
> >     "btrfs" command will NOT work with the filesystem?
>
> Why would you think "btrfs" won't work on a btrfs?
>

Maybe this?

man 5 btrfs, space_cache includes:

 The btrfs(8) command currently only has read-only support for v2. A
read-write command may be run on a v2 filesystem by clearing the
cache, running the command, and then remounting with space_cache=v2.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  5:44   ` Chris Murphy
@ 2021-07-14  6:05     ` Qu Wenruo
  2021-07-14  6:54       ` DanglingPointer
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2021-07-14  6:05 UTC (permalink / raw)
  To: Chris Murphy; +Cc: DanglingPointer, Btrfs BTRFS



On 2021/7/14 1:44 PM, Chris Murphy wrote:
> On Tue, Jul 13, 2021 at 10:59 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>
>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>      "btrfs" command will NOT work with the filesystem?
>>
>> Why would you think "btrfs" won't work on a btrfs?
>>
>
> Maybe this?
>
> man 5 btrfs, space_cache includes:
>
>   The btrfs(8) command currently only has read-only support for v2. A
> read-write command may be run on a v2 filesystem by clearing the
> cache, running the command, and then remounting with space_cache=v2.
>

Oh, that only applies to offline tools that write into the fs, namely
"btrfs check --repair" and "mkfs.btrfs -R".

And I believe that sentence is now out-of-date as of btrfs-progs v4.19,
which brought in full support for writing the free space tree (v2 space
cache).

I'll send out a patch to fix that soon.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  6:05     ` Qu Wenruo
@ 2021-07-14  6:54       ` DanglingPointer
  2021-07-14  7:07         ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: DanglingPointer @ 2021-07-14  6:54 UTC (permalink / raw)
  To: Qu Wenruo, Chris Murphy; +Cc: Btrfs BTRFS

Yep that's what I'm referring to here: 
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)

Thanks for the prompt response, Qu!

So just to confirm, our scheduled cron jobs to scrub will still work 
with space_cache=v2?



On 14/7/21 4:05 pm, Qu Wenruo wrote:
>
>
> On 2021/7/14 下午1:44, Chris Murphy wrote:
>> On Tue, Jul 13, 2021 at 10:59 PM Qu Wenruo <quwenruo.btrfs@gmx.com> 
>> wrote:
>>>
>>>
>>>
>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>
>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>      "btrfs" command will NOT work with the filesystem?
>>>
>>> Why would you think "btrfs" won't work on a btrfs?
>>>
>>
>> Maybe this?
>>
>> man 5 btrfs, space_cache includes:
>>
>>   The btrfs(8) command currently only has read-only support for v2. A
>> read-write command may be run on a v2 filesystem by clearing the
>> cache, running the command, and then remounting with space_cache=v2.
>>
>
> Oh, that's only for offline tools writing into the fs, namingly "btrfs
> check --repair" and "mkfs.btrfs -R"
>
> And I believe that sentence is now out-of-date after btrfs-progs v4.19,
> which pulls all the support for write time free space tree (v2 space 
> cache).
>
> I'll soon send out a patch to fix that.
>
> Thanks,
> Qu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  6:54       ` DanglingPointer
@ 2021-07-14  7:07         ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-14  7:07 UTC (permalink / raw)
  To: DanglingPointer, Chris Murphy; +Cc: Btrfs BTRFS



On 2021/7/14 2:54 PM, DanglingPointer wrote:
> Yep that's what I'm referring to here:
> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>
> Thanks for the prompt respone Qu!
>
> So just to confirm, our scheduled cron jobs to scrub will still work
> with space_cache=v2?

Yep. No worries.

Just to be more accurate, btrfs-progs has two parts:

- Ioctl related part
   Things like "scrub", "subvolume", "filesystem", "device", etc. are
   all ioctl related commands.
   They just call kernel ioctls to do the job.

   Those commands are largely independent, and are seldom affected by
   the btrfs-progs version.

- Offline read/write part
   Things like "mkfs.btrfs", "btrfs check", "btrfs-convert" are the main
   commands in this part.
   They rely on the btrfs-progs implementation of how to read and write
   btrfs.
   That's why we always recommend the latest btrfs-progs for "btrfs
   check".

Thanks,
Qu

>
>
>
> On 14/7/21 4:05 pm, Qu Wenruo wrote:
>>
>>
>> On 2021/7/14 下午1:44, Chris Murphy wrote:
>>> On Tue, Jul 13, 2021 at 10:59 PM Qu Wenruo <quwenruo.btrfs@gmx.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>
>>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>>      "btrfs" command will NOT work with the filesystem?
>>>>
>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>
>>>
>>> Maybe this?
>>>
>>> man 5 btrfs, space_cache includes:
>>>
>>>   The btrfs(8) command currently only has read-only support for v2. A
>>> read-write command may be run on a v2 filesystem by clearing the
>>> cache, running the command, and then remounting with space_cache=v2.
>>>
>>
>> Oh, that's only for offline tools writing into the fs, namingly "btrfs
>> check --repair" and "mkfs.btrfs -R"
>>
>> And I believe that sentence is now out-of-date after btrfs-progs v4.19,
>> which pulls all the support for write time free space tree (v2 space
>> cache).
>>
>> I'll soon send out a patch to fix that.
>>
>> Thanks,
>> Qu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  4:59 ` Qu Wenruo
  2021-07-14  5:44   ` Chris Murphy
@ 2021-07-14  7:18   ` DanglingPointer
  2021-07-14  7:45     ` Qu Wenruo
  1 sibling, 1 reply; 16+ messages in thread
From: DanglingPointer @ 2021-07-14  7:18 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: danglingpointerexception

a) "echo l > /proc/sysrq-trigger"

The backup unfortunately finished already today, and we are unlikely to 
run it again until we get an outage window to remount the array with the 
space_cache=v2 and noatime mount options.
Thanks for the command; we'll definitely use it if/when it happens again 
on the next large migration of data.


b) "sudo btrfs qgroup show -prce" ........

$ ERROR: can't list qgroups: quotas not enabled

So looks like it isn't enabled.

File sizes are between 1,048,576 bytes and 16,777,216 bytes (Duplicacy 
backup defaults).

What classifies as a transaction?  Any/All writes done in a 30sec 
interval?  If 100 unique files were written in 30secs, is that 1 
transaction or 100 transactions?  Millions of files of the size range 
above were backed up.


c) "Just mount with "space_cache=v2""

Ok so no need to "clear_cache" the v1 cache, right?
I've written this in fstab but haven't remounted yet; I'm waiting until 
I can get an outage....

..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2"

Thanks again for your help Qu!

On 14/7/21 2:59 pm, Qu Wenruo wrote:
>
>
> On 2021/7/13 下午11:38, DanglingPointer wrote:
>> We're currently considering switching to "space_cache=v2" with noatime
>> mount options for my lab server-workstations running RAID5.
>
> Btrfs RAID5 is unsafe due to its write-hole problem.
>
>>
>>   * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>     totalling 26TB.
>>   * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>     totalling 24TB.
>>   * Both of the arrays are on individually luks encrypted disks with
>>     btrfs on top of the luks.
>>   * Both have "defaults,autodefrag" turned on in fstab.
>>
>> We're starting to see large pauses during constant backups of millions
>> of chunk files (using duplicacy backup) in the 24TB array.
>>
>> Pauses sometimes take up to 20+ seconds in frequencies after every
>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>> consistently shows up as the blocking process/thread locking up
>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no disk
>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>
> Please provide the "echo l > /proc/sysrq-trigger" output when such pause
> happens.
>
> If you're using qgroup (may be enabled by things like snapper), it may
> be the cause, as qgroup does its accounting when committing transaction.
>
> If one transaction is super large, it can cause such problem.
>
> You can test if qgroup is enabled by:
>
> # btrfs qgroup show -prce <mnt>
>
>>
>> After doing some research around the internet, we've come to the
>> consideration above as described.  Unfortunately the official
>> documentation isn't clear on the following.
>>
>> Official documentation URL -
>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>
>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>     talks about the reverse, from v2 to v1!
>
> Just mount with "space_cache=v2".
>
>> 2. If we use space_cache=v2, is it indeed still the case that the
>>     "btrfs" command will NOT work with the filesystem?
>
> Why would you think "btrfs" won't work on a btrfs?
>
> Thanks,
> Qu
>
>>   So will our
>>     "btrfs scrub start /mount/point/..." cron jobs FAIL?  I'm guessing
>>     the btrfs command comes from btrfs-progs which is currently v5.4.1-2
>>     amd64, is that correct?
>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>     large backups into the array?
>>
>> Thanks in advance!
>>
>> DP
>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  7:18   ` DanglingPointer
@ 2021-07-14  7:45     ` Qu Wenruo
  2021-07-15 16:40       ` DanglingPointer
  2021-07-15 17:51       ` Joshua
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-14  7:45 UTC (permalink / raw)
  To: DanglingPointer, linux-btrfs



On 2021/7/14 3:18 PM, DanglingPointer wrote:
> a) "echo l > /proc/sysrq-trigger"
>
> The backup finished today already unfortunately and we are unlikely to
> run it again until we get an outage to remount the array with the
> space_cache=v2 and noatime mount options.
> Thanks for the command, we'll definitely use it if/when it happens again
> on the next large migration of data.

Just to avoid confusion, after that command, the "dmesg" output is still
needed, as that's where sysrq puts its output.
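
I.e., something like this while the fs is stalled (needs root; the
output file name is just an example):

# echo l > /proc/sysrq-trigger
# dmesg > sysrq-backtraces.txt

and attach the dmesg output.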
>
>
> b) "sudo btrfs qgroup show -prce" ........
>
> $ ERROR: can't list qgroups: quotas not enabled
>
> So looks like it isn't enabled.

One less thing to worry about.
>
> File sizes are between: 1,048,576 bytes and 16,777,216 bytes (Duplicacy
> backup defaults)

Between 1 and 16MiB, thus tons of small files.

Btrfs is not really good at handling tons of small files, as they
generate a lot of metadata.

That may contribute to the hang.

>
> What classifies as a transaction?

It's a little complex.

Technically it's a checkpoint: before the checkpoint, all you see is
old data; after the checkpoint, all you see is new data.

To end users, any data and metadata writes will be included in one
transaction (with proper dependencies handled).

One way to finish (or commit) the current transaction is to sync the
fs, using the "sync" command (which syncs all filesystems).

> Any/All writes done in a 30sec
> interval?

This is the default commit interval. Almost all fses will try to commit
their data/metadata to disk after a configurable interval.

The default one is 30s. Hitting that interval is also one way the
current transaction gets committed.
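
E.g., "sync" forces a commit right now, and the "commit=" mount option
changes the interval (the value below is just an example):

# sync
# mount -o remount,commit=120 /mount/point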

>  If 100 unique files were written in 30secs, is that 1
> transaction or 100 transactions?

It depends, as things like syncfs() and subvolume/snapshot creation may
trigger a transaction commit.

But without those special operations, just writing 100 unique files
using buffered writes would only start one transaction, and when the
30s interval is hit, that transaction will be committed to disk.

>  Millions of files of the size range
> above were backed up.

The number of files alone may not force a transaction commit if it
doesn't create enough memory pressure or free space pressure.

Anyway, the "echo l" sysrq output would help us locate what's taking so
long.

>
>
> c) "Just mount with "space_cache=v2""
>
> Ok so no need to "clear_cache" the v1 cache, right?

Yes, and "clear_cache" won't really remove all the v1 cache anyway.

Thus it doesn't help much.

The only way to fully clear the v1 cache is by running "btrfs check
--clear-space-cache v1" on an *unmounted* btrfs.
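
I.e., roughly (device and mount point names are placeholders):

# umount /mount/point
# btrfs check --clear-space-cache v1 /dev/sdX1
# mount -o space_cache=v2 /dev/sdX1 /mount/point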

> I wrote this in the fstab but hadn't remounted yet until I can get an
> outage....

IMHO if you really want to test whether v2 would help, you can just
remount; no need to wait for an outage window.

Thanks,
Qu
>
> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2 >
> Thanks again for your help Qu!
>
> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>
>>
>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>> We're currently considering switching to "space_cache=v2" with noatime
>>> mount options for my lab server-workstations running RAID5.
>>
>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>
>>>
>>>   * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>     totalling 26TB.
>>>   * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>     totalling 24TB.
>>>   * Both of the arrays are on individually luks encrypted disks with
>>>     btrfs on top of the luks.
>>>   * Both have "defaults,autodefrag" turned on in fstab.
>>>
>>> We're starting to see large pauses during constant backups of millions
>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>
>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>> consistently shows up as the blocking process/thread locking up
>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no disk
>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>
>> Please provide the "echo l > /proc/sysrq-trigger" output when such pause
>> happens.
>>
>> If you're using qgroup (may be enabled by things like snapper), it may
>> be the cause, as qgroup does its accounting when committing transaction.
>>
>> If one transaction is super large, it can cause such problem.
>>
>> You can test if qgroup is enabled by:
>>
>> # btrfs qgroup show -prce <mnt>
>>
>>>
>>> After doing some research around the internet, we've come to the
>>> consideration above as described.  Unfortunately the official
>>> documentation isn't clear on the following.
>>>
>>> Official documentation URL -
>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>
>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>     talks about the reverse, from v2 to v1!
>>
>> Just mount with "space_cache=v2".
>>
>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>     "btrfs" command will NOT work with the filesystem?
>>
>> Why would you think "btrfs" won't work on a btrfs?
>>
>> Thanks,
>> Qu
>>
>>>   So will our
>>>     "btrfs scrub start /mount/point/..." cron jobs FAIL?  I'm guessing
>>>     the btrfs command comes from btrfs-progs which is currently v5.4.1-2
>>>     amd64, is that correct?
>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>     large backups into the array?
>>>
>>> Thanks in advance!
>>>
>>> DP
>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  7:45     ` Qu Wenruo
@ 2021-07-15 16:40       ` DanglingPointer
  2021-07-15 22:13         ` Qu Wenruo
  2021-07-15 17:51       ` Joshua
  1 sibling, 1 reply; 16+ messages in thread
From: DanglingPointer @ 2021-07-15 16:40 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: danglingpointerexception

Hi Qu,

Just updating here that setting the "space_cache=v2" and "noatime" 
mount options completely SOLVED the performance problem!
Basically like night and day!


These are my full fstab mount options...

btrfs defaults,autodefrag,space_cache=v2,noatime 0 2


Perhaps making space_cache=v2 the default should be considered?  Why 
default to v1; what's the value of v1?


So, in conclusion, for large multi-terabyte arrays (in my case RAID5s), 
setting space_cache=v2 and noatime massively increases performance and 
eliminates the long, frequent pauses caused by "btrfs-transacti" 
blocking all IO.

Thanks Qu for your help!



On 14/7/21 5:45 pm, Qu Wenruo wrote:
>
>
> On 2021/7/14 下午3:18, DanglingPointer wrote:
>> a) "echo l > /proc/sysrq-trigger"
>>
>> The backup finished today already unfortunately and we are unlikely to
>> run it again until we get an outage to remount the array with the
>> space_cache=v2 and noatime mount options.
>> Thanks for the command, we'll definitely use it if/when it happens again
>> on the next large migration of data.
>
> Just to avoid confusion, after that command, "dmesg" output is still
> needed, as that's where sysrq put its output.
>>
>>
>> b) "sudo btrfs qgroup show -prce" ........
>>
>> $ ERROR: can't list qgroups: quotas not enabled
>>
>> So looks like it isn't enabled.
>
> One less thing to bother.
>>
>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes (Duplicacy
>> backup defaults)
>
> Between 1~16MiB, thus tons of small files.
>
> Btrfs is not really good at handling tons of small files, as they
> generate a lot of metadata.
>
> That may contribute to the hang.
>
>>
>> What classifies as a transaction?
>
> It's a little complex.
>
> Technically it's a check point where before the checkpoint, all you see
> is old data, after the checkpoint, all you see is new data.
>
> To end users, any data and metadata write will be included into one
> transaction (with proper dependency handled).
>
> One way to finish (or commit) current transaction is to sync the fs,
> using "sync" command (sync all filesystems).
>
>> Any/All writes done in a 30sec
>> interval?
>
> This the default commit interval. Almost all fses will try to commit its
> data/metadata to disk after a configurable interval.
>
> The default one is 30s. That's also one way to commit current 
> transaction.
>
>>   If 100 unique files were written in 30secs, is that 1
>> transaction or 100 transactions?
>
> It depends. As things like syncfs() and subvolume/snapshot creation may
> try to commit transaction.
>
> But without those special operations, just writing 100 unique files
> using buffered write, it would only start one transaction, and when the
> 30s interval get hit, the transaction will be committed to disk.
>
>>   Millions of files of the size range
>> above were backed up.
>
> The amount of files may not force a transaction commit, if it doesn't
> trigger enough memory pressure, or free space pressure.
>
> Anyway, the "echo l" sysrq would help us to locate what's taking so long
> time.
>
>>
>>
>> c) "Just mount with "space_cache=v2""
>>
>> Ok so no need to "clear_cache" the v1 cache, right?
>
> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>
> Thus it doesn't help much.
>
> The only way to fully clear v1 cache is by using "btrfs check
> --clear-space-cache v1" on a *unmounted* btrfs.
>
>> I wrote this in the fstab but hadn't remounted yet until I can get an
>> outage....
>
> IMHO if you really want to test if v2 would help, you can just remount,
> no need to wait for a break.
>
> Thanks,
> Qu
>>
>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2 >
>> Thanks again for your help Qu!
>>
>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>
>>>
>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>> We're currently considering switching to "space_cache=v2" with noatime
>>>> mount options for my lab server-workstations running RAID5.
>>>
>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>
>>>>
>>>>   * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>>     totalling 26TB.
>>>>   * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>>     totalling 24TB.
>>>>   * Both of the arrays are on individually luks encrypted disks with
>>>>     btrfs on top of the luks.
>>>>   * Both have "defaults,autodefrag" turned on in fstab.
>>>>
>>>> We're starting to see large pauses during constant backups of millions
>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>
>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>> consistently shows up as the blocking process/thread locking up
>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no 
>>>> disk
>>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>
>>> Please provide the "echo l > /proc/sysrq-trigger" output when such 
>>> pause
>>> happens.
>>>
>>> If you're using qgroup (may be enabled by things like snapper), it may
>>> be the cause, as qgroup does its accounting when committing 
>>> transaction.
>>>
>>> If one transaction is super large, it can cause such problem.
>>>
>>> You can test if qgroup is enabled by:
>>>
>>> # btrfs qgroup show -prce <mnt>
>>>
>>>>
>>>> After doing some research around the internet, we've come to the
>>>> consideration above as described.  Unfortunately the official
>>>> documentation isn't clear on the following.
>>>>
>>>> Official documentation URL -
>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>
>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>>     talks about the reverse, from v2 to v1!
>>>
>>> Just mount with "space_cache=v2".
>>>
>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>     "btrfs" command will NOT work with the filesystem?
>>>
>>> Why would you think "btrfs" won't work on a btrfs?
>>>
>>> Thanks,
>>> Qu
>>>
>>>>   So will our
>>>>     "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>>     the btrfs command comes from btrfs-progs which is currently 
>>>> v5.4.1-2
>>>>     amd64, is that correct?
>>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>>     large backups into the array?
>>>>
>>>> Thanks in advance!
>>>>
>>>> DP
>>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-14  7:45     ` Qu Wenruo
  2021-07-15 16:40       ` DanglingPointer
@ 2021-07-15 17:51       ` Joshua
  2021-07-16 12:42         ` DanglingPointer
  1 sibling, 1 reply; 16+ messages in thread
From: Joshua @ 2021-07-15 17:51 UTC (permalink / raw)
  To: linux-btrfs

Just as a point of data, I have a 96 TB array with RAID1 data, and RAID1C3 metadata.

I made the switch to space_cache=v2 some time ago, and I remember it made a huge difference when I did so!
(It was RAID1 metadata at the time, as RAID1C3 was not available at the time.)


However, I also tried a check with '--clear-space-cache v1' at the time, and after waiting a literal whole day without it completing, I gave up, canceled it, and put it back into production.  Is a --clear-space-cache v1 operation expected to take so long on such a large file system?

Thanks!
--Joshua Villwock



July 15, 2021 9:40 AM, "DanglingPointer" <danglingpointerexception@gmail.com> wrote:

> Hi Qu,
> 
> Just updating here that setting the mount option "space_cache=v2" and "noatime" completely SOLVED
> the performance problem!
> Basically like night and day!
> 
> These are my full fstab mount options...
> 
> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
> 
> Perhaps defaulting the space_cache=v2 should be considered?  Why default to v1, what's the value of
> v1?
> 
> So for conclusion, for large multi-terrabyte arrays (in my case RAID5s), setting space_cache=v2 and
> noatime massively increases performance and eliminates the large long pauses in frequent intervals
> by "btrfs-transacti" blocking all IO.
> 
> Thanks Qu for your help!
> 
> On 14/7/21 5:45 pm, Qu Wenruo wrote:
> 
>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>> a) "echo l > /proc/sysrq-trigger"
>>> 
>>> The backup finished today already unfortunately and we are unlikely to
>>> run it again until we get an outage to remount the array with the
>>> space_cache=v2 and noatime mount options.
>>> Thanks for the command, we'll definitely use it if/when it happens again
>>> on the next large migration of data.
>> 
>> Just to avoid confusion, after that command, "dmesg" output is still
>> needed, as that's where sysrq put its output.
>>> b) "sudo btrfs qgroup show -prce" ........
>>> 
>>> $ ERROR: can't list qgroups: quotas not enabled
>>> 
>>> So looks like it isn't enabled.
>> 
>> One less thing to bother.
>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes (Duplicacy
>>> backup defaults)
>> 
>> Between 1~16MiB, thus tons of small files.
>> 
>> Btrfs is not really good at handling tons of small files, as they
>> generate a lot of metadata.
>> 
>> That may contribute to the hang.
>> 
>>> What classifies as a transaction?
>> 
>> It's a little complex.
>> 
>> Technically it's a check point where before the checkpoint, all you see
>> is old data, after the checkpoint, all you see is new data.
>> 
>> To end users, any data and metadata write will be included into one
>> transaction (with proper dependency handled).
>> 
>> One way to finish (or commit) current transaction is to sync the fs,
>> using "sync" command (sync all filesystems).
>> 
>>> Any/All writes done in a 30sec
>>> interval?
>> 
>> This the default commit interval. Almost all fses will try to commit its
>> data/metadata to disk after a configurable interval.
>> 
>> The default one is 30s. That's also one way to commit current > transaction.
>> 
>>> If 100 unique files were written in 30secs, is that 1
>>> transaction or 100 transactions?
>> 
>> It depends. As things like syncfs() and subvolume/snapshot creation may
>> try to commit transaction.
>> 
>> But without those special operations, just writing 100 unique files
>> using buffered write, it would only start one transaction, and when the
>> 30s interval get hit, the transaction will be committed to disk.
>> 
>>> Millions of files of the size range
>>> above were backed up.
>> 
>> The amount of files may not force a transaction commit, if it doesn't
>> trigger enough memory pressure, or free space pressure.
>> 
>> Anyway, the "echo l" sysrq would help us to locate what's taking so long
>> time.
>> 
>>> c) "Just mount with "space_cache=v2""
>>> 
>>> Ok so no need to "clear_cache" the v1 cache, right?
>> 
>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>> 
>> Thus it doesn't help much.
>> 
>> The only way to fully clear v1 cache is by using "btrfs check
>> --clear-space-cache v1" on a *unmounted* btrfs.
>> 
>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>> outage....
>> 
>> IMHO if you really want to test if v2 would help, you can just remount,
>> no need to wait for a break.
>> 
>> Thanks,
>> Qu
>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2 >
>>> Thanks again for your help Qu!
>>> 
>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>> 
>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>> We're currently considering switching to "space_cache=v2" with noatime
>> mount options for my lab server-workstations running RAID5.
>> 
>> Btrfs RAID5 is unsafe due to its write-hole problem.
>> 
>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>> totalling 26TB.
>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>> totalling 24TB.
>> * Both of the arrays are on individually luks encrypted disks with
>> btrfs on top of the luks.
>> * Both have "defaults,autodefrag" turned on in fstab.
>> 
>> We're starting to see large pauses during constant backups of millions
>> of chunk files (using duplicacy backup) in the 24TB array.
>> 
>> Pauses sometimes take up to 20+ seconds in frequencies after every
>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>> consistently shows up as the blocking process/thread locking up
>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no >>>> disk
>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>> 
>> Please provide the "echo l > /proc/sysrq-trigger" output when such >>> pause
>> happens.
>> 
>> If you're using qgroup (may be enabled by things like snapper), it may
>> be the cause, as qgroup does its accounting when committing >>> transaction.
>> 
>> If one transaction is super large, it can cause such problem.
>> 
>> You can test if qgroup is enabled by:
>> 
>> # btrfs qgroup show -prce <mnt>
>> 
>> After doing some research around the internet, we've come to the
>> consideration above as described.  Unfortunately the official
>> documentation isn't clear on the following.
>> 
>> Official documentation URL -
>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>> 
>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>> talks about the reverse, from v2 to v1!
>> 
>> Just mount with "space_cache=v2".
>> 
>> 2. If we use space_cache=v2, is it indeed still the case that the
>> "btrfs" command will NOT work with the filesystem?
>> 
>> Why would you think "btrfs" won't work on a btrfs?
>> 
>> Thanks,
>> Qu
>> 
>> So will our
>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>> the btrfs command comes from btrfs-progs which is currently >>>> v5.4.1-2
>> amd64, is that correct?
>> 3. Any other ideas on how we can get rid of those annoying pauses with
>> large backups into the array?
>> 
>> Thanks in advance!
>> 
>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-15 16:40       ` DanglingPointer
@ 2021-07-15 22:13         ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-15 22:13 UTC (permalink / raw)
  To: DanglingPointer, linux-btrfs



On 2021/7/16 12:40 AM, DanglingPointer wrote:
> Hi Qu,
>
> Just updating here that setting the mount option "space_cache=v2" and
> "noatime" completely SOLVED the performance problem!
> Basically like night and day!
>
>
> These are my full fstab mount options...
>
> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>
>
> Perhaps defaulting the space_cache=v2 should be considered?

We're already considering that.

>  Why default
> to v1, what's the value of v1?

One of the problems in the past was the lack of write support in
btrfs-progs.

Now we're testing making it the default in mkfs.btrfs.
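
E.g., on a freshly created filesystem it can already be enabled at mkfs
time via the runtime features flag mentioned earlier, if your
btrfs-progs supports it (device name is a placeholder):

# mkfs.btrfs -R free-space-tree /dev/sdX1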

Thanks,
Qu

>
>
> So for conclusion, for large multi-terrabyte arrays (in my case RAID5s),
> setting space_cache=v2 and noatime massively increases performance and
> eliminates the large long pauses in frequent intervals by
> "btrfs-transacti" blocking all IO.
>
> Thanks Qu for your help!
>
>
>
> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>
>>
>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>> a) "echo l > /proc/sysrq-trigger"
>>>
>>> The backup finished today already unfortunately and we are unlikely to
>>> run it again until we get an outage to remount the array with the
>>> space_cache=v2 and noatime mount options.
>>> Thanks for the command, we'll definitely use it if/when it happens again
>>> on the next large migration of data.
>>
>> Just to avoid confusion, after that command, "dmesg" output is still
>> needed, as that's where sysrq put its output.
>>>
>>>
>>> b) "sudo btrfs qgroup show -prce" ........
>>>
>>> $ ERROR: can't list qgroups: quotas not enabled
>>>
>>> So looks like it isn't enabled.
>>
>> One less thing to bother.
>>>
>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes (Duplicacy
>>> backup defaults)
>>
>> Between 1~16MiB, thus tons of small files.
>>
>> Btrfs is not really good at handling tons of small files, as they
>> generate a lot of metadata.
>>
>> That may contribute to the hang.
>>
>>>
>>> What classifies as a transaction?
>>
>> It's a little complex.
>>
>> Technically it's a check point where before the checkpoint, all you see
>> is old data, after the checkpoint, all you see is new data.
>>
>> To end users, any data and metadata write will be included into one
>> transaction (with proper dependency handled).
>>
>> One way to finish (or commit) current transaction is to sync the fs,
>> using "sync" command (sync all filesystems).
>>
>>> Any/All writes done in a 30sec
>>> interval?
>>
>> This the default commit interval. Almost all fses will try to commit its
>> data/metadata to disk after a configurable interval.
>>
>> The default one is 30s. That's also one way to commit current
>> transaction.
>>
>>>   If 100 unique files were written in 30secs, is that 1
>>> transaction or 100 transactions?
>>
>> It depends. As things like syncfs() and subvolume/snapshot creation may
>> try to commit transaction.
>>
>> But without those special operations, just writing 100 unique files
>> using buffered write, it would only start one transaction, and when the
>> 30s interval get hit, the transaction will be committed to disk.
>>
>>>   Millions of files of the size range
>>> above were backed up.
>>
>> The amount of files may not force a transaction commit, if it doesn't
>> trigger enough memory pressure, or free space pressure.
>>
>> Anyway, the "echo l" sysrq would help us to locate what's taking so long
>> time.
>>
>>>
>>>
>>> c) "Just mount with "space_cache=v2""
>>>
>>> Ok so no need to "clear_cache" the v1 cache, right?
>>
>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>
>> Thus it doesn't help much.
>>
>> The only way to fully clear v1 cache is by using "btrfs check
>> --clear-space-cache v1" on a *unmounted* btrfs.
>>
>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>> outage....
>>
>> IMHO if you really want to test if v2 would help, you can just remount,
>> no need to wait for a break.
>>
>> Thanks,
>> Qu
>>>
>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2 >
>>> Thanks again for your help Qu!
>>>
>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>>> We're currently considering switching to "space_cache=v2" with noatime
>>>>> mount options for my lab server-workstations running RAID5.
>>>>
>>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>>
>>>>>
>>>>>   * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>>>     totalling 26TB.
>>>>>   * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>>>     totalling 24TB.
>>>>>   * Both of the arrays are on individually luks encrypted disks with
>>>>>     btrfs on top of the luks.
>>>>>   * Both have "defaults,autodefrag" turned on in fstab.
>>>>>
>>>>> We're starting to see large pauses during constant backups of millions
>>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>>
>>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>>> consistently shows up as the blocking process/thread locking up
>>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no
>>>>> disk
>>>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>>
>>>> Please provide the "echo l > /proc/sysrq-trigger" output when such
>>>> pause
>>>> happens.
>>>>
>>>> If you're using qgroup (may be enabled by things like snapper), it may
>>>> be the cause, as qgroup does its accounting when committing
>>>> transaction.
>>>>
>>>> If one transaction is super large, it can cause such problem.
>>>>
>>>> You can test if qgroup is enabled by:
>>>>
>>>> # btrfs qgroup show -prce <mnt>
>>>>
>>>>>
>>>>> After doing some research around the internet, we've come to the
>>>>> consideration above as described.  Unfortunately the official
>>>>> documentation isn't clear on the following.
>>>>>
>>>>> Official documentation URL -
>>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>>
>>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>>>     talks about the reverse, from v2 to v1!
>>>>
>>>> Just mount with "space_cache=v2".
>>>>
>>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>>     "btrfs" command will NOT work with the filesystem?
>>>>
>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>   So will our
>>>>>     "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>>>     the btrfs command comes from btrfs-progs which is currently
>>>>> v5.4.1-2
>>>>>     amd64, is that correct?
>>>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>>>     large backups into the array?
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> DP
>>>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-15 17:51       ` Joshua
@ 2021-07-16 12:42         ` DanglingPointer
  2021-07-16 12:59           ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: DanglingPointer @ 2021-07-16 12:42 UTC (permalink / raw)
  To: Joshua, linux-btrfs; +Cc: danglingpointerexception

Hi Joshua, on that system where you tried to run the 
"--clear-space-cache v1", when you gave up, did you continue using 
"space_cache=v2" on it?


Here are some more questions for anyone who can help educate us:

Why would someone want to clear the space_cache v1?

What's the value of clearing the previous version space_cache before 
using the new version?

Why "clear" and not just "delete"?  Won't "deleting" the whole previous 
space_cache files, blocks, whatever in the filesystem be faster then 
doing whatever "clear" does?

Am I missing out on something by not attempting to clear the previous 
version space_cache before using the new v2 version?


On 16/7/21 3:51 am, Joshua wrote:
> Just as a point of data, I have a 96 TB array with RAID1 data, and RAID1C3 metadata.
>
> I made the switch to space_cache=v2 some time ago, and I remember it made a huge difference when I did so!
> (It was RAID1 metadata at the time, as RAID1C3 was not available at the time.)
>
>
> However, I also tried a check with '--clear-space-cache v1' at the time, and after waiting a literal whole day without it completing, I gave up, canceled it, and put it back into production.  Is a --clear-space-cache v1 operation expected to take so long on such a large file system?
>
> Thanks!
> --Joshua Villwock
>
>
>
> July 15, 2021 9:40 AM, "DanglingPointer" <danglingpointerexception@gmail.com> wrote:
>
>> Hi Qu,
>>
>> Just updating here that setting the mount option "space_cache=v2" and "noatime" completely SOLVED
>> the performance problem!
>> Basically like night and day!
>>
>> These are my full fstab mount options...
>>
>> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>>
>> Perhaps defaulting the space_cache=v2 should be considered?  Why default to v1, what's the value of
>> v1?
>>
>> So for conclusion, for large multi-terrabyte arrays (in my case RAID5s), setting space_cache=v2 and
>> noatime massively increases performance and eliminates the large long pauses in frequent intervals
>> by "btrfs-transacti" blocking all IO.
>>
>> Thanks Qu for your help!
>>
>> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>
>>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>>> a) "echo l > /proc/sysrq-trigger"
>>>>
>>>> The backup finished today already unfortunately and we are unlikely to
>>>> run it again until we get an outage to remount the array with the
>>>> space_cache=v2 and noatime mount options.
>>>> Thanks for the command, we'll definitely use it if/when it happens again
>>>> on the next large migration of data.
>>> Just to avoid confusion, after that command, "dmesg" output is still
>>> needed, as that's where sysrq put its output.
>>>> b) "sudo btrfs qgroup show -prce" ........
>>>>
>>>> $ ERROR: can't list qgroups: quotas not enabled
>>>>
>>>> So looks like it isn't enabled.
>>> One less thing to bother.
>>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes (Duplicacy
>>>> backup defaults)
>>> Between 1~16MiB, thus tons of small files.
>>>
>>> Btrfs is not really good at handling tons of small files, as they
>>> generate a lot of metadata.
>>>
>>> That may contribute to the hang.
>>>
>>>> What classifies as a transaction?
>>> It's a little complex.
>>>
>>> Technically it's a check point where before the checkpoint, all you see
>>> is old data, after the checkpoint, all you see is new data.
>>>
>>> To end users, any data and metadata write will be included into one
>>> transaction (with proper dependency handled).
>>>
>>> One way to finish (or commit) current transaction is to sync the fs,
>>> using "sync" command (sync all filesystems).
>>>
>>>> Any/All writes done in a 30sec
>>>> interval?
>>> This the default commit interval. Almost all fses will try to commit its
>>> data/metadata to disk after a configurable interval.
>>>
>>> The default one is 30s. That's also one way to commit current > transaction.
>>>
>>>> If 100 unique files were written in 30secs, is that 1
>>>> transaction or 100 transactions?
>>> It depends. As things like syncfs() and subvolume/snapshot creation may
>>> try to commit transaction.
>>>
>>> But without those special operations, just writing 100 unique files
>>> using buffered write, it would only start one transaction, and when the
>>> 30s interval get hit, the transaction will be committed to disk.
>>>
>>>> Millions of files of the size range
>>>> above were backed up.
>>> The amount of files may not force a transaction commit, if it doesn't
>>> trigger enough memory pressure, or free space pressure.
>>>
>>> Anyway, the "echo l" sysrq would help us to locate what's taking so long
>>> time.
>>>
>>>> c) "Just mount with "space_cache=v2""
>>>>
>>>> Ok so no need to "clear_cache" the v1 cache, right?
>>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>>
>>> Thus it doesn't help much.
>>>
>>> The only way to fully clear v1 cache is by using "btrfs check
>>> --clear-space-cache v1" on a *unmounted* btrfs.
>>>
>>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>>> outage....
>>> IMHO if you really want to test if v2 would help, you can just remount,
>>> no need to wait for a break.
>>>
>>> Thanks,
>>> Qu
>>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime  0  2 >
>>>> Thanks again for your help Qu!
>>>>
>>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>> We're currently considering switching to "space_cache=v2" with noatime
>>> mount options for my lab server-workstations running RAID5.
>>>
>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>
>>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>> totalling 26TB.
>>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>> totalling 24TB.
>>> * Both of the arrays are on individually luks encrypted disks with
>>> btrfs on top of the luks.
>>> * Both have "defaults,autodefrag" turned on in fstab.
>>>
>>> We're starting to see large pauses during constant backups of millions
>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>
>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>> consistently shows up as the blocking process/thread locking up
>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no >>>> disk
>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>
>>> Please provide the "echo l > /proc/sysrq-trigger" output when such >>> pause
>>> happens.
>>>
>>> If you're using qgroup (may be enabled by things like snapper), it may
>>> be the cause, as qgroup does its accounting when committing >>> transaction.
>>>
>>> If one transaction is super large, it can cause such problem.
>>>
>>> You can test if qgroup is enabled by:
>>>
>>> # btrfs qgroup show -prce <mnt>
>>>
>>> After doing some research around the internet, we've come to the
>>> consideration above as described.  Unfortunately the official
>>> documentation isn't clear on the following.
>>>
>>> Official documentation URL -
>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>
>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>> talks about the reverse, from v2 to v1!
>>>
>>> Just mount with "space_cache=v2".
>>>
>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>> "btrfs" command will NOT work with the filesystem?
>>>
>>> Why would you think "btrfs" won't work on a btrfs?
>>>
>>> Thanks,
>>> Qu
>>>
>>> So will our
>>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>> the btrfs command comes from btrfs-progs which is currently >>>> v5.4.1-2
>>> amd64, is that correct?
>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>> large backups into the array?
>>>
>>> Thanks in advance!
>>>
>>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-16 12:42         ` DanglingPointer
@ 2021-07-16 12:59           ` Qu Wenruo
  2021-07-16 13:23             ` DanglingPointer
  2021-07-16 20:33             ` Joshua Villwock
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-16 12:59 UTC (permalink / raw)
  To: DanglingPointer, Joshua, linux-btrfs



On 2021/7/16 8:42 PM, DanglingPointer wrote:
> Hi Joshua, on that system where you tried to run the
> "--clear-space-cache v1", when you gave up, did you continue using
> "space_cache=v2" on it?
>
>
> Here are some more questions to add for anyone who can help educate us:...
>
> Why would someone want to clear the space_cache v1?

Because it takes up space, and otherwise we won't be able to delete
those files.

>
> What's the value of clearing the previous version space_cache before
> using the new version?

AFAIK, it just saves some space and reduces the root tree size.

The v1 cache exists as special files inside the root tree (not
accessible by end users).

Their existence takes up space and fragments the filesystem (one file
is normally around 64K, and there is one such v1 file for each block
group, so you can see how many small files that adds up to).

>
> Why "clear" and not just "delete"?  Won't "deleting" the whole previous
> space_cache files, blocks, whatever in the filesystem be faster then
> doing whatever "clear" does?

Just bad naming, and probably from me.

Indeed "delete" would be more proper here.

And we are indeed deleting them in "btrfs check --clear-space-cache v1";
that's also why it's so slow.

If you have 20T of used space, that would be around 20,000 block groups,
meaning 20,000 64K files inside the root tree. They are deleted one by
one, and each deletion causes a new transaction, so no wonder it is
painfully slow.
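
As a rough back-of-the-envelope check, assuming ~1GiB data block groups:

# echo $(( 20 * 1024 )) "block groups for ~20T of data"
20480 block groups for ~20T of data
# echo $(( 20 * 1024 * 64 / 1024 )) "MiB of v1 cache files, one transaction each"
1280 MiB of v1 cache files, one transaction each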

>
> Am I missing out on something by not attempting to clear the previous
> version space_cache before using the new v2 version?

Apart from some wasted space, you're completely fine to skip the slow
deletion.

This also means I should enhance the deletion process to avoid creating
so many transactions...

Thanks,
Qu

>
>
> On 16/7/21 3:51 am, Joshua wrote:
>> Just as a point of data, I have a 96 TB array with RAID1 data, and
>> RAID1C3 metadata.
>>
>> I made the switch to space_cache=v2 some time ago, and I remember it
>> made a huge difference when I did so!
>> (It was RAID1 metadata at the time, as RAID1C3 was not available at
>> the time.)
>>
>>
>> However, I also tried a check with '--clear-space-cache v1' at the
>> time, and after waiting a literal whole day without it completing, I
>> gave up, canceled it, and put it back into production.  Is a
>> --clear-space-cache v1 operation expected to take so long on such a
>> large file system?
>>
>> Thanks!
>> --Joshua Villwock
>>
>>
>>
>> July 15, 2021 9:40 AM, "DanglingPointer"
>> <danglingpointerexception@gmail.com> wrote:
>>
>>> Hi Qu,
>>>
>>> Just updating here that setting the mount option "space_cache=v2" and
>>> "noatime" completely SOLVED
>>> the performance problem!
>>> Basically like night and day!
>>>
>>> These are my full fstab mount options...
>>>
>>> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>>>
>>> Perhaps defaulting the space_cache=v2 should be considered?  Why
>>> default to v1, what's the value of
>>> v1?
>>>
>>> So for conclusion, for large multi-terrabyte arrays (in my case
>>> RAID5s), setting space_cache=v2 and
>>> noatime massively increases performance and eliminates the large long
>>> pauses in frequent intervals
>>> by "btrfs-transacti" blocking all IO.
>>>
>>> Thanks Qu for your help!
>>>
>>> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>>
>>>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>>>> a) "echo l > /proc/sysrq-trigger"
>>>>>
>>>>> The backup finished today already unfortunately and we are unlikely to
>>>>> run it again until we get an outage to remount the array with the
>>>>> space_cache=v2 and noatime mount options.
>>>>> Thanks for the command, we'll definitely use it if/when it happens
>>>>> again
>>>>> on the next large migration of data.
>>>> Just to avoid confusion, after that command, "dmesg" output is still
>>>> needed, as that's where sysrq put its output.
>>>>> b) "sudo btrfs qgroup show -prce" ........
>>>>>
>>>>> $ ERROR: can't list qgroups: quotas not enabled
>>>>>
>>>>> So looks like it isn't enabled.
>>>> One less thing to bother.
>>>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes
>>>>> (Duplicacy
>>>>> backup defaults)
>>>> Between 1~16MiB, thus tons of small files.
>>>>
>>>> Btrfs is not really good at handling tons of small files, as they
>>>> generate a lot of metadata.
>>>>
>>>> That may contribute to the hang.
>>>>
>>>>> What classifies as a transaction?
>>>> It's a little complex.
>>>>
>>>> Technically it's a check point where before the checkpoint, all you see
>>>> is old data, after the checkpoint, all you see is new data.
>>>>
>>>> To end users, any data and metadata write will be included into one
>>>> transaction (with proper dependency handled).
>>>>
>>>> One way to finish (or commit) current transaction is to sync the fs,
>>>> using "sync" command (sync all filesystems).
>>>>
>>>>> Any/All writes done in a 30sec
>>>>> interval?
>>>> This the default commit interval. Almost all fses will try to commit
>>>> its
>>>> data/metadata to disk after a configurable interval.
>>>>
>>>> The default one is 30s. That's also one way to commit current >
>>>> transaction.
>>>>
>>>>> If 100 unique files were written in 30secs, is that 1
>>>>> transaction or 100 transactions?
>>>> It depends. As things like syncfs() and subvolume/snapshot creation may
>>>> try to commit transaction.
>>>>
>>>> But without those special operations, just writing 100 unique files
>>>> using buffered write, it would only start one transaction, and when the
>>>> 30s interval get hit, the transaction will be committed to disk.
>>>>
>>>>> Millions of files of the size range
>>>>> above were backed up.
>>>> The amount of files may not force a transaction commit, if it doesn't
>>>> trigger enough memory pressure, or free space pressure.
>>>>
>>>> Anyway, the "echo l" sysrq would help us to locate what's taking so
>>>> long
>>>> time.
>>>>
>>>>> c) "Just mount with "space_cache=v2""
>>>>>
>>>>> Ok so no need to "clear_cache" the v1 cache, right?
>>>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>>>
>>>> Thus it doesn't help much.
>>>>
>>>> The only way to fully clear v1 cache is by using "btrfs check
>>>> --clear-space-cache v1" on a *unmounted* btrfs.
>>>>
>>>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>>>> outage....
>>>> IMHO if you really want to test if v2 would help, you can just remount,
>>>> no need to wait for a break.
>>>>
>>>> Thanks,
>>>> Qu
>>>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime
>>>>> 0  2 >
>>>>> Thanks again for your help Qu!
>>>>>
>>>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>> We're currently considering switching to "space_cache=v2" with noatime
>>>> mount options for my lab server-workstations running RAID5.
>>>>
>>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>>
>>>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>> totalling 26TB.
>>>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>> totalling 24TB.
>>>> * Both of the arrays are on individually luks encrypted disks with
>>>> btrfs on top of the luks.
>>>> * Both have "defaults,autodefrag" turned on in fstab.
>>>>
>>>> We're starting to see large pauses during constant backups of millions
>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>
>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>> consistently shows up as the blocking process/thread locking up
>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no
>>>> >>>> disk
>>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>>
>>>> Please provide the "echo l > /proc/sysrq-trigger" output when such
>>>> >>> pause
>>>> happens.
>>>>
>>>> If you're using qgroup (may be enabled by things like snapper), it may
>>>> be the cause, as qgroup does its accounting when committing >>>
>>>> transaction.
>>>>
>>>> If one transaction is super large, it can cause such problem.
>>>>
>>>> You can test if qgroup is enabled by:
>>>>
>>>> # btrfs qgroup show -prce <mnt>
>>>>
>>>> After doing some research around the internet, we've come to the
>>>> consideration above as described.  Unfortunately the official
>>>> documentation isn't clear on the following.
>>>>
>>>> Official documentation URL -
>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>
>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>> talks about the reverse, from v2 to v1!
>>>>
>>>> Just mount with "space_cache=v2".
>>>>
>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>> "btrfs" command will NOT work with the filesystem?
>>>>
>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>> So will our
>>>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>> the btrfs command comes from btrfs-progs which is currently >>>>
>>>> v5.4.1-2
>>>> amd64, is that correct?
>>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>> large backups into the array?
>>>>
>>>> Thanks in advance!
>>>>
>>>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-16 12:59           ` Qu Wenruo
@ 2021-07-16 13:23             ` DanglingPointer
  2021-07-16 20:33             ` Joshua Villwock
  1 sibling, 0 replies; 16+ messages in thread
From: DanglingPointer @ 2021-07-16 13:23 UTC (permalink / raw)
  To: Qu Wenruo, Joshua, linux-btrfs

Thanks Qu for the comprehensive response!


On 16/7/21 10:59 pm, Qu Wenruo wrote:
>
>
> On 2021/7/16 下午8:42, DanglingPointer wrote:
>> Hi Joshua, on that system where you tried to run the
>> "--clear-space-cache v1", when you gave up, did you continue using
>> "space_cache=v2" on it?
>>
>>
>> Here are some more questions to add for anyone who can help educate 
>> us:...
>>
>> Why would someone want to clear the space_cache v1?
>
> Because it takes up space and we won't be able to delete them.
>
>>
>> What's the value of clearing the previous version space_cache before
>> using the new version?
>
> AFAIK, just save some space and reduce the root tree size.
>
> The v1 cache exists as special files inside root tree (not accessible by
> end users).
>
> Their existence takes up space and fragments the file system (one file
> is normally around 64K, and we have one such v1 file for each block
> group, you can see how many small files it has now)
>
>>
>> Why "clear" and not just "delete"?  Won't "deleting" the whole previous
>> space_cache files, blocks, whatever in the filesystem be faster then
>> doing whatever "clear" does?
>
> Just bad naming, and properly from me.
>
> Indeed "delete" would be more proper here.
>
> And we're indeed deleting them in "btrfs check --clear-space-cache v1",
> that's also why it's so slow.
>
> If you have 20T used space, then the it would be around 20,000 block
> groups, meaning 20,000 64K files inside root tree, and deleting them one
> by one, and each deletion will cause a new transaction, no wonder it
> will be slow to hell.
>
>>
>> Am I missing out on something by not attempting to clear the previous
>> version space_cache before using the new v2 version?
>
> Except some wasted space, you're completely fine to skip the slow 
> deletion.
>
> This also means, I should enhance the deletion process to avoid too many
> transactions...
>
> Thanks,
> Qu
>
>>
>>
>> On 16/7/21 3:51 am, Joshua wrote:
>>> Just as a point of data, I have a 96 TB array with RAID1 data, and
>>> RAID1C3 metadata.
>>>
>>> I made the switch to space_cache=v2 some time ago, and I remember it
>>> made a huge difference when I did so!
>>> (It was RAID1 metadata at the time, as RAID1C3 was not available at
>>> the time.)
>>>
>>>
>>> However, I also tried a check with '--clear-space-cache v1' at the
>>> time, and after waiting a literal whole day without it completing, I
>>> gave up, canceled it, and put it back into production.  Is a
>>> --clear-space-cache v1 operation expected to take so long on such a
>>> large file system?
>>>
>>> Thanks!
>>> --Joshua Villwock
>>>
>>>
>>>
>>> July 15, 2021 9:40 AM, "DanglingPointer"
>>> <danglingpointerexception@gmail.com> wrote:
>>>
>>>> Hi Qu,
>>>>
>>>> Just updating here that setting the mount option "space_cache=v2" and
>>>> "noatime" completely SOLVED
>>>> the performance problem!
>>>> Basically like night and day!
>>>>
>>>> These are my full fstab mount options...
>>>>
>>>> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>>>>
>>>> Perhaps defaulting the space_cache=v2 should be considered? Why
>>>> default to v1, what's the value of
>>>> v1?
>>>>
>>>> So for conclusion, for large multi-terrabyte arrays (in my case
>>>> RAID5s), setting space_cache=v2 and
>>>> noatime massively increases performance and eliminates the large long
>>>> pauses in frequent intervals
>>>> by "btrfs-transacti" blocking all IO.
>>>>
>>>> Thanks Qu for your help!
>>>>
>>>> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>>>
>>>>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>>>>> a) "echo l > /proc/sysrq-trigger"
>>>>>>
>>>>>> The backup finished today already unfortunately and we are 
>>>>>> unlikely to
>>>>>> run it again until we get an outage to remount the array with the
>>>>>> space_cache=v2 and noatime mount options.
>>>>>> Thanks for the command, we'll definitely use it if/when it happens
>>>>>> again
>>>>>> on the next large migration of data.
>>>>> Just to avoid confusion, after that command, "dmesg" output is still
>>>>> needed, as that's where sysrq put its output.
>>>>>> b) "sudo btrfs qgroup show -prce" ........
>>>>>>
>>>>>> $ ERROR: can't list qgroups: quotas not enabled
>>>>>>
>>>>>> So looks like it isn't enabled.
>>>>> One less thing to bother.
>>>>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes
>>>>>> (Duplicacy
>>>>>> backup defaults)
>>>>> Between 1~16MiB, thus tons of small files.
>>>>>
>>>>> Btrfs is not really good at handling tons of small files, as they
>>>>> generate a lot of metadata.
>>>>>
>>>>> That may contribute to the hang.
>>>>>
>>>>>> What classifies as a transaction?
>>>>> It's a little complex.
>>>>>
>>>>> Technically it's a check point where before the checkpoint, all 
>>>>> you see
>>>>> is old data, after the checkpoint, all you see is new data.
>>>>>
>>>>> To end users, any data and metadata write will be included into one
>>>>> transaction (with proper dependency handled).
>>>>>
>>>>> One way to finish (or commit) current transaction is to sync the fs,
>>>>> using "sync" command (sync all filesystems).
>>>>>
>>>>>> Any/All writes done in a 30sec
>>>>>> interval?
>>>>> This the default commit interval. Almost all fses will try to commit
>>>>> its
>>>>> data/metadata to disk after a configurable interval.
>>>>>
>>>>> The default one is 30s. That's also one way to commit current >
>>>>> transaction.
>>>>>
>>>>>> If 100 unique files were written in 30secs, is that 1
>>>>>> transaction or 100 transactions?
>>>>> It depends. As things like syncfs() and subvolume/snapshot 
>>>>> creation may
>>>>> try to commit transaction.
>>>>>
>>>>> But without those special operations, just writing 100 unique files
>>>>> using buffered write, it would only start one transaction, and 
>>>>> when the
>>>>> 30s interval get hit, the transaction will be committed to disk.
>>>>>
>>>>>> Millions of files of the size range
>>>>>> above were backed up.
>>>>> The amount of files may not force a transaction commit, if it doesn't
>>>>> trigger enough memory pressure, or free space pressure.
>>>>>
>>>>> Anyway, the "echo l" sysrq would help us to locate what's taking so
>>>>> long
>>>>> time.
>>>>>
>>>>>> c) "Just mount with "space_cache=v2""
>>>>>>
>>>>>> Ok so no need to "clear_cache" the v1 cache, right?
>>>>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>>>>
>>>>> Thus it doesn't help much.
>>>>>
>>>>> The only way to fully clear v1 cache is by using "btrfs check
>>>>> --clear-space-cache v1" on a *unmounted* btrfs.
>>>>>
>>>>>> I wrote this in the fstab but hadn't remounted yet until I can 
>>>>>> get an
>>>>>> outage....
>>>>> IMHO if you really want to test if v2 would help, you can just 
>>>>> remount,
>>>>> no need to wait for a break.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime
>>>>>> 0  2 >
>>>>>> Thanks again for your help Qu!
>>>>>>
>>>>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>>> We're currently considering switching to "space_cache=v2" with 
>>>>> noatime
>>>>> mount options for my lab server-workstations running RAID5.
>>>>>
>>>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>>>
>>>>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>>> totalling 26TB.
>>>>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>>> totalling 24TB.
>>>>> * Both of the arrays are on individually luks encrypted disks with
>>>>> btrfs on top of the luks.
>>>>> * Both have "defaults,autodefrag" turned on in fstab.
>>>>>
>>>>> We're starting to see large pauses during constant backups of 
>>>>> millions
>>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>>
>>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>>> consistently shows up as the blocking process/thread locking up
>>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no
>>>>> >>>> disk
>>>>> or btrfs errors recorded.  scrub last finished yesterday 
>>>>> successfully.
>>>>>
>>>>> Please provide the "echo l > /proc/sysrq-trigger" output when such
>>>>> >>> pause
>>>>> happens.
>>>>>
>>>>> If you're using qgroup (may be enabled by things like snapper), it 
>>>>> may
>>>>> be the cause, as qgroup does its accounting when committing >>>
>>>>> transaction.
>>>>>
>>>>> If one transaction is super large, it can cause such problem.
>>>>>
>>>>> You can test if qgroup is enabled by:
>>>>>
>>>>> # btrfs qgroup show -prce <mnt>
>>>>>
>>>>> After doing some research around the internet, we've come to the
>>>>> consideration above as described.  Unfortunately the official
>>>>> documentation isn't clear on the following.
>>>>>
>>>>> Official documentation URL -
>>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>>
>>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>>> talks about the reverse, from v2 to v1!
>>>>>
>>>>> Just mount with "space_cache=v2".
>>>>>
>>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>> "btrfs" command will NOT work with the filesystem?
>>>>>
>>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>> So will our
>>>>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>>> the btrfs command comes from btrfs-progs which is currently >>>>
>>>>> v5.4.1-2
>>>>> amd64, is that correct?
>>>>> 3. Any other ideas on how we can get rid of those annoying pauses 
>>>>> with
>>>>> large backups into the array?
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-16 12:59           ` Qu Wenruo
  2021-07-16 13:23             ` DanglingPointer
@ 2021-07-16 20:33             ` Joshua Villwock
  2021-07-16 23:00               ` Qu Wenruo
  1 sibling, 1 reply; 16+ messages in thread
From: Joshua Villwock @ 2021-07-16 20:33 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Qu, thanks indeed for your responses!

> On Jul 16, 2021, at 5:59 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 
> 
> 
>> On 2021/7/16 下午8:42, DanglingPointer wrote:
>> Hi Joshua, on that system where you tried to run the
>> "--clear-space-cache v1", when you gave up, did you continue using
>> "space_cache=v2" on it?
>> 
>> 
>> Here are some more questions to add for anyone who can help educate us:...
>> 
>> Why would someone want to clear the space_cache v1?
> 
> Because it takes up space and we won't be able to delete them.
> 
>> 
>> What's the value of clearing the previous version space_cache before
>> using the new version?
> 
> AFAIK, just save some space and reduce the root tree size.
> 
> The v1 cache exists as special files inside root tree (not accessible by
> end users).
> 
> Their existence takes up space and fragments the file system (one file
> is normally around 64K, and we have one such v1 file for each block
> group, you can see how many small files it has now)
> 
>> 
>> Why "clear" and not just "delete"?  Won't "deleting" the whole previous
>> space_cache files, blocks, whatever in the filesystem be faster then
>> doing whatever "clear" does?
> 
> Just bad naming, and properly from me.
> 
> Indeed "delete" would be more proper here.
> 
> And we're indeed deleting them in "btrfs check --clear-space-cache v1",
> that's also why it's so slow.
> 
> If you have 20T used space, then the it would be around 20,000 block
> groups, meaning 20,000 64K files inside root tree, and deleting them one
> by one, and each deletion will cause a new transaction, no wonder it
> will be slow to hell.

If the v1 space cache exists as special files in the root tree, could those be a contributing factor to the issue I was having in the past, where mounts took so long that the system dropped to recovery mode?

I discovered that running btrfs fi defrag on each of my subvolumes (defragmenting the subvolume tree/extent tree metadata, not the files) every couple of weeks reduces the mount time enough that I don’t run into that issue on my massive FS anymore.
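
What that periodic run looks like, roughly (a sketch only, assuming the
array is mounted at /mnt/array as the top-level subvolume and that the
subvolume names contain no spaces; without -r, "btrfs fi defrag" only
touches the tree metadata of the given path, not the file contents):

    # defragment the tree metadata of the top level and of each subvolume
    btrfs fi defrag /mnt/array
    for sub in $(btrfs subvolume list -o /mnt/array | awk '{print $NF}'); do
        btrfs fi defrag "/mnt/array/$sub"
    done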

Could actually getting rid of those thousands of now-useless v1 entries help reduce mount times on such a massive FS, or would that be completely unrelated?

Thanks,
-Joshua

>> 
>> Am I missing out on something by not attempting to clear the previous
>> version space_cache before using the new v2 version?
> 
> Except some wasted space, you're completely fine to skip the slow deletion.
> 
> This also means, I should enhance the deletion process to avoid too many
> transactions...
> 
> Thanks,
> Qu
> 
>> 
>> 
>>> On 16/7/21 3:51 am, Joshua wrote:
>>> Just as a point of data, I have a 96 TB array with RAID1 data, and
>>> RAID1C3 metadata.
>>> 
>>> I made the switch to space_cache=v2 some time ago, and I remember it
>>> made a huge difference when I did so!
>>> (It was RAID1 metadata at the time, as RAID1C3 was not available at
>>> the time.)
>>> 
>>> 
>>> However, I also tried a check with '--clear-space-cache v1' at the
>>> time, and after waiting a literal whole day without it completing, I
>>> gave up, canceled it, and put it back into production.  Is a
>>> --clear-space-cache v1 operation expected to take so long on such a
>>> large file system?
>>> 
>>> Thanks!
>>> --Joshua Villwock
>>> 
>>> 
>>> 
>>> July 15, 2021 9:40 AM, "DanglingPointer"
>>> <danglingpointerexception@gmail.com> wrote:
>>> 
>>>> Hi Qu,
>>>> 
>>>> Just updating here that setting the mount option "space_cache=v2" and
>>>> "noatime" completely SOLVED
>>>> the performance problem!
>>>> Basically like night and day!
>>>> 
>>>> These are my full fstab mount options...
>>>> 
>>>> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>>>> 
>>>> Perhaps defaulting the space_cache=v2 should be considered?  Why
>>>> default to v1, what's the value of
>>>> v1?
>>>> 
>>>> So for conclusion, for large multi-terrabyte arrays (in my case
>>>> RAID5s), setting space_cache=v2 and
>>>> noatime massively increases performance and eliminates the large long
>>>> pauses in frequent intervals
>>>> by "btrfs-transacti" blocking all IO.
>>>> 
>>>> Thanks Qu for your help!
>>>> 
>>>> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>>> 
>>>>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>>>>> a) "echo l > /proc/sysrq-trigger"
>>>>>> 
>>>>>> The backup finished today already unfortunately and we are unlikely to
>>>>>> run it again until we get an outage to remount the array with the
>>>>>> space_cache=v2 and noatime mount options.
>>>>>> Thanks for the command, we'll definitely use it if/when it happens
>>>>>> again
>>>>>> on the next large migration of data.
>>>>> Just to avoid confusion, after that command, "dmesg" output is still
>>>>> needed, as that's where sysrq put its output.
>>>>>> b) "sudo btrfs qgroup show -prce" ........
>>>>>> 
>>>>>> $ ERROR: can't list qgroups: quotas not enabled
>>>>>> 
>>>>>> So looks like it isn't enabled.
>>>>> One less thing to bother.
>>>>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes
>>>>>> (Duplicacy
>>>>>> backup defaults)
>>>>> Between 1~16MiB, thus tons of small files.
>>>>> 
>>>>> Btrfs is not really good at handling tons of small files, as they
>>>>> generate a lot of metadata.
>>>>> 
>>>>> That may contribute to the hang.
>>>>> 
>>>>>> What classifies as a transaction?
>>>>> It's a little complex.
>>>>> 
>>>>> Technically it's a check point where before the checkpoint, all you see
>>>>> is old data, after the checkpoint, all you see is new data.
>>>>> 
>>>>> To end users, any data and metadata write will be included into one
>>>>> transaction (with proper dependency handled).
>>>>> 
>>>>> One way to finish (or commit) current transaction is to sync the fs,
>>>>> using "sync" command (sync all filesystems).
>>>>> 
>>>>>> Any/All writes done in a 30sec
>>>>>> interval?
>>>>> This the default commit interval. Almost all fses will try to commit
>>>>> its
>>>>> data/metadata to disk after a configurable interval.
>>>>> 
>>>>> The default one is 30s. That's also one way to commit current >
>>>>> transaction.
>>>>> 
>>>>>> If 100 unique files were written in 30secs, is that 1
>>>>>> transaction or 100 transactions?
>>>>> It depends. As things like syncfs() and subvolume/snapshot creation may
>>>>> try to commit transaction.
>>>>> 
>>>>> But without those special operations, just writing 100 unique files
>>>>> using buffered write, it would only start one transaction, and when the
>>>>> 30s interval get hit, the transaction will be committed to disk.
>>>>> 
>>>>>> Millions of files of the size range
>>>>>> above were backed up.
>>>>> The amount of files may not force a transaction commit, if it doesn't
>>>>> trigger enough memory pressure, or free space pressure.
>>>>> 
>>>>> Anyway, the "echo l" sysrq would help us to locate what's taking so
>>>>> long
>>>>> time.
>>>>> 
>>>>>> c) "Just mount with "space_cache=v2""
>>>>>> 
>>>>>> Ok so no need to "clear_cache" the v1 cache, right?
>>>>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>>>> 
>>>>> Thus it doesn't help much.
>>>>> 
>>>>> The only way to fully clear v1 cache is by using "btrfs check
>>>>> --clear-space-cache v1" on a *unmounted* btrfs.
>>>>> 
>>>>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>>>>> outage....
>>>>> IMHO if you really want to test if v2 would help, you can just remount,
>>>>> no need to wait for a break.
>>>>> 
>>>>> Thanks,
>>>>> Qu
>>>>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime
>>>>>> 0  2 >
>>>>>> Thanks again for your help Qu!
>>>>>> 
>>>>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>>> We're currently considering switching to "space_cache=v2" with noatime
>>>>> mount options for my lab server-workstations running RAID5.
>>>>> 
>>>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>>> 
>>>>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>>> totalling 26TB.
>>>>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>>> totalling 24TB.
>>>>> * Both of the arrays are on individually luks encrypted disks with
>>>>> btrfs on top of the luks.
>>>>> * Both have "defaults,autodefrag" turned on in fstab.
>>>>> 
>>>>> We're starting to see large pauses during constant backups of millions
>>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>> 
>>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>>> consistently shows up as the blocking process/thread locking up
>>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no
>>>>> >>>> disk
>>>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>>> 
>>>>> Please provide the "echo l > /proc/sysrq-trigger" output when such
>>>>> >>> pause
>>>>> happens.
>>>>> 
>>>>> If you're using qgroup (may be enabled by things like snapper), it may
>>>>> be the cause, as qgroup does its accounting when committing >>>
>>>>> transaction.
>>>>> 
>>>>> If one transaction is super large, it can cause such problem.
>>>>> 
>>>>> You can test if qgroup is enabled by:
>>>>> 
>>>>> # btrfs qgroup show -prce <mnt>
>>>>> 
>>>>> After doing some research around the internet, we've come to the
>>>>> consideration above as described.  Unfortunately the official
>>>>> documentation isn't clear on the following.
>>>>> 
>>>>> Official documentation URL -
>>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>> 
>>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>>> talks about the reverse, from v2 to v1!
>>>>> 
>>>>> Just mount with "space_cache=v2".
>>>>> 
>>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>> "btrfs" command will NOT work with the filesystem?
>>>>> 
>>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>> 
>>>>> Thanks,
>>>>> Qu
>>>>> 
>>>>> So will our
>>>>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>>> the btrfs command comes from btrfs-progs which is currently >>>>
>>>>> v5.4.1-2
>>>>> amd64, is that correct?
>>>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>>> large backups into the array?
>>>>> 
>>>>> Thanks in advance!
>>>>> 
>>>>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: migrating to space_cache=2 and btrfs userspace commands
  2021-07-16 20:33             ` Joshua Villwock
@ 2021-07-16 23:00               ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-07-16 23:00 UTC (permalink / raw)
  To: Joshua Villwock; +Cc: linux-btrfs



On 2021/7/17 上午4:33, Joshua Villwock wrote:
> Qu, thanks indeed for your responses!
>
>> On Jul 16, 2021, at 5:59 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>> 
>>
>>> On 2021/7/16 下午8:42, DanglingPointer wrote:
>>> Hi Joshua, on that system where you tried to run the
>>> "--clear-space-cache v1", when you gave up, did you continue using
>>> "space_cache=v2" on it?
>>>
>>>
>>> Here are some more questions to add for anyone who can help educate us:...
>>>
>>> Why would someone want to clear the space_cache v1?
>>
>> Because it takes up space and we won't be able to delete them.
>>
>>>
>>> What's the value of clearing the previous version space_cache before
>>> using the new version?
>>
>> AFAIK, just save some space and reduce the root tree size.
>>
>> The v1 cache exists as special files inside root tree (not accessible by
>> end users).
>>
>> Their existence takes up space and fragments the file system (one file
>> is normally around 64K, and we have one such v1 file for each block
>> group, you can see how many small files it has now)
>>
>>>
>>> Why "clear" and not just "delete"?  Won't "deleting" the whole previous
>>> space_cache files, blocks, whatever in the filesystem be faster then
>>> doing whatever "clear" does?
>>
>> Just bad naming, and properly from me.
>>
>> Indeed "delete" would be more proper here.
>>
>> And we're indeed deleting them in "btrfs check --clear-space-cache v1",
>> that's also why it's so slow.
>>
>> If you have 20T used space, then the it would be around 20,000 block
>> groups, meaning 20,000 64K files inside root tree, and deleting them one
>> by one, and each deletion will cause a new transaction, no wonder it
>> will be slow to hell.
>
> If v1 space cache exists as special files in the root tree, could those be a contributing factor to the issue I was having in the past with mounts taking too long and dropping to recovery mode?

That's another known issue.

It's the block group items in the extent tree.

I have proposed a skinny block group tree to greatly reduce the time
needed to mount a large fs.

But unfortunately it's not yet merged, and even once it is merged, you
will still need to do a lengthy conversion to the new format.

>
> I discovered running btrfs fi defrag on each of my subvolumes (for the subvolume tree/extent tree, not the files) every couple weeks has reduced the mount time enough I don’t run into that issue on my massive FS anymore.

Yes, defrag reduces the size of the extent tree, but that is not good
enough, especially when you have hundreds of thousands of block groups:
every mount needs to find the block group items scattered around the
large extent tree, and that takes quite some time.
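
A rough way to see how many block group items a mount has to walk (a
sketch only; /dev/mapper/array1 is a placeholder, and dump-tree reads
the device directly, so do this on an unmounted or idle filesystem to
get a consistent count):

    # count BLOCK_GROUP_ITEM keys in the extent tree
    btrfs inspect-internal dump-tree -t extent /dev/mapper/array1 \
        | grep -c BLOCK_GROUP_ITEM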

>
> Could actually getting rid of those thousands of now-useless v1 entries help reduce mount times on such a massive FS, or would that be completely unrelated?

It can help, but it is not the root cause.
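
If you do decide to drop the stale v1 entries despite the slow
deletion, the sequence is roughly as follows (device and mount point
are placeholders):

    umount /mnt/array
    # delete the v1 free-space cache files; this can take a very long
    # time on a filesystem with tens of thousands of block groups
    btrfs check --clear-space-cache v1 /dev/mapper/array1
    # from then on, mount with the v2 free-space tree
    mount -o space_cache=v2,noatime /dev/mapper/array1 /mnt/array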

Thanks,
Qu

>
> Thanks,
> -Joshua
>
>>>
>>> Am I missing out on something by not attempting to clear the previous
>>> version space_cache before using the new v2 version?
>>
>> Except some wasted space, you're completely fine to skip the slow deletion.
>>
>> This also means, I should enhance the deletion process to avoid too many
>> transactions...
>>
>> Thanks,
>> Qu
>>
>>>
>>>
>>>> On 16/7/21 3:51 am, Joshua wrote:
>>>> Just as a point of data, I have a 96 TB array with RAID1 data, and
>>>> RAID1C3 metadata.
>>>>
>>>> I made the switch to space_cache=v2 some time ago, and I remember it
>>>> made a huge difference when I did so!
>>>> (It was RAID1 metadata at the time, as RAID1C3 was not available at
>>>> the time.)
>>>>
>>>>
>>>> However, I also tried a check with '--clear-space-cache v1' at the
>>>> time, and after waiting a literal whole day without it completing, I
>>>> gave up, canceled it, and put it back into production.  Is a
>>>> --clear-space-cache v1 operation expected to take so long on such a
>>>> large file system?
>>>>
>>>> Thanks!
>>>> --Joshua Villwock
>>>>
>>>>
>>>>
>>>> July 15, 2021 9:40 AM, "DanglingPointer"
>>>> <danglingpointerexception@gmail.com> wrote:
>>>>
>>>>> Hi Qu,
>>>>>
>>>>> Just updating here that setting the mount option "space_cache=v2" and
>>>>> "noatime" completely SOLVED
>>>>> the performance problem!
>>>>> Basically like night and day!
>>>>>
>>>>> These are my full fstab mount options...
>>>>>
>>>>> btrfs defaults,autodefrag,space_cache=v2,noatime 0 2
>>>>>
>>>>> Perhaps defaulting the space_cache=v2 should be considered?  Why
>>>>> default to v1, what's the value of
>>>>> v1?
>>>>>
>>>>> So for conclusion, for large multi-terrabyte arrays (in my case
>>>>> RAID5s), setting space_cache=v2 and
>>>>> noatime massively increases performance and eliminates the large long
>>>>> pauses in frequent intervals
>>>>> by "btrfs-transacti" blocking all IO.
>>>>>
>>>>> Thanks Qu for your help!
>>>>>
>>>>> On 14/7/21 5:45 pm, Qu Wenruo wrote:
>>>>>
>>>>>> On 2021/7/14 下午3:18, DanglingPointer wrote:
>>>>>>> a) "echo l > /proc/sysrq-trigger"
>>>>>>>
>>>>>>> The backup finished today already unfortunately and we are unlikely to
>>>>>>> run it again until we get an outage to remount the array with the
>>>>>>> space_cache=v2 and noatime mount options.
>>>>>>> Thanks for the command, we'll definitely use it if/when it happens
>>>>>>> again
>>>>>>> on the next large migration of data.
>>>>>> Just to avoid confusion, after that command, "dmesg" output is still
>>>>>> needed, as that's where sysrq put its output.
>>>>>>> b) "sudo btrfs qgroup show -prce" ........
>>>>>>>
>>>>>>> $ ERROR: can't list qgroups: quotas not enabled
>>>>>>>
>>>>>>> So looks like it isn't enabled.
>>>>>> One less thing to bother.
>>>>>>> File sizes are between: 1,048,576 bytes and 16,777,216 bytes
>>>>>>> (Duplicacy
>>>>>>> backup defaults)
>>>>>> Between 1~16MiB, thus tons of small files.
>>>>>>
>>>>>> Btrfs is not really good at handling tons of small files, as they
>>>>>> generate a lot of metadata.
>>>>>>
>>>>>> That may contribute to the hang.
>>>>>>
>>>>>>> What classifies as a transaction?
>>>>>> It's a little complex.
>>>>>>
>>>>>> Technically it's a check point where before the checkpoint, all you see
>>>>>> is old data, after the checkpoint, all you see is new data.
>>>>>>
>>>>>> To end users, any data and metadata write will be included into one
>>>>>> transaction (with proper dependency handled).
>>>>>>
>>>>>> One way to finish (or commit) current transaction is to sync the fs,
>>>>>> using "sync" command (sync all filesystems).
>>>>>>
>>>>>>> Any/All writes done in a 30sec
>>>>>>> interval?
>>>>>> This the default commit interval. Almost all fses will try to commit
>>>>>> its
>>>>>> data/metadata to disk after a configurable interval.
>>>>>>
>>>>>> The default one is 30s. That's also one way to commit current >
>>>>>> transaction.
>>>>>>
>>>>>>> If 100 unique files were written in 30secs, is that 1
>>>>>>> transaction or 100 transactions?
>>>>>> It depends. As things like syncfs() and subvolume/snapshot creation may
>>>>>> try to commit transaction.
>>>>>>
>>>>>> But without those special operations, just writing 100 unique files
>>>>>> using buffered write, it would only start one transaction, and when the
>>>>>> 30s interval get hit, the transaction will be committed to disk.
>>>>>>
>>>>>>> Millions of files of the size range
>>>>>>> above were backed up.
>>>>>> The amount of files may not force a transaction commit, if it doesn't
>>>>>> trigger enough memory pressure, or free space pressure.
>>>>>>
>>>>>> Anyway, the "echo l" sysrq would help us to locate what's taking so
>>>>>> long
>>>>>> time.
>>>>>>
>>>>>>> c) "Just mount with "space_cache=v2""
>>>>>>>
>>>>>>> Ok so no need to "clear_cache" the v1 cache, right?
>>>>>> Yes, and "clear_cache" won't really remove all the v1 cache anyway.
>>>>>>
>>>>>> Thus it doesn't help much.
>>>>>>
>>>>>> The only way to fully clear v1 cache is by using "btrfs check
>>>>>> --clear-space-cache v1" on a *unmounted* btrfs.
>>>>>>
>>>>>>> I wrote this in the fstab but hadn't remounted yet until I can get an
>>>>>>> outage....
>>>>>> IMHO if you really want to test if v2 would help, you can just remount,
>>>>>> no need to wait for a break.
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>> ..."btrfs defaults,autodefrag,clear_cache,space_cache=v2,noatime
>>>>>>> 0  2 >
>>>>>>> Thanks again for your help Qu!
>>>>>>>
>>>>>>> On 14/7/21 2:59 pm, Qu Wenruo wrote:
>>>>>> On 2021/7/13 下午11:38, DanglingPointer wrote:
>>>>>> We're currently considering switching to "space_cache=v2" with noatime
>>>>>> mount options for my lab server-workstations running RAID5.
>>>>>>
>>>>>> Btrfs RAID5 is unsafe due to its write-hole problem.
>>>>>>
>>>>>> * One has 13TB of data/metadata in a bunch of 6TB and 2TB disks
>>>>>> totalling 26TB.
>>>>>> * Another has about 12TB data/metadata in uniformly sized 6TB disks
>>>>>> totalling 24TB.
>>>>>> * Both of the arrays are on individually luks encrypted disks with
>>>>>> btrfs on top of the luks.
>>>>>> * Both have "defaults,autodefrag" turned on in fstab.
>>>>>>
>>>>>> We're starting to see large pauses during constant backups of millions
>>>>>> of chunk files (using duplicacy backup) in the 24TB array.
>>>>>>
>>>>>> Pauses sometimes take up to 20+ seconds in frequencies after every
>>>>>> ~30secs of the end of the last pause.  "btrfs-transacti" process
>>>>>> consistently shows up as the blocking process/thread locking up
>>>>>> filesystem IO.  IO gets into the RAID5 array via nfsd. There are no
>>>>>>>>>> disk
>>>>>> or btrfs errors recorded.  scrub last finished yesterday successfully.
>>>>>>
>>>>>> Please provide the "echo l > /proc/sysrq-trigger" output when such
>>>>>>>>> pause
>>>>>> happens.
>>>>>>
>>>>>> If you're using qgroup (may be enabled by things like snapper), it may
>>>>>> be the cause, as qgroup does its accounting when committing >>>
>>>>>> transaction.
>>>>>>
>>>>>> If one transaction is super large, it can cause such problem.
>>>>>>
>>>>>> You can test if qgroup is enabled by:
>>>>>>
>>>>>> # btrfs qgroup show -prce <mnt>
>>>>>>
>>>>>> After doing some research around the internet, we've come to the
>>>>>> consideration above as described.  Unfortunately the official
>>>>>> documentation isn't clear on the following.
>>>>>>
>>>>>> Official documentation URL -
>>>>>> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)
>>>>>>
>>>>>> 1. How to migrate from default space_cache=v1 to space_cache=v2? It
>>>>>> talks about the reverse, from v2 to v1!
>>>>>>
>>>>>> Just mount with "space_cache=v2".
>>>>>>
>>>>>> 2. If we use space_cache=v2, is it indeed still the case that the
>>>>>> "btrfs" command will NOT work with the filesystem?
>>>>>>
>>>>>> Why would you think "btrfs" won't work on a btrfs?
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>> So will our
>>>>>> "btrfs scrub start /mount/point/..." cron jobs FAIL? I'm guessing
>>>>>> the btrfs command comes from btrfs-progs which is currently >>>>
>>>>>> v5.4.1-2
>>>>>> amd64, is that correct?
>>>>>> 3. Any other ideas on how we can get rid of those annoying pauses with
>>>>>> large backups into the array?
>>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> DP

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-07-16 23:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-13 15:38 migrating to space_cache=2 and btrfs userspace commands DanglingPointer
2021-07-14  4:59 ` Qu Wenruo
2021-07-14  5:44   ` Chris Murphy
2021-07-14  6:05     ` Qu Wenruo
2021-07-14  6:54       ` DanglingPointer
2021-07-14  7:07         ` Qu Wenruo
2021-07-14  7:18   ` DanglingPointer
2021-07-14  7:45     ` Qu Wenruo
2021-07-15 16:40       ` DanglingPointer
2021-07-15 22:13         ` Qu Wenruo
2021-07-15 17:51       ` Joshua
2021-07-16 12:42         ` DanglingPointer
2021-07-16 12:59           ` Qu Wenruo
2021-07-16 13:23             ` DanglingPointer
2021-07-16 20:33             ` Joshua Villwock
2021-07-16 23:00               ` Qu Wenruo
