* btrfs-qgroup-rescan using 100% CPU
@ 2018-10-27 22:58 Dave
  2018-10-27 23:59 ` Qu Wenruo
  0 siblings, 1 reply; 2+ messages in thread
From: Dave @ 2018-10-27 22:58 UTC (permalink / raw)
  To: Linux fs Btrfs

I'm using btrfs and snapper on a system with an SSD. On this system
when I run `snapper -c root ls` (where `root` is the snapper config
for /), the process takes a very long time and top shows the following
process using 100% of the CPU:

    kworker/u8:6+btrfs-qgroup-rescan

I have multiple computers (also with SSDs) set up the same way with
snapper and btrfs. On the other computers, `snapper -c root ls`
completes almost instantly, even on systems with many more snapshots.
This system has 20 total snapshots on `/`.

System info:

    4.18.16-arch1-1-ARCH (Arch Linux)
    btrfs-progs v4.17.1
    scrub started at Sat Oct 27 18:37:21 2018 and finished after 00:04:02
    total bytes scrubbed: 75.97GiB with 0 errors

    Filesystem           Size  Used Avail Use% Mounted on
    /dev/mapper/cryptdv  116G   77G   38G  67% /

    Data, single: total=72.01GiB, used=71.38GiB
    System, DUP: total=32.00MiB, used=16.00KiB
    Metadata, DUP: total=3.50GiB, used=2.22GiB

What other info would be helpful? What troubleshooting steps should I try?


* Re: btrfs-qgroup-rescan using 100% CPU
  2018-10-27 22:58 btrfs-qgroup-rescan using 100% CPU Dave
@ 2018-10-27 23:59 ` Qu Wenruo
  0 siblings, 0 replies; 2+ messages in thread
From: Qu Wenruo @ 2018-10-27 23:59 UTC (permalink / raw)
  To: Dave, Linux fs Btrfs


On 2018/10/28 6:58 AM, Dave wrote:
> I'm using btrfs and snapper on a system with an SSD. On this system
> when I run `snapper -c root ls` (where `root` is the snapper config
> for /), the process takes a very long time and top shows the following
> process using 100% of the CPU:
> 
>     kworker/u8:6+btrfs-qgroup-rescan

I'm not sure exactly what snapper is doing, but it looks like snapper
needs btrfs qgroups.

Enabling btrfs qgroups triggers an initial qgroup rescan.

If you have a lot of snapshots or a lot of files, the initial rescan
will take a long time.

That's the designed behavior.

> 
> I have multiple computers (also with SSDs) set up the same way with
> snapper and btrfs. On the other computers, `snapper -c root ls`
> completes almost instantly, even on systems with many more snapshots.

The size of each subvolume also counts.

The time consumed by a qgroup rescan depends on the number of references.
Snapshots and reflinks both contribute to that number.

Also, large files contribute less than small files: one large file
(128M) may contain only one reference, while 128 small files (1M each)
contain 128 references.

Snapshots are one of the heaviest workloads in this case.
If one extent is shared 4 times between 4 snapshots, then the rescan
costs 4 times the CPU resources for that extent.
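
The scaling described above can be put into numbers with a toy model.
This is only a rough sketch: it assumes rescan work is simply
proportional to extents multiplied by the number of snapshots sharing
them, and `estimate_refs` is a hypothetical helper, not a btrfs tool.

```shell
# Toy model only: assume rescan cost ~ number of extent references,
# i.e. (number of extents) x (number of snapshots sharing each extent).
estimate_refs() {
    extents=$1    # number of extents (illustrative input)
    sharers=$2    # snapshots/subvolumes sharing each extent
    echo $(( extents * sharers ))
}

estimate_refs 1 1      # one 128M file stored as a single extent
estimate_refs 128 1    # 128 files of 1M each
estimate_refs 128 4    # the same 128 files shared by 4 snapshots
```

Under this model, 20 snapshots of the same data would multiply the
reference count, and so the rescan work, by roughly 20.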

> This system has 20 total snapshots on `/`.

That already sounds like a lot, especially considering the size of the fs.

> 
> System info:
> 
>     4.18.16-arch1-1-ARCH (Arch Linux)
>     btrfs-progs v4.17.1
>     scrub started at Sat Oct 27 18:37:21 2018 and finished after 00:04:02
>     total bytes scrubbed: 75.97GiB with 0 errors
> 
>     Filesystem           Size  Used Avail Use% Mounted on
>     /dev/mapper/cryptdv  116G   77G   38G  67% /
> 
>     Data, single: total=72.01GiB, used=71.38GiB
>     System, DUP: total=32.00MiB, used=16.00KiB
>     Metadata, DUP: total=3.50GiB, used=2.22GiB

From a quick glance, it indeed needs some time to do the rescan.


Also, to make sure it's not a deadlock, you could check the rescan
progress with the following command:

    # btrfs inspect-internal dump-tree -t quota /dev/mapper/cryptdv

And look for the following item:

	item 0 key (0 QGROUP_STATUS 0) itemoff 16251 itemsize 32
		version 1 generation 7 flags ON|RESCAN scan 1024

That scan number should change as the rescan progresses.
(The number is only updated after each transaction, so you need to wait
some time to see it change.)
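
A minimal sketch of how this check could be scripted, assuming the
dump-tree output keeps the layout shown above. `qgroup_scan_value` is a
hypothetical helper name, not part of btrfs-progs; it only parses text
from stdin.

```shell
# Print the number following "scan" on the line after the
# QGROUP_STATUS item header, reading dump-tree output from stdin.
qgroup_scan_value() {
    awk '
        /QGROUP_STATUS/ { in_status = 1; next }
        in_status {
            for (i = 1; i < NF; i++)
                if ($i == "scan") { print $(i + 1); exit }
            in_status = 0
        }
    '
}

# Usage (as root; the device path is the one from this thread):
#   btrfs inspect-internal dump-tree -t quota /dev/mapper/cryptdv | qgroup_scan_value
# Run it twice, some minutes apart. A changing value suggests the rescan
# is progressing rather than stuck.
```

`btrfs quota rescan -s /` also reports whether a rescan is currently
running on the filesystem mounted at `/`.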

Thanks,
Qu

> 
> What other info would be helpful? What troubleshooting steps should I try?
> 

