* Re: btrfs-qgroup-rescan using 100% CPU
2018-10-27 22:58 btrfs-qgroup-rescan using 100% CPU Dave
@ 2018-10-27 23:59 ` Qu Wenruo
From: Qu Wenruo @ 2018-10-27 23:59 UTC (permalink / raw)
To: Dave, Linux fs Btrfs
On 2018/10/28 6:58 AM, Dave wrote:
> I'm using btrfs and snapper on a system with an SSD. On this system
> when I run `snapper -c root ls` (where `root` is the snapper config
> for /), the process takes a very long time and top shows the following
> process using 100% of the CPU:
>
> kworker/u8:6+btrfs-qgroup-rescan
I'm not sure what snapper is doing, but it looks like snapper needs to
use btrfs qgroups.
Enabling btrfs qgroups triggers an initial qgroup rescan.
If you have a lot of snapshots or a lot of files, the initial rescan
will take a long time.
That's the designed behavior.
>
> I have multiple computers (also with SSD's) set up the same way with
> snapper and btrfs. On the other computers, `snapper -c root ls`
> completes almost instantly, even on systems with many more snapshots.
The size of each subvolume also counts.
The time consumed by a qgroup rescan depends on the number of references.
Snapshots and reflinks all contribute to that number.
Also, large files contribute less than small files, as one large file
(128M) may contain only one reference, while 128 small files (1M each)
contain 128 references.
Snapshots are one of the heaviest workloads in this case.
If one extent is shared 4 times between 4 snapshots, then it will cost 4
times the CPU resources to do the rescan.
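
To picture the arithmetic above, here is a rough toy model in Python. It
is only a sketch: the function name, the 128 MiB per-extent cap, and the
"one reference per extent per sharing snapshot" rule are simplifying
assumptions for illustration, not how btrfs actually accounts qgroups.

```python
import math

# Assumption for this sketch: a single data extent covers at most 128 MiB.
MAX_EXTENT_MIB = 128


def estimated_references(file_sizes_mib, sharing_subvolumes=1):
    """Toy estimate of the references a rescan has to walk.

    Each file is assumed to occupy the minimum number of extents, and
    each extent is counted once per subvolume/snapshot sharing it.
    """
    extents = sum(
        max(1, math.ceil(size / MAX_EXTENT_MIB)) for size in file_sizes_mib
    )
    return extents * sharing_subvolumes


one_large_file = estimated_references([128])          # one 128M file
many_small_files = estimated_references([1] * 128)    # 128 files of 1M
shared_by_four = estimated_references([1] * 128, sharing_subvolumes=4)
print(one_large_file, many_small_files, shared_by_four)
```

In this model the same 128M of data costs 1 unit as one large file, 128
units as small files, and 4 times that again once 4 snapshots share the
extents. Real filesystems are fragmented, so treat the numbers only as a
direction, not a prediction.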
> This system has 20 total snapshots on `/`.
That already sounds like a lot, especially considering the size of the fs.
>
> System info:
>
> 4.18.16-arch1-1-ARCH (Arch Linux)
> btrfs-progs v4.17.1
> scrub started at Sat Oct 27 18:37:21 2018 and finished after 00:04:02
> total bytes scrubbed: 75.97GiB with 0 errors
>
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/cryptdv 116G 77G 38G 67% /
>
> Data, single: total=72.01GiB, used=71.38GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=3.50GiB, used=2.22GiB
From a quick glance, it does indeed need some time to do the rescan.
Also, to make sure it's not a deadlock, you can check the rescan
progress with the following command:
# btrfs ins dump-tree -t quota /dev/mapper/cryptdv
And look for the following item:
item 0 key (0 QGROUP_STATUS 0) itemoff 16251 itemsize 32
version 1 generation 7 flags ON|RESCAN scan 1024
That scan number should change as the rescan progresses.
(That number is only updated after each transaction, so you need to wait
some time to see it change.)
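
If you would rather not compare the output by eye, something like the
following Python sketch could pull the flags and the scan value out of
the dump-tree output so two samples can be compared. The helper names
are made up for this mail, and the regular expression assumes the output
looks like the QGROUP_STATUS sample above; other btrfs-progs versions
may print it differently.

```python
import re
import subprocess

# Assumption: the status line looks like
#   "version 1 generation 7 flags ON|RESCAN scan 1024"
STATUS_RE = re.compile(r"flags\s+(\S+)\s+scan\s+(\d+)")


def parse_qgroup_status(dump_text):
    """Return (flags, scan) from dump-tree output, or None if not found."""
    match = STATUS_RE.search(dump_text)
    if match is None:
        return None
    flags = match.group(1).split("|")
    return flags, int(match.group(2))


def read_qgroup_status(device):
    """Run the command from this mail (needs root) and parse its output."""
    result = subprocess.run(
        ["btrfs", "ins", "dump-tree", "-t", "quota", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_qgroup_status(result.stdout)


sample = (
    "item 0 key (0 QGROUP_STATUS 0) itemoff 16251 itemsize 32\n"
    "\t\tversion 1 generation 7 flags ON|RESCAN scan 1024\n"
)
print(parse_qgroup_status(sample))
```

Calling read_qgroup_status("/dev/mapper/cryptdv") twice, a minute or so
apart, and seeing a different scan value while RESCAN is still in the
flags would suggest the rescan is making progress rather than being
stuck.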
Thanks,
Qu
>
> What other info would be helpful? What troubleshooting steps should I try?
>