linux-btrfs.vger.kernel.org archive mirror
* btrfs-transacti -- change from be/4 to idle (?)
@ 2020-08-26  8:58 Leszek Dubiel
  2020-08-26  9:39 ` Qu Wenruo
  2020-08-26  9:47 ` Hans van Kranenburg
  0 siblings, 2 replies; 4+ messages in thread
From: Leszek Dubiel @ 2020-08-26  8:58 UTC
  To: linux-btrfs



Hello!

The btrfs-transacti process takes 100% CPU time and the server gets very slow.

It runs with priority "best effort be/4".

Is it a good idea to change priority to "idle"?
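
For context on what "be/4" and "idle" refer to: Linux stores an I/O priority as a scheduling class plus a level, packed into one integer with the class in the bits above bit 13. The sketch below illustrates that encoding only. The helper names are made up for this note, and the constants follow the kernel's ioprio interface as I understand it. Whether changing the class has any effect depends on the I/O scheduler in use, which the thread does not state.

```python
# Illustrative helpers (hypothetical names) for the Linux ioprio encoding:
#   value = (class << 13) | level
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASSES = {0: "none", 1: "rt", 2: "be", 3: "idle"}


def encode_ioprio(io_class: int, level: int = 0) -> int:
    """Pack a scheduling class and a level (0-7) into one ioprio value."""
    if io_class not in IOPRIO_CLASSES:
        raise ValueError(f"unknown I/O class: {io_class}")
    if not 0 <= level <= 7:
        raise ValueError(f"level out of range: {level}")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def describe_ioprio(value: int) -> str:
    """Render an ioprio value in the 'be/4' style used in the question."""
    io_class = value >> IOPRIO_CLASS_SHIFT
    level = value & ((1 << IOPRIO_CLASS_SHIFT) - 1)
    name = IOPRIO_CLASSES.get(io_class, "?")
    # Only the realtime and best-effort classes carry a meaningful level.
    return f"{name}/{level}" if name in ("rt", "be") else name


print(describe_ioprio(encode_ioprio(2, 4)))  # best effort, level 4
print(describe_ioprio(encode_ioprio(3)))     # the idle class
```
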





root@wawel:/var/log# df -h  /

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        20T   11T  7.7T  58% /



root@wawel:/var/log# btrfs sub list / | wc -l

367



root@wawel:/var/log# btrfs dev usag /

/dev/sda2, ID: 2
    Device size:             5.45TiB
    Device slack:              0.00B
    Data,RAID1:              3.97TiB
    Metadata,RAID1:         79.00GiB
    Unallocated:             1.41TiB

/dev/sdb3, ID: 5
    Device size:             9.06TiB
    Device slack:            3.50KiB
    Data,RAID1:              2.26TiB
    Metadata,RAID1:         18.00GiB
    System,RAID1:           32.00MiB
    Unallocated:             6.79TiB

/dev/sdc2, ID: 3
    Device size:             5.45TiB
    Device slack:              0.00B
    Data,RAID1:              4.00TiB
    Metadata,RAID1:         77.00GiB
    Unallocated:             1.38TiB

/dev/sdd3, ID: 6
    Device size:             5.43TiB
    Device slack:            3.50KiB
    Data,RAID1:              2.03TiB
    Metadata,RAID1:         18.00GiB
    System,RAID1:           32.00MiB
    Unallocated:             3.38TiB

/dev/sde3, ID: 4
    Device size:            10.90TiB
    Device slack:            3.50KiB
    Data,RAID1:              7.96TiB
    Metadata,RAID1:        146.00GiB
    Unallocated:             2.79TiB

/dev/sdf3, ID: 7
    Device size:             3.61TiB
    Device slack:            3.50KiB
    Data,RAID1:            235.00GiB
    Unallocated:             3.38TiB
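
As a sanity check on the numbers above, the device sizes can be summed and compared with the 20T that df reports. This is a small worked example using only the figures from the listing. Halving the raw total is an assumption that holds for RAID1 when no single device is larger than all the others combined, which is the case here.

```python
# Device sizes in TiB, copied from the "btrfs dev usage" listing.
device_size_tib = {
    "/dev/sda2": 5.45,
    "/dev/sdb3": 9.06,
    "/dev/sdc2": 5.45,
    "/dev/sdd3": 5.43,
    "/dev/sde3": 10.90,
    "/dev/sdf3": 3.61,
}

raw_tib = sum(device_size_tib.values())

# RAID1 keeps two copies of every chunk, so at most half of the raw space
# is usable (very uneven device sizes could lower this further).
usable_tib = raw_tib / 2

print(f"raw: {raw_tib:.2f} TiB, usable at most: {usable_tib:.2f} TiB")
```

The result is close to the 20T size shown by df for the mounted filesystem.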


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: btrfs-transacti -- change from be/4 to idle (?)
  2020-08-26  8:58 btrfs-transacti -- change from be/4 to idle (?) Leszek Dubiel
@ 2020-08-26  9:39 ` Qu Wenruo
  2020-08-26 12:49   ` Leszek Dubiel
  2020-08-26  9:47 ` Hans van Kranenburg
  1 sibling, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2020-08-26  9:39 UTC
  To: Leszek Dubiel, linux-btrfs





On 2020/8/26 4:58 PM, Leszek Dubiel wrote:
> 
> 
> Hello!
> 
> The btrfs-transacti process takes 100% CPU time and the server gets very slow.

What's the workload and kernel version?

The workload can tell us whether it's really a bug, and different kernels
have quite different performance characteristics, especially if you're
using qgroups.
(If you're using qgroups, recent v5.x kernels should already have that fixed.)

Thanks,
Qu
> 
> It runs with priority "best effort be/4".
> 
> Is it a good idea to change priority to "idle"?
> 
> [...]




* Re: btrfs-transacti -- change from be/4 to idle (?)
  2020-08-26  8:58 btrfs-transacti -- change from be/4 to idle (?) Leszek Dubiel
  2020-08-26  9:39 ` Qu Wenruo
@ 2020-08-26  9:47 ` Hans van Kranenburg
  1 sibling, 0 replies; 4+ messages in thread
From: Hans van Kranenburg @ 2020-08-26  9:47 UTC
  To: Leszek Dubiel, linux-btrfs

Hi!

On 8/26/20 10:58 AM, Leszek Dubiel wrote:
> 
> The btrfs-transacti process takes 100% CPU time and the server gets very slow.

Is it slow and not doing much, or is it busy doing things, taking more
time than you would want? Those two things are quite different.

> It runs with priority "best effort be/4".
> 
> Is it a good idea to change priority to "idle"?

> root@wawel:/var/log# df -h  /
> 
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda2        20T   11T  7.7T  58% /

Are you using space_cache=v2 already?

https://events.static.linuxfound.org/sites/events/files/slides/vault2016_0.pdf

> [...]

Hans


* Re: btrfs-transacti -- change from be/4 to idle (?)
  2020-08-26  9:39 ` Qu Wenruo
@ 2020-08-26 12:49   ` Leszek Dubiel
  0 siblings, 0 replies; 4+ messages in thread
From: Leszek Dubiel @ 2020-08-26 12:49 UTC
  To: linux-btrfs; +Cc: Qu Wenruo, hans




 >> Hello!
 >>
 >> The btrfs-transacti process takes 100% CPU time and the server gets very slow.
 > What's the workload and kernel version?


root@wawel:/var/log/ipfm# uname -a
Linux wawel 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux


Actually, the limit is not CPU but hard disk bandwidth. Sorry for that mistake.
In "iotop" I often see 100% usage for "btrfs-transacti".


Workload when I see 100% usage for btrfs-transacti:

For CPU:

root@wawel:~# sar 2 4
Linux 4.19.0-9-amd64 (wawel)     26.08.2020     _x86_64_    (4 CPU)

14:18:35        CPU     %user     %nice   %system   %iowait    %steal     %idle
14:18:37        all      0,50      0,00      0,63     16,02      0,00     82,85
14:18:39        all      0,25      0,00      0,50     15,72      0,00     83,52
14:18:41        all      0,50      0,00      1,25     17,02      0,00     81,23
14:18:43        all      0,50      0,00     16,81     17,69      0,00     64,99
Average:        all      0,44      0,00      4,80     16,61      0,00     78,15



For DISKS:

root@wawel:/var/log# sar -p -d 5 2
Linux 4.19.0-9-amd64 (wawel)     08/26/20     _x86_64_    (4 CPU)

14:45:49          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util
14:45:54          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
14:45:54          sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
14:45:54          sdd    369.60      0.00   6716.80     18.17      6.21     16.95      1.15     42.48
14:45:54          sde      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
14:45:54          sdb    369.00      0.00   6707.20     18.18     18.73     51.30      2.66     98.08
14:45:54          sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

14:45:54          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util
14:45:59          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
14:45:59          sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
14:45:59          sdd    341.20      0.00   6716.80     19.69      7.25     21.03      1.46     49.92
14:45:59          sde      1.00     15.20      0.00     15.20      0.01     10.40      5.60      0.56
14:45:59          sdb    345.40      0.00   6780.80     19.63     18.55     54.16      2.75     95.04
14:45:59          sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util
Average:          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdd    355.40      0.00   6716.80     18.90      6.73     18.91      1.30     46.20
Average:          sde      0.50      7.60      0.00     15.20      0.01     10.40      5.60      0.28
Average:          sdb    357.20      0.00   6744.00     18.88     18.64     52.68      2.70     96.56
Average:          sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
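
One way to read the disk figures: sdb sits near 100% utilisation while writing only about 6.7 MB/s, in requests of roughly 19 kB each. That pattern would be consistent with many small scattered writes rather than a bandwidth limit, though the sample alone does not prove it. The sketch below redoes that arithmetic from the Average rows. The 90% threshold and the function names are assumptions made for illustration.

```python
# Average rows from the "sar -p -d 5 2" output:
# device -> (tps, wkB/s, %util); idle devices are omitted.
averages = {
    "sdd": (355.40, 6716.80, 46.20),
    "sde": (0.50, 0.00, 0.28),
    "sdb": (357.20, 6744.00, 96.56),
}

UTIL_THRESHOLD = 90.0  # assumed cut-off for "saturated", not a sar constant


def saturated(rows, threshold=UTIL_THRESHOLD):
    """Return the devices whose %util exceeds the threshold."""
    return [dev for dev, (_tps, _wkb, util) in rows.items() if util > threshold]


def avg_write_kb(rows, dev):
    """Average kB per request, valid when the device saw writes only."""
    tps, wkb, _util = rows[dev]
    return wkb / tps


print(saturated(averages))
print(f"{avg_write_kb(averages, 'sdb'):.2f} kB per request on sdb")
```

The per-request size computed this way agrees with the areq-sz column that sar reports for sdb.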






 > The workload can tell us whether it's really a bug, and different kernels
 > have quite different performance characteristics, especially if you're
 > using qgroups.
 > (If you're using qgroups, recent v5.x kernels should already have that fixed.)

I don't use qgroups.





On 26.08.2020 at 11:47, Hans van Kranenburg wrote:
 > Hi!
 >
 >> The btrfs-transacti process takes 100% CPU time and the server gets very slow.
 >
 > Is it slow and not doing much, or is it busy doing things, taking more
 > time than you would want? Those two things are quite different.

Actually, this was I/O traffic, not CPU time. Sorry for the mistake. Thanks
for the help.



 >> It runs with priority "best effort be/4".
 >>
 >> Is it a good idea to change priority to "idle"?
 >
 >> root@wawel:/var/log# df -h  /
 >>
 >> Filesystem      Size  Used Avail Use% Mounted on
 >> /dev/sda2        20T   11T  7.7T  58% /
 >
 > Are you using space_cache=v2 already?
 >
 > https://events.static.linuxfound.org/sites/events/files/slides/vault2016_0.pdf


I use defaults as mount options:

root@wawel:~# cat /etc/fstab  | egrep btrfs
UUID=44803366-3981-4ebb-853b-6c991380c8a6 /             btrfs defaults,subvol=/wawel              0       0
UUID=44803366-3981-4ebb-853b-6c991380c8a6 /mnt/root     btrfs defaults,subvol=/     0       0



And here is the superblock:

root@wawel:/var/log# btrfs inspect-internal dump-super /dev/sda2
superblock: bytenr=65536, device=/dev/sda2
---------------------------------------------------------
csum_type        0 (crc32c)
csum_size        4
csum            0xa4b9b4ab [match]
bytenr            65536
flags            0x1
             ( WRITTEN )
magic            _BHRfS_M [match]
fsid            44803366-3981-4ebb-853b-6c991380c8a6
metadata_uuid        44803366-3981-4ebb-853b-6c991380c8a6
label
generation        1168642
root            21221477498880
sys_array_size        129
chunk_root_generation    1168634
root_level        1
chunk_root        21753638895616
chunk_root_level    1
log_root        0
log_root_transid    0
log_root_level        0
total_bytes        43865979846656
bytes_used        11379069538304
sectorsize        4096
nodesize        16384
leafsize (deprecated)        16384
stripesize        4096
root_dir        6
num_devices        6
compat_flags        0x0
compat_ro_flags        0x0
incompat_flags        0x163
             ( MIXED_BACKREF |
               DEFAULT_SUBVOL |
               BIG_METADATA |
               EXTENDED_IREF |
               SKINNY_METADATA )
cache_generation    1168642
uuid_tree_generation    594
dev_item.uuid        485e6e62-6e43-46bf-ad0b-d9ed88d7f908
dev_item.fsid        44803366-3981-4ebb-853b-6c991380c8a6 [match]
dev_item.type        0
dev_item.total_bytes    5992192409600
dev_item.bytes_used    4445291151360
dev_item.io_align    4096
dev_item.io_width    4096
dev_item.sector_size    4096
dev_item.devid        2
dev_item.dev_group    0
dev_item.seek_speed    0
dev_item.bandwidth    0
dev_item.generation    0
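
The flag fields in this dump bear on the space_cache=v2 question. As far as I know, the free space tree used by space_cache=v2 is recorded as a FREE_SPACE_TREE bit in compat_ro_flags, and the dump shows 0x0 there, which would suggest the filesystem is not using it. The sketch below decodes both fields. The bit tables reflect my understanding of the kernel's btrfs headers and are illustrative rather than exhaustive; the function name is hypothetical.

```python
# Bit positions as I understand them from the kernel's btrfs headers;
# treat these tables as illustrative, not as a complete list.
INCOMPAT_FLAGS = {
    0: "MIXED_BACKREF",
    1: "DEFAULT_SUBVOL",
    2: "MIXED_GROUPS",
    3: "COMPRESS_LZO",
    4: "COMPRESS_ZSTD",
    5: "BIG_METADATA",
    6: "EXTENDED_IREF",
    7: "RAID56",
    8: "SKINNY_METADATA",
    9: "NO_HOLES",
}
COMPAT_RO_FLAGS = {
    0: "FREE_SPACE_TREE",
    1: "FREE_SPACE_TREE_VALID",
}


def decode_flags(value, table):
    """Return the names of all set bits; unknown bits are named by position."""
    names = []
    bit = 0
    while value >> bit:
        if value & (1 << bit):
            names.append(table.get(bit, f"unknown_bit_{bit}"))
        bit += 1
    return names


incompat = decode_flags(0x163, INCOMPAT_FLAGS)   # incompat_flags from the dump
compat_ro = decode_flags(0x0, COMPAT_RO_FLAGS)   # compat_ro_flags from the dump

print(incompat)
print("free space tree enabled:", "FREE_SPACE_TREE" in compat_ro)
```

Decoding 0x163 this way yields the same five names that dump-super prints next to incompat_flags.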







I have also found this report in kern.log:

Aug 16 03:36:01 wawel kernel: [2476445.318200] INFO: task btrfs:4311 blocked for more than 120 seconds.
Aug 16 03:36:01 wawel kernel: [2476445.318214]       Not tainted 4.19.0-9-amd64 #1 Debian 4.19.118-2+deb10u1
Aug 16 03:36:01 wawel kernel: [2476445.318219] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.



I wonder if the system is broken. The scrub finished in 19 hours:

root@wawel:/var/log# btrfs scrub stat /
scrub status for 44803366-3981-4ebb-853b-6c991380c8a6
     scrub started at Sat Aug 22 14:19:01 2020 and finished after 19:17:16
     total bytes scrubbed: 20.15TiB with 0 errors
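
To put the scrub figures in perspective, the average throughput can be worked out from the two numbers reported. This is plain arithmetic on the values shown and says nothing about how the load was spread over the six devices.

```python
# Figures from the scrub status output.
scrubbed_tib = 20.15
hours, minutes, seconds = 19, 17, 16

duration_s = hours * 3600 + minutes * 60 + seconds
scrubbed_mib = scrubbed_tib * 1024 * 1024  # TiB -> MiB

# Total across all devices, since "total bytes scrubbed" is a sum.
throughput = scrubbed_mib / duration_s
print(f"{throughput:.0f} MiB/s aggregate across all devices")
```

That works out to a little over 300 MiB/s in total, so the array read at a healthy sequential rate during the scrub.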














