linux-btrfs.vger.kernel.org archive mirror
* btrfs-transacti hangs system for several seconds every few minutes
@ 2020-03-28 18:26 Brad Templeton
  2020-03-28 21:20 ` Zygo Blaxell
  2020-03-29  0:58 ` Qu Wenruo
  0 siblings, 2 replies; 15+ messages in thread
From: Brad Templeton @ 2020-03-28 18:26 UTC (permalink / raw)
  To: Btrfs BTRFS

I have a decent-sized 3-disk RAID1 that I have had on btrfs for many
years. Over time, a serious problem has emerged: from time to
time all I/O will pause, freezing any programs attempting to use the
btrfs filesystem. Performance has degraded over the years as well, so
that just browsing around in directories with 300 or so files often
takes many seconds just to autocomplete a filename or do an ls.

But the big problem is that during periods of active but not heavy use,
every few minutes the I/O system will hang for periods of 1 to 10
seconds. During these hangs, btrfs-transacti is doing very heavy I/O.
Programs waiting on I/O block -- the most frustrating is typing in vi
and having the echo stop. It's getting close to unusable, and it may be
time to leave btrfs after many years for a different FS.

During these incidents iotop will look like this:

Total DISK READ :     499.57 K/s | Total DISK WRITE :    1639.00 K/s
Actual DISK READ:     492.73 K/s | Actual DISK WRITE:       0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
  882 be/4 root      499.57 K/s 1604.78 K/s  0.00 % 98.60 % [btrfs-transacti]
21829 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.23 % [kworker/u32:1-btrfs-endio-meta]
14662 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.17 % [kworker/u32:0-btrfs-endio-meta]
22184 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.11 % [kworker/u32:3-events_freezable_power_]
13063 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.06 % [kworker/u32:6-events_freezable_power_]
  486 be/3 root        0.00 B/s    6.84 K/s  0.00 %  0.00 % systemd-journald
22213 be/4 brad        0.00 B/s    6.84 K/s  0.00 %  0.00 % chrome --no-startup-window [ThreadPoolForeg]

I have found a way to reliably reproduce it: quickly skim through
my large video collection (looking for videos), hitting
"next" every second or so -- lots of reads, but very little writing.
After about 40 seconds of this, it is sure to hang.

I am running kernel 5.3.0 on Ubuntu 18.04.4, but have seen this problem
going back into much older kernels.

My array looks like this:

/dev/sda, ID: 2
   Device size:             3.64TiB
   Device slack:              0.00B
   Data,RAID1:              1.79TiB
   Metadata,RAID1:          8.00GiB
   Unallocated:             1.84TiB

/dev/sdg, ID: 1
   Device size:             9.10TiB
   Device slack:              0.00B
   Data,RAID1:              7.21TiB
   Metadata,RAID1:         14.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             1.87TiB

/dev/sdh, ID: 3
   Device size:             7.28TiB
   Device slack:          344.00KiB
   Data,RAID1:              5.43TiB
   Metadata,RAID1:          8.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             1.84TiB

/dev/sdg on /home type btrfs (rw,relatime,space_cache,subvolid=256,subvol=/home)

I have 16 GB of RAM with 16 GB of swap on a flash drive; the swap is in use:

KiB Mem : 16393944 total,   398800 free, 13538088 used,  2457056 buff/cache
KiB Swap: 16777212 total,  6804352 free,  9972860 used.  2045812 avail Mem


What other information would be useful in attempting to diagnose or fix
this? I like a number of things about btrfs. One of them that I don't
want to give up is the ability to do RAID with different-sized disks,
which seems like the only way it should work. Switching to ZFS or mdadm
again would involve disk upgrades and a very large amount of time
copying this much data, but I'll have to do it if I can't diagnose this.
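
For readers wondering how much room a mixed-size RAID1 like this one offers, the arithmetic can be sketched as below. This is a rough estimate only, assuming the usual two-copy rule (every chunk is stored on two different devices) and ignoring metadata overhead and chunk granularity. The device sizes and per-device data figures are the ones listed in the array layout above; the function name raid1_usable is made up for this illustration.

```python
def raid1_usable(sizes_tib):
    """Approximate usable capacity (TiB) of a two-copy RAID1 on unequal devices.

    Assumption: every chunk needs one copy on each of two different devices,
    so capacity is limited either by half the raw total or by how much the
    smaller devices combined can mirror the largest one.
    """
    total = sum(sizes_tib)
    largest = max(sizes_tib)
    return min(total / 2, total - largest)


# Device sizes in TiB, copied from the array listing in this message.
sizes = {"sda": 3.64, "sdg": 9.10, "sdh": 7.28}
usable = raid1_usable(list(sizes.values()))

# Raw "Data,RAID1" allocation per device; each logical byte is counted twice.
data_raw = 1.79 + 7.21 + 5.43
data_logical = data_raw / 2

print(f"approx. usable capacity: {usable:.2f} TiB")
print(f"data chunks allocated:   {data_logical:.2f} TiB (logical)")
```

With these numbers the limit is half the raw total, because the two smaller disks together are larger than the biggest one, so no space on the largest device is stranded.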



* Re: btrfs-transacti hangs system for several seconds every few minutes
@ 2020-03-29  4:03 Brad Templeton
  2020-03-29 13:14 ` Qu Wenruo
  0 siblings, 1 reply; 15+ messages in thread
From: Brad Templeton @ 2020-03-29  4:03 UTC (permalink / raw)
  To: Btrfs BTRFS

Not using qgroups. Not doing snapshots. Did a reboot with the
options to upgrade to v2. The first boot failed, in that the disk check
took more than 6 minutes, but the upgrade itself worked: the second time
I was able to boot, and -- knock on wood -- so far it has not hung.

I wonder why they put 5.3.0 as the standard advanced kernel in Ubuntu
LTS if it has a data corruption bug. I don't know if I've seen any
release of 5.4.14 in a PPA yet -- manual kernel install has been such a
pain the few times I have done it. I could revert, but the reason I
switched to 5.3, not long ago, was another problem with sound drivers.

BTW, even though it now works, it still takes 90 seconds every boot
doing a disk check, even after what I think is a clean shutdown. I
presume that is not normal. Any clues on what may cause that?

* Re: btrfs-transacti hangs system for several seconds every few minutes
@ 2020-03-30  2:29 Tomasz Chmielewski
  2020-03-30  5:56 ` Andrei Borzenkov
  0 siblings, 1 reply; 15+ messages in thread
From: Tomasz Chmielewski @ 2020-03-30  2:29 UTC (permalink / raw)
  To: Btrfs BTRFS; +Cc: 4brad

> I wonder why they put 5.3.0 as the standard advanced Kernel in Ubuntu
> LTS if it has a data corruption bug.   I don't know if I've seen any
> release of 5.4.14 in a PPA yet -- manual kernel install is such a pain
> the few times I have done it.

You have all kernels compiled as packages here (for Ubuntu):

https://kernel.ubuntu.com/~kernel-ppa/mainline/

So just download two deb packages, dpkg -i, and done.

btrfs can still be less stable than one would wish, but the
following works well for me on quite a few servers:

- use a recent kernel - late 5.5.x, now perhaps 5.6 - will typically 
work better for btrfs than a default distribution kernel

- use "noatime" mount option

- use "space_cache=v2" mount option

- absolutely do not use qgroups (make sure this command returns an error 
saying that quotas are not enabled): btrfs qgroup show /mount/point

- if using RAID-5, make sure to use RAID-1 for metadata (and raid1c3 
metadata for RAID-6 data)

- if you use any software automation, make sure that it doesn't
accidentally re-enable quotas (in btrfs, there is no mount flag for
quotas, unlike in other filesystems, so it's not obvious whether
quotas are enabled or not)
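
The mount-option items in the list above can be checked mechanically. The following is a minimal sketch, not a tool from this thread: it compares a comma-separated option string (as printed by mount or found in /proc/self/mounts) against the two recommended options. The helper names missing_options and btrfs_mounts are hypothetical, and the sample string is the /home mount line quoted earlier in the thread. It does not check qgroups, which need the btrfs qgroup show command mentioned above.

```python
RECOMMENDED = ("noatime", "space_cache=v2")


def missing_options(mount_opts, wanted=RECOMMENDED):
    """Return the recommended options absent from a comma-separated option string."""
    present = set(mount_opts.split(","))
    return [opt for opt in wanted if opt not in present]


def btrfs_mounts(path="/proc/self/mounts"):
    """Yield (mount_point, options) for each btrfs entry in a Linux mounts table.

    Assumes the usual whitespace-separated layout:
    device, mount point, filesystem type, options, dump, pass.
    """
    with open(path) as table:
        for line in table:
            fields = line.split()
            if len(fields) >= 4 and fields[2] == "btrfs":
                yield fields[1], fields[3]


# Option string from the /home mount shown in the first message of the thread.
sample = "rw,relatime,space_cache,subvolid=256,subvol=/home"
print(missing_options(sample))
```

On a live Linux system, looping over btrfs_mounts() and calling missing_options() on each option string would report every btrfs mount that lacks one of the two options.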


Tomasz Chmielewski
https://lxadm.com


Thread overview: 15+ messages
2020-03-28 18:26 btrfs-transacti hangs system for several seconds every few minutes Brad Templeton
2020-03-28 21:20 ` Zygo Blaxell
     [not found]   ` <7778ece0-67d4-8d1c-b773-35f07d81dcbe@templetons.com>
2020-03-29  6:42     ` Zygo Blaxell
2020-03-30 22:14       ` Chris Murphy
2020-03-31  4:04         ` Zygo Blaxell
2020-03-29  0:58 ` Qu Wenruo
2020-03-29  4:03 Brad Templeton
2020-03-29 13:14 ` Qu Wenruo
2020-03-29 17:58   ` Brad Templeton
2020-03-29 18:09     ` Zygo Blaxell
2020-03-30  2:29 Tomasz Chmielewski
2020-03-30  5:56 ` Andrei Borzenkov
2020-03-30  8:11   ` Brad Templeton
2020-03-30  8:35     ` Tomasz Chmielewski
2020-03-31  4:20     ` Zygo Blaxell
