* autodefrag causing freezes under heavy writes?
@ 2021-07-06  3:56 Yan Li
  2021-07-07 19:00 ` David Sterba
       [not found] ` <20210706161908.BE32.409509F4@e16-tech.com>
  0 siblings, 2 replies; 4+ messages in thread
From: Yan Li @ 2021-07-06  3:56 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Yan Li

Hi!

I'm using 5.11.0-22-generic from Ubuntu 21.04. My btrfs is running in
raid1 mode with two SATA SSDs. Mount options
"defaults,ssd,noatime,compress=zstd". Motherboard is ASUS Pro WS
X570-ACE with 32GB ECC RAM and AMD Ryzen 5 5600X. The system has no
other known problems.

I found that when I added the autodefrag mount option, the system
would freeze under heavy write workload for a long time before the
write finished and the system recovered itself, and would occasionally
freeze with a simple sync. During heavy write workloads, dmesg showed:

INFO: task journal-offline:514885 blocked for more than 120 seconds.
      Tainted: P           OE     5.11.0-22-generic #23-Ubuntu
task:journal-offline state:D stack:    0 pid:514885 ppid:     1 flags:0x00000220
Call Trace:
 __schedule+0x23d/0x670
 schedule+0x4f/0xc0
 btrfs_start_ordered_extent+0xdd/0x110 [btrfs]
 ? wait_woken+0x80/0x80
 btrfs_wait_ordered_range+0x120/0x210 [btrfs]
 btrfs_sync_file+0x2d1/0x480 [btrfs]
 vfs_fsync_range+0x49/0x80
 ? __fget_light+0x32/0x80
 __x64_sys_fsync+0x39/0x60
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f49c9178d4b
RSP: 002b:00007f49c5150c50 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 00005589e7e26140 RCX: 00007f49c9178d4b
RDX: 0000000000000002 RSI: 00007f49c94bf497 RDI: 000000000000002c
RBP: 00007f49c94c1db0 R08: 0000000000000000 R09: 00007f49c5151640
R10: 0000000000000017 R11: 0000000000000293 R12: 0000000000000002
R13: 00007ffee4c5c8bf R14: 0000000000000000 R15: 00007f49c5151640

And many similar messages. The heavy write workload was just a dd from
urandom to a file.

The system behaves fine when I remove the autodefrag mount option.

Is this a known problem? If you need any more information, please
kindly let me know.

Thanks!

-- 
Yan


* Re: autodefrag causing freezes under heavy writes?
  2021-07-06  3:56 autodefrag causing freezes under heavy writes? Yan Li
@ 2021-07-07 19:00 ` David Sterba
  2021-07-09 19:10   ` Yan Li
       [not found] ` <20210706161908.BE32.409509F4@e16-tech.com>
  1 sibling, 1 reply; 4+ messages in thread
From: David Sterba @ 2021-07-07 19:00 UTC (permalink / raw)
  To: Yan Li; +Cc: linux-btrfs

On Mon, Jul 05, 2021 at 08:56:23PM -0700, Yan Li wrote:
> I'm using 5.11.0-22-generic from Ubuntu 21.04. My btrfs is running in
> raid1 mode with two SATA SSDs. Mount options
> "defaults,ssd,noatime,compress=zstd". Motherboard is ASUS Pro WS
> X570-ACE with 32GB ECC RAM and AMD Ryzen 5 5600X. The system has no
> other known problems.
> 
> I found that when I added the autodefrag mount option, the system
> would freeze under heavy write workload for a long time before the

Do you have an estimate for 'long time'? Is it human-perceivable
"seconds", or more like 5 seconds and longer?

> write finished and the system recovered itself, and would occasionally
> freeze with a simple sync. During heavy write workloads, dmesg showed:
> 
> INFO: task journal-offline:514885 blocked for more than 120 seconds.
>       Tainted: P           OE     5.11.0-22-generic #23-Ubuntu
> task:journal-offline state:D stack:    0 pid:514885 ppid:     1 flags:0x00000220
> Call Trace:
>  __schedule+0x23d/0x670
>  schedule+0x4f/0xc0
>  btrfs_start_ordered_extent+0xdd/0x110 [btrfs]
>  ? wait_woken+0x80/0x80
>  btrfs_wait_ordered_range+0x120/0x210 [btrfs]
>  btrfs_sync_file+0x2d1/0x480 [btrfs]
>  vfs_fsync_range+0x49/0x80
>  ? __fget_light+0x32/0x80
>  __x64_sys_fsync+0x39/0x60
>  do_syscall_64+0x38/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f49c9178d4b
> RSP: 002b:00007f49c5150c50 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
> RAX: ffffffffffffffda RBX: 00005589e7e26140 RCX: 00007f49c9178d4b
> RDX: 0000000000000002 RSI: 00007f49c94bf497 RDI: 000000000000002c
> RBP: 00007f49c94c1db0 R08: 0000000000000000 R09: 00007f49c5151640
> R10: 0000000000000017 R11: 0000000000000293 R12: 0000000000000002
> R13: 00007ffee4c5c8bf R14: 0000000000000000 R15: 00007f49c5151640
> 
> And many similar messages. The heavy write workload was just a dd from
> urandom to a file.
> 
> The system behaves fine when I remove the autodefrag mount option.
> 
> Is this a known problem? If you need any more information, please
> kindly let me know.

Autodefrag can cause problems like this, yes, but it depends on other
factors too. Autodefrag can read additional pages from disk if they
aren't contiguous and then write them out together (in a small
cluster). You're using compression, so this may add slightly more
delay before the data are written. At the default level it should be
unnoticeable, and you mention that this is on a Ryzen 5, so I'd rule
that out.

IIRC autodefrag can help some workloads but may hurt others, so if it's
making things worse for you, then drop it. It helps when seeks are
expensive, i.e. on rotational disks, but you use SSDs so it should not
be necessary.

If you'd still like to debug it, please take a snapshot of all process
stacks at the time the hang happens.
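
A minimal sketch of one way to collect this, assuming root access and a
kernel that exposes /proc/PID/stack (both are assumptions, and the
function names filter_blocked and dump_blocked_stacks are made up for
illustration). It lists tasks in uninterruptible sleep (state D, as in
the report above) and prints the kernel stack of each one:

```shell
#!/bin/sh
# Read "PID STAT COMMAND" lines on stdin and print the PID and command of
# every task whose state starts with D (uninterruptible sleep).
filter_blocked() {
    while read -r pid stat comm; do
        case "$stat" in
            D*) printf '%s %s\n' "$pid" "$comm" ;;
        esac
    done
}

# Dump the kernel stack of every blocked task. Needs root; /proc/PID/stack
# is only present when the kernel is built to expose it (assumption).
dump_blocked_stacks() {
    ps -eo pid=,stat=,comm= | filter_blocked | while read -r pid comm; do
        printf '== %s (%s)\n' "$pid" "$comm"
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Running "dump_blocked_stacks > stacks.txt" as root while the hang is in
progress should capture the tasks that are stuck waiting on I/O.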


* Re: autodefrag causing freezes under heavy writes?
       [not found] ` <20210706161908.BE32.409509F4@e16-tech.com>
@ 2021-07-09 19:01   ` Yan Li
  0 siblings, 0 replies; 4+ messages in thread
From: Yan Li @ 2021-07-09 19:01 UTC (permalink / raw)
  To: wangyugui; +Cc: linux-btrfs

On Tue, Jul 6, 2021 at 1:19 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
> The message 'blocked for more than 120 seconds.' shows which job is
> blocked.
>
> Is there a message like 'watchdog: BUG: soft lockup - CPU#XX stuck for XXs!'?
> That would show which job is blocking the others.

Nope. There was no such message.

> so a full dmesg is useful.

Here you go
https://pastebin.com/TwChmFmC

> If there is no good info in the full dmesg, the call trace from the
> 'freeze' state is useful too.
>
> echo "t" >/proc/sysrq-trigger  will output the call traces of all jobs.

I'll try next time. I can't reboot this system very often, since it's
a production system.

Thanks!

--
Yan


* Re: autodefrag causing freezes under heavy writes?
  2021-07-07 19:00 ` David Sterba
@ 2021-07-09 19:10   ` Yan Li
  0 siblings, 0 replies; 4+ messages in thread
From: Yan Li @ 2021-07-09 19:10 UTC (permalink / raw)
  To: dsterba, Yan Li, linux-btrfs

On Wed, Jul 7, 2021 at 12:03 PM David Sterba <dsterba@suse.cz> wrote:
> On Mon, Jul 05, 2021 at 08:56:23PM -0700, Yan Li wrote:
> > I found that when I added the autodefrag mount option, the system
> > would freeze under heavy write workload for a long time before the
>
> Do you have an estimate for 'long time'? Is it human-perceivable
> "seconds", or more like 5 seconds and longer?

It would freeze for minutes, during which the GUI was totally
unresponsive. After a few minutes, presumably after the dd was
finished, the machine would resume normal operation.

The full dmesg is here: https://pastebin.com/TwChmFmC

You can see that the blocked I/O messed up journald's output, so that
the messages towards the end are not even ordered by timestamp.

> Autodefrag can cause problems like this, yes, but it depends on other
> factors too. Autodefrag can read additional pages from disk if they
> aren't contiguous and then write them out together (in a small
> cluster). You're using compression, so this may add slightly more
> delay before the data are written. At the default level it should be
> unnoticeable, and you mention that this is on a Ryzen 5, so I'd rule
> that out.

The workload was just a simple:
dd if=/dev/urandom of=test_data bs=1M count=2000
so there should be no reason for it to block for such a long time.
And, yes, it's a new workstation, and it worked flawlessly on much
heavier workloads once autodefrag was removed.
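
To put numbers on the stall, the same workload could be wrapped with
timestamps. This is only a sketch: the function name timed_write is
hypothetical, and it assumes GNU dd (for bs=1M) and a target file on
the affected filesystem. It reports how long the dd and the following
sync each take:

```shell
#!/bin/sh
# Write COUNT MiB of random data to FILE, then report how long the write
# and the following sync took, in whole seconds.
# Arguments: $1 = target file, $2 = size in MiB (2000 in the report above).
timed_write() {
    file=$1
    count=${2:-2000}
    start=$(date +%s)
    dd if=/dev/urandom of="$file" bs=1M count="$count" 2>/dev/null
    after_dd=$(date +%s)
    sync
    after_sync=$(date +%s)
    echo "dd: $((after_dd - start))s, sync: $((after_sync - after_dd))s"
}
```

For example, "timed_write test_data 2000" run once with autodefrag and
once without would give two pairs of durations to compare.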

> IIRC autodefrag can help some workloads but may hurt others, so if it's
> making things worse for you, then drop it. It helps when seeks are
> expensive, i.e. on rotational disks, but you use SSDs so it should not
> be necessary.

I was advised to add it since I'm running VirtualBox VMs from this
btrfs filesystem. But autodefrag made *everything* worse on it. It's
weird.

> If you'd still like to debug it, please take a snapshot of all process
> stacks at the time the hang happens.

This was from before I removed the autodefrag option.
https://pastebin.com/TwChmFmC

It's a production system so I can't reboot it every day. I can try to
add autodefrag back a few days later and retest.
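
A reboot may not be needed for the retest, since btrfs accepts
autodefrag and noautodefrag on a remount. A minimal sketch, assuming a
hypothetical mount point /mnt/data; the helper autodefrag_state is an
illustrative name and only reads a /proc/mounts-style table to confirm
the current state:

```shell
#!/bin/sh
# Print "on" if the given btrfs mount point lists autodefrag among its
# options in a /proc/mounts-style table, "off" otherwise.
# Arguments: $1 = mount point, $2 = mounts table (default /proc/mounts).
autodefrag_state() {
    awk -v mp="$1" '
        $2 == mp && $3 == "btrfs" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "autodefrag") found = 1
        }
        END { print (found ? "on" : "off") }
    ' "${2:-/proc/mounts}"
}

# Toggling at runtime (as root; /mnt/data is a hypothetical mount point):
#   mount -o remount,autodefrag /mnt/data
#   mount -o remount,noautodefrag /mnt/data
```

Checking "autodefrag_state /mnt/data" before and after the remount
would confirm that the option actually changed.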

Thanks!

-- 
Yan
