* Filesystem sometimes Hangs
@ 2021-03-29 11:11 Hendrik Friedel
  2021-03-29 18:07 ` Chris Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Hendrik Friedel @ 2021-03-29 11:11 UTC (permalink / raw)
  To: linux-btrfs

Hello,

I have a filesystem that is sometimes very slow and currently even
hangs while deleting a file (a plain and simple rm in bash).

Label: 'DataPool1'  uuid: c4a6a2c9-5cf0-49b8-812a-0784953f9ba3
         Total devices 2 FS bytes used 5.65TiB
         devid    1 size 7.28TiB used 6.71TiB path /dev/sda1
         devid    2 size 7.28TiB used 6.71TiB path /dev/sdh1

I did run a scrub without errors.

Checking the logs, I find:
dmesg -T |grep -i btrfs
[Mo Mär 29 09:29:16 2021] Btrfs loaded, crc32c=crc32c-intel
[Mo Mär 29 09:29:16 2021] BTRFS: device label DataPool1 devid 2 transid 649014 /dev/sdh1 scanned by btrfs (213)
[Mo Mär 29 09:29:16 2021] BTRFS: device label DataPool1 devid 1 transid 649014 /dev/sda1 scanned by btrfs (213)
[Mo Mär 29 09:29:16 2021] BTRFS: device label Daten devid 1 transid 254377 /dev/sdd2 scanned by btrfs (213)
[Mo Mär 29 09:29:16 2021] BTRFS: device label DockerImages devid 1 transid 209067 /dev/sdc2 scanned by btrfs (213)
[Mo Mär 29 09:29:21 2021] BTRFS info (device sda1): disk space caching is enabled
[Mo Mär 29 09:29:21 2021] BTRFS info (device sda1): has skinny extents
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): enabling ssd optimizations
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): disk space caching is enabled
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): has skinny extents
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): turning on sync discard
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): enabling ssd optimizations
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): disk space caching is enabled
[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): has skinny extents
[Mo Mär 29 09:29:22 2021] BTRFS info (device sda1): bdev /dev/sda1 errs: wr 133, rd 133, flush 0, corrupt 0, gen 1

Maybe the last line is concerning?


Syslog tells me:
Mar 28 20:22:19 homeserver kernel: [1297978.357508] task:btrfs-cleaner   state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
Mar 28 20:22:19 homeserver kernel: [1297978.357547]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357564]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357577]  btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357594]  ? btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357609]  btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357622]  cleaner_kthread+0xfa/0x120 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.357636]  ? btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.360473]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.360488]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.360503]  btrfs_create+0x58/0x1f0 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.363057]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.363072]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:22:19 homeserver kernel: [1297978.363086]  btrfs_rmdir+0x5c/0x180 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024321] task:btrfs-cleaner   state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
Mar 28 20:26:20 homeserver kernel: [1298220.024382]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024419]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024442]  btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024476]  ? btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024504]  btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024531]  cleaner_kthread+0xfa/0x120 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.024558]  ? btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.030300]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.030331]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:26:20 homeserver kernel: [1298220.030361]  btrfs_create+0x58/0x1f0 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854109] task:btrfs-cleaner   state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
Mar 28 20:28:21 homeserver kernel: [1298340.854151]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854169]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854183]  btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854202]  ? btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854218]  btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854232]  cleaner_kthread+0xfa/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854247]  ? btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857610]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857627]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857643]  btrfs_create+0x58/0x1f0 [btrfs]
Mar 28 20:58:34 homeserver kernel: [1300153.336160] task:btrfs-transacti state:D stack:    0 pid:20080 ppid:     2 flags:0x00004000
Mar 28 20:58:34 homeserver kernel: [1300153.336215]  btrfs_commit_transaction+0x92b/0xa50 [btrfs]
Mar 28 20:58:34 homeserver kernel: [1300153.336246]  transaction_kthread+0x15d/0x180 [btrfs]
Mar 28 20:58:34 homeserver kernel: [1300153.336273]  ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]


What could I do to find the cause?

Regards,
Hendrik



* Re: Filesystem sometimes Hangs
  2021-03-29 11:11 Filesystem sometimes Hangs Hendrik Friedel
@ 2021-03-29 18:07 ` Chris Murphy
  2021-03-30 12:50   ` Re[2]: " Hendrik Friedel
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2021-03-29 18:07 UTC (permalink / raw)
  To: Hendrik Friedel; +Cc: Btrfs BTRFS

On Mon, Mar 29, 2021 at 5:12 AM Hendrik Friedel <hendrik@friedels.name> wrote:
>
> Hello,
>
> I have a filesystem which is sometimes very slow, or even currently
> hangs deleting a file (plain and simple rm in bash).
>
> Label: 'DataPool1'  uuid: c4a6a2c9-5cf0-49b8-812a-0784953f9ba3
>          Total devices 2 FS bytes used 5.65TiB
>          devid    1 size 7.28TiB used 6.71TiB path /dev/sda1
>          devid    2 size 7.28TiB used 6.71TiB path /dev/sdh1
>
> I did run a scrub without errors.
>
> Checking the logs, I find:
> dmesg -T |grep -i btrfs
> [Mo Mär 29 09:29:16 2021] Btrfs loaded, crc32c=crc32c-intel
> [Mo Mär 29 09:29:16 2021] BTRFS: device label DataPool1 devid 2 transid
> 649014 /dev/sdh1 scanned by btrfs (213)
> [Mo Mär 29 09:29:16 2021] BTRFS: device label DataPool1 devid 1 transid
> 649014 /dev/sda1 scanned by btrfs (213)
> [Mo Mär 29 09:29:16 2021] BTRFS: device label Daten devid 1 transid
> 254377 /dev/sdd2 scanned by btrfs (213)
> [Mo Mär 29 09:29:16 2021] BTRFS: device label DockerImages devid 1
> transid 209067 /dev/sdc2 scanned by btrfs (213)
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sda1): disk space caching
> is enabled
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sda1): has skinny extents
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): enabling ssd
> optimizations
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): disk space caching
> is enabled
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdd2): has skinny extents
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): turning on sync
> discard
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): enabling ssd
> optimizations
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): disk space caching
> is enabled
> [Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): has skinny extents
> [Mo Mär 29 09:29:22 2021] BTRFS info (device sda1): bdev /dev/sda1 errs:
> wr 133, rd 133, flush 0, corrupt 0, gen 1
>
> Maybe, the last line is concerning?

Yes. Do a 'btrfs scrub' and check dmesg for detailed errors. Next, run
'btrfs check --readonly' (this must be done offline, i.e. booted from a
USB stick). If it all comes up without errors or problems, you can
zero the statistics with 'btrfs dev stats -z'. Otherwise we need
to see the errors to know what's going wrong. It's not normal to have
either read or write errors. They could be related to the problem, or
be an additional problem.
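
For reference, the sequence above could be sketched as a small shell
script. This is only a sketch under assumptions: the mount point and
device are the ones reported for DataPool1 in this thread, the helper
names (run, scrub_and_report, offline_check, show_stats, zero_stats,
nonzero_counters) are made up for illustration, and every real
invocation needs root. Nothing runs until a function is called, and
setting DRY_RUN prints the commands instead of executing them.

```shell
# Assumed paths, taken from the report in this thread; adjust before use.
MNT=${MNT:-/srv/dev-disk-by-label-DataPool1}
DEV=${DEV:-/dev/sda1}

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Step 1: scrub in the foreground (-B), then print the scrub summary.
# Detailed per-block errors, if any, show up in dmesg.
scrub_and_report() {
    run btrfs scrub start -B "$MNT"
    run btrfs scrub status "$MNT"
}

# Step 2: read-only consistency check. The file system must be unmounted.
offline_check() {
    run btrfs check --readonly "$DEV"
}

# Step 3: show the per-device error counters, or print and reset them (-z).
show_stats() { run btrfs device stats "$MNT"; }
zero_stats() { run btrfs device stats -z "$MNT"; }

# Filter 'btrfs device stats' output on stdin down to nonzero counters.
nonzero_counters() {
    awk 'NF == 2 && $2 + 0 != 0 { print }'
}
```

Typical use would be scrub_and_report first, then offline_check from a
rescue system, then 'show_stats | nonzero_counters', and zero_stats only
once everything comes back clean.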


>
>
> Syslog tells me:
> Mar 28 20:22:19 homeserver kernel: [1297978.357508] task:btrfs-cleaner
> state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
> Mar 28 20:22:19 homeserver kernel: [1297978.357547]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357564]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357577]
> btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357594]  ?
> btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357609]
> btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357622]
> cleaner_kthread+0xfa/0x120 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.357636]  ?
> btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.360473]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.360488]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.360503]
> btrfs_create+0x58/0x1f0 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.363057]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.363072]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:22:19 homeserver kernel: [1297978.363086]
> btrfs_rmdir+0x5c/0x180 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024321] task:btrfs-cleaner
> state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
> Mar 28 20:26:20 homeserver kernel: [1298220.024382]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024419]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024442]
> btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024476]  ?
> btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024504]
> btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024531]
> cleaner_kthread+0xfa/0x120 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.024558]  ?
> btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.030300]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.030331]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:26:20 homeserver kernel: [1298220.030361]
> btrfs_create+0x58/0x1f0 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854109] task:btrfs-cleaner
> state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
> Mar 28 20:28:21 homeserver kernel: [1298340.854151]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854169]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854183]
> btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854202]  ?
> btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854218]
> btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854232]
> cleaner_kthread+0xfa/0x120 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.854247]  ?
> btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.857610]
> wait_current_trans+0xc2/0x120 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.857627]
> start_transaction+0x46d/0x540 [btrfs]
> Mar 28 20:28:21 homeserver kernel: [1298340.857643]
> btrfs_create+0x58/0x1f0 [btrfs]
> Mar 28 20:58:34 homeserver kernel: [1300153.336160] task:btrfs-transacti
> state:D stack:    0 pid:20080 ppid:     2 flags:0x00004000
> Mar 28 20:58:34 homeserver kernel: [1300153.336215]
> btrfs_commit_transaction+0x92b/0xa50 [btrfs]
> Mar 28 20:58:34 homeserver kernel: [1300153.336246]
> transaction_kthread+0x15d/0x180 [btrfs]
> Mar 28 20:58:34 homeserver kernel: [1300153.336273]  ?
> btrfs_cleanup_transaction+0x590/0x590 [btrfs]
>
>
> What could I do to find the cause?

What kernel version?

-- 
Chris Murphy


* Re[2]: Filesystem sometimes Hangs
  2021-03-29 18:07 ` Chris Murphy
@ 2021-03-30 12:50   ` Hendrik Friedel
  2021-03-31  6:27     ` Chris Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Hendrik Friedel @ 2021-03-30 12:50 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

Hello,

Thanks for your reply, Chris.
>>  [Mo Mär 29 09:29:22 2021] BTRFS info (device sda1): bdev /dev/sda1 errs:
>>  wr 133, rd 133, flush 0, corrupt 0, gen 1
>>
>>  Maybe, the last line is concerning?
>
>Yes. Do a 'btrfs scrub' and check dmesg for detailed errors.
[Mo Mär 29 09:29:22 2021] BTRFS info (device sda1): bdev /dev/sda1 errs: wr 133, rd 133, flush 0, corrupt 0, gen 1
[Mo Mär 29 13:10:39 2021] BTRFS info (device sda1): scrub: started on devid 1
[Mo Mär 29 13:10:39 2021] BTRFS info (device sda1): scrub: started on devid 2
[Mo Mär 29 23:30:49 2021] BTRFS info (device sda1): scrub: not finished on devid 2 with status: -125
[Mo Mär 29 23:30:50 2021] BTRFS info (device sda1): scrub: not finished on devid 1 with status: -125
[Di Mär 30 00:04:07 2021] BTRFS info (device sda1): scrub: started on devid 1
[Di Mär 30 00:04:07 2021] BTRFS info (device sda1): scrub: started on devid 2
[Di Mär 30 02:50:09 2021] BTRFS info (device sda1): scrub: finished on devid 1 with status: 0
[Di Mär 30 04:36:59 2021] BTRFS info (device sda1): scrub: finished on devid 2 with status: 0

There is nothing else related to btrfs.

What I find in syslog:
Mar 28 20:28:21 homeserver kernel: [1298340.851140] INFO: task btrfs-cleaner:20078 blocked for more than 241 seconds.
Mar 28 20:28:21 homeserver kernel: [1298340.854109] task:btrfs-cleaner   state:D stack:    0 pid:20078 ppid:     2 flags:0x00004000
Mar 28 20:28:21 homeserver kernel: [1298340.854151]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854169]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854183]  btrfs_drop_snapshot+0x90/0x7f0 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854202]  ? btrfs_delete_unused_bgs+0x3e/0x850 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854218]  btrfs_clean_one_deleted_snapshot+0xd7/0x130 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854232]  cleaner_kthread+0xfa/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.854247]  ? btrfs_alloc_root+0x3d0/0x3d0 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857610]  wait_current_trans+0xc2/0x120 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857627]  start_transaction+0x46d/0x540 [btrfs]
Mar 28 20:28:21 homeserver kernel: [1298340.857643]  btrfs_create+0x58/0x1f0 [btrfs]



>  Next
>'btrfs check --readonly' (must be done offline ie booted from usb
>stick). And if it all comes up without errors or problems, you can
>zero the statistics with 'btrfs dev stats -z'.
No errors found, neither in btrfs check nor in scrub.
So, shall I reset the stats then?

>But otherwise we need
>to see the errors to know what's going wrong. It's not normal to have
>either read or write errors. It could be related to the problem, or an
>additional problem.

>
>>  Mar 28 20:58:34 homeserver kernel: [1300153.336273]  ?
>>  btrfs_cleanup_transaction+0x590/0x590 [btrfs]
>>
>>
>>  What could I do to find the cause?
>
>What kernel version?

5.10.0-0.bpo.3-amd64

Best regards,
Hendrik



* Re: Re[2]: Filesystem sometimes Hangs
  2021-03-30 12:50   ` Re[2]: " Hendrik Friedel
@ 2021-03-31  6:27     ` Chris Murphy
  2021-03-31 14:03       ` Re[4]: " Hendrik Friedel
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2021-03-31  6:27 UTC (permalink / raw)
  To: Hendrik Friedel; +Cc: Chris Murphy, Btrfs BTRFS

On Tue, Mar 30, 2021 at 6:50 AM Hendrik Friedel <hendrik@friedels.name> wrote:

> >  Next
> >'btrfs check --readonly' (must be done offline ie booted from usb
> >stick). And if it all comes up without errors or problems, you can
> >zero the statistics with 'btrfs dev stats -z'.
> No error found. Neither in btrfs check, nor in scrub.
> So, shall I reset the stats then?

Up to you. It's probably better to zero them because it's obvious if
the numbers change from 0, there's a problem.


> 5.10.0-0.bpo.3-amd64

It's probably OK. I'm not sure what upstream stable version this
translates into, but the current stable releases are 5.10.27 and
5.11.11. There have been multiple btrfs bug fixes since 5.10.0 was
released.
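
To find out which upstream release a distribution kernel corresponds
to, Debian-style kernels usually embed the package version in the
output of 'uname -v'. A minimal sketch, assuming that string format;
the helper names upstream_version and show_kernel_versions are
hypothetical, and any sample string fed to them is for illustration
only, not taken from the machine in this thread.

```shell
# Extract the upstream kernel version from a Debian-style 'uname -v' string
# read on stdin. Assumes the string contains "Debian <version>-<revision>".
# Prints nothing if the format does not match.
upstream_version() {
    sed -n 's/.*Debian \([0-9][0-9.]*\)-.*/\1/p'
}

# Show the running kernel's release string and, if the format matches,
# the upstream version it was built from.
show_kernel_versions() {
    echo "release:  $(uname -r)"
    echo "upstream: $(uname -v | upstream_version)"
}
```

On other distributions the 'upstream' line may simply come out empty,
in which case the package changelog is the place to look.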

I missed in your first email this line:

>[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): turning on sync discard

Remove the discard mount option for this file system and see if that
fixes the problem. Run it for a week or two, or until you're certain
the problem is still happening (or certain it's gone). Some drives
just can't handle sync discards, they become really slow and hang,
just like you're reporting. It's probably adequate to just enable the
fstrim.timer, part of util-linux, which runs once per week. If you
have really heavy write and delete workloads, you might benefit from
discard=async mount option (async instead of sync). But first you
should just not do any discards at all for a while to see if that's
the problem and then deliberately re-introduce just that one single
change so you can monitor it for problems.
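
One way to carry out this experiment, sketched in shell. Assumptions:
the helper names are hypothetical, the mount point passed in is yours
to fill in, and a discard entry in /etc/fstab still has to be removed
by hand so the change survives a reboot. 'nodiscard' is the btrfs mount
option that turns discards off; a plain 'discard' in the option list is
the synchronous variant, while 'discard=async' is the asynchronous one.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# List all mounted btrfs file systems with their active mount options.
list_btrfs_mounts() {
    findmnt -t btrfs -n -o SOURCE,TARGET,OPTIONS
}

# Succeed if the comma-separated option string $1 enables synchronous discard.
uses_sync_discard() {
    case ",$1," in
        *,discard,*|*,discard=sync,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Turn discard off on the mounted file system at $1 (lasts until the next mount).
disable_discard() {
    run mount -o remount,nodiscard "$1"
}

# Enable the periodic trim timer as a replacement for online discard.
enable_periodic_trim() {
    run systemctl enable --now fstrim.timer
}
```

Start with list_btrfs_mounts to see which file systems actually have
discard set, then change one thing at a time so that any difference in
behaviour can be attributed to it.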



-- 
Chris Murphy


* Re[4]: Filesystem sometimes Hangs
  2021-03-31  6:27     ` Chris Murphy
@ 2021-03-31 14:03       ` Hendrik Friedel
  2021-03-31 20:11         ` Chris Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Hendrik Friedel @ 2021-03-31 14:03 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Chris Murphy, Btrfs BTRFS

Hello Chris,

thanks again for your reply.

>
>>  5.10.0-0.bpo.3-amd64
>
>It's probably OK. I'm not sure what upstream stable version this
>translates into, but current stable are 5.10.27 and 5.11.11. There
>have been multiple btrfs bug fixes since 5.10.0 was released.
>
>I missed in your first email this line:
Ok, I am compiling 5.11.11.
>>[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): turning on sync discard
>
>Remove the discard mount option for this file system and see if that
>fixes the problem. Run it for a week or two, or until you're certain
>the problem is still happening (or certain it's gone). Some drives
>just can't handle sync discards, they become really slow and hang,
>just like you're reporting.

In fstab, this option is not set:
/dev/disk/by-label/DataPool1            /srv/dev-disk-by-label-DataPool1 
        btrfs   noatime,defaults,nofail 0 2

How do I deactivate discard then?
These drives are spinning disks. I thought that discard is only relevant 
for SSDs?

Regards,
Hendrik

>
>



* Re: Re[4]: Filesystem sometimes Hangs
  2021-03-31 14:03       ` Re[4]: " Hendrik Friedel
@ 2021-03-31 20:11         ` Chris Murphy
  2021-04-03 16:12           ` Re[6]: " Hendrik Friedel
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2021-03-31 20:11 UTC (permalink / raw)
  To: Hendrik Friedel; +Cc: Chris Murphy, Btrfs BTRFS

On Wed, Mar 31, 2021 at 8:03 AM Hendrik Friedel <hendrik@friedels.name> wrote:


> >>[Mo Mär 29 09:29:21 2021] BTRFS info (device sdc2): turning on sync discard
> >
> >Remove the discard mount option for this file system and see if that
> >fixes the problem. Run it for a week or two, or until you're certain
> >the problem is still happening (or certain it's gone). Some drives
> >just can't handle sync discards, they become really slow and hang,
> >just like you're reporting.
>
> In fstab, this option is not set:
> /dev/disk/by-label/DataPool1            /srv/dev-disk-by-label-DataPool1
>         btrfs   noatime,defaults,nofail 0 2

You have more than one btrfs file system. I'm suggesting not using
discard on any of them to try and narrow down the problem.  Something
is turning on discards for sdc2, find it and don't use it for a while.


> How do I deactivate discard then?
> These drives are spinning disks. I thought that discard is only relevant
> for SSDs?

It's relevant for thin provisioning and sparse files too. But if sdc2
is an HDD, then the sync discard message isn't related to the problem.
It also makes me wonder why something is enabling sync discards on an
HDD.

Anyway, I think you're on the right track to try 5.11.11. If you
experience a hang again, use sysrq+w, which will dump the blocked
task trace into dmesg. Also include a description of the workload at
the time of the hang, and recent commands issued.
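
If no keyboard is attached, the same dump can be triggered from a root
shell. A rough sketch, with hypothetical helper names; writing 'w' to
/proc/sysrq-trigger asks the kernel to log all blocked (state D) tasks,
and it works for root even when the keyboard SysRq combination is
disabled.

```shell
# Ask the kernel to dump all blocked (state D) tasks into the kernel log.
# Needs root.
dump_blocked_tasks() {
    echo w > /proc/sysrq-trigger
}

# Filter kernel log text on stdin down to hung-task reports and the
# headers of tasks in state D.
hung_task_lines() {
    grep -E 'blocked for more than [0-9]+ seconds|state:D'
}

# Dump blocked tasks, then save the full kernel log to the file named in $1
# so it can be attached to a reply.
capture_hang_report() {
    dump_blocked_tasks && dmesg -T > "$1"
}
```

For example, 'capture_hang_report hang.txt' during a hang, followed by
'hung_task_lines < hang.txt' to get a quick overview of which tasks are
stuck. Send the full file rather than the filtered view, since the
complete stack traces are what matters.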



-- 
Chris Murphy


* Re[6]: Filesystem sometimes Hangs
  2021-03-31 20:11         ` Chris Murphy
@ 2021-04-03 16:12           ` Hendrik Friedel
  0 siblings, 0 replies; 7+ messages in thread
From: Hendrik Friedel @ 2021-04-03 16:12 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Chris Murphy, Btrfs BTRFS

Hello Chris,

thanks for your reply.

>>  >Remove the discard mount option for this file system and see if that
>>  >fixes the problem. Run it for a week or two, or until you're certain
>>  >the problem is still happening (or certain it's gone). Some drives
>>  >just can't handle sync discards, they become really slow and hang,
>>  >just like you're reporting.
>>
>>  In fstab, this option is not set:
>>  /dev/disk/by-label/DataPool1            /srv/dev-disk-by-label-DataPool1
>>          btrfs   noatime,defaults,nofail 0 2
>
>You have more than one btrfs file system.
Indeed:
mount | grep btrfs
/dev/sdc2 on /srv/dev-disk-by-label-DockerImages type btrfs 
(rw,noatime,ssd,discard,space_cache,subvolid=5,subvol=/)
/dev/sdd2 on /srv/dev-disk-by-label-Daten type btrfs 
(rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
/dev/sda1 on /srv/dev-disk-by-label-DataPool1 type btrfs 
(rw,noatime,space_cache,subvolid=5,subvol=/)

You can see that "ssd" is in the mount options for sdc2 and sdd2. For
sdc2, discard is also present. I did not add this myself, but it is in
fstab.
In fact, I got confused in my previous mail: sdc is an SSD and not the
drive that I was experiencing problems with. (Well, the system was slow
when accessing sda1; I cannot rule out that this was caused by sdc2.)


>I'm suggesting not using
>discard on any of them to try and narrow down the problem.  Something
>is turning on discards for sdc2, find it and don't use it for a while.
Will do.
>
>>  How do I deactivate discard then?
>>  These drives are spinning disks. I thought that discard is only relevant
>>  for SSDs?
>
>It's relevant for thin provisioning and sparse files too. But if sdc2
>is a HDD then the sync discard message isn't related to the problem,
>but also makes me wonder why something is enabling sync discards on a
>HDD?
See above. It was my mistake to think that sdc2 was the spinning disk.

>Anyway I think you're on the right track to try 5.11.11 and if you
>experience a hang again, use sysrq+w and that will dump the blocked
>task trace into dmesg. Also include a description of the workload at
>the time of the hang, and recent commands issued.
Ok, will do.

Thanks,
Hendrik
>
>
>
>
>--
>Chris Murphy


