linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
@ 2023-01-26 11:17 Linux kernel regression tracking (Thorsten Leemhuis)
  2023-02-16 17:43 ` Thorsten Leemhuis
  0 siblings, 1 reply; 6+ messages in thread
From: Linux kernel regression tracking (Thorsten Leemhuis) @ 2023-01-26 11:17 UTC (permalink / raw)
  To: David Sterba
  Cc: Chris Mason, Josef Bacik, linux-btrfs,
	Linux kernel regressions list, LKML, Wyatt Childers

Hi, this is your Linux kernel regression tracker.

I noticed a regression report in bugzilla.kernel.org. As many (most?)
kernel developer don't keep an eye on it, I decided to forward it by
mail. Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=216961 :

>  Wyatt Childers 2023-01-23 17:44:13 UTC
> 
> There seems to be an issue with btrfs in Kernel 6 (first seen in at least 6.0.12-300.fc37.x86_64, possibly earlier) and still observed in 6.1 (6.1.5-200.fc37.x86_64).
> 
> This has manifested in prolonged user space hangs. I first noticed this in internal tooling for my employer where simple IO operations like a trivial cp commands that typically take under 100ms (when run with many other concurrent IO operations of a similar ilk) are observing "scheduling starvation" being stalled in a disk sleep state for nearly a minute.
> 
> This also seems to be manifesting in hangs to Kwin that correlate with btrfs usage and things as routine and trivial (even on high performance systems) as system updates https://bugs.kde.org/show_bug.cgi?id=463353
> 
> I would also not be surprised if this behavior is responsible for things like this reddit thread: https://www.reddit.com/r/btrfs/comments/102or0t/btrfs_on_luks_whole_system_freezing_under_high_io/
> 
> [reply] [−] Comment 1 Wyatt Childers 2023-01-23 17:57:47 UTC
> 
> As additional context, here are some sample kernel traces (echo w > /proc/sysrq-trigger):
> 
> Dec 16 14:10:02 localhost kernel: sysrq: Show Blocked State
> Dec 16 14:10:02 localhost kernel: task:kworker/u64:6   state:D stack:    0 pid:2673507 ppid:     2 flags:0x00004000
> Dec 16 14:10:02 localhost kernel: Workqueue: btrfs-worker btrfs_work_helper
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  ? __sbitmap_get_word+0x37/0x80
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  blk_mq_get_tag+0x10c/0x290
> Dec 16 14:10:02 localhost kernel:  ? dequeue_task_stop+0x70/0x70
> Dec 16 14:10:02 localhost kernel:  __blk_mq_alloc_requests+0x183/0x2b0
> Dec 16 14:10:02 localhost kernel:  blk_mq_submit_bio+0x2b7/0x5b0
> Dec 16 14:10:02 localhost kernel:  __submit_bio+0xf5/0x180
> Dec 16 14:10:02 localhost kernel:  submit_bio_noacct_nocheck+0x20d/0x2a0
> Dec 16 14:10:02 localhost kernel:  btrfs_submit_bio+0x17a/0x3d0
> Dec 16 14:10:02 localhost kernel:  btrfs_work_helper+0x14b/0x380
> Dec 16 14:10:02 localhost kernel:  process_one_work+0x1c7/0x380
> Dec 16 14:10:02 localhost kernel:  worker_thread+0x4d/0x380
> Dec 16 14:10:02 localhost kernel:  ? _raw_spin_lock_irqsave+0x23/0x50
> Dec 16 14:10:02 localhost kernel:  ? rescuer_thread+0x380/0x380
> Dec 16 14:10:02 localhost kernel:  kthread+0xe9/0x110
> Dec 16 14:10:02 localhost kernel:  ? kthread_complete_and_exit+0x20/0x20
> Dec 16 14:10:02 localhost kernel:  ret_from_fork+0x22/0x30
> Dec 16 14:10:02 localhost kernel:  </TASK>
> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422766 ppid:2410777 flags:0x00000002
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  ? btrfs_delayed_update_inode+0x102/0x1f0
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
> Dec 16 14:10:02 localhost kernel:  ? security_file_ioctl+0x3c/0x60
> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
> Dec 16 14:10:02 localhost kernel:  ? do_user_addr_fault+0x1ef/0x690
> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f07d2d7dbaf
> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffc4ba5bf40 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f07d2d7dbaf
> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
> Dec 16 14:10:02 localhost kernel: RBP: 00007ffc4ba5d8cc R08: 0000000000000001 R09: 00000000000001a4
> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc4ba5c3e0
> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffc4ba5c7b0
> Dec 16 14:10:02 localhost kernel:  </TASK>
> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422815 ppid:2419287 flags:0x00000002
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  ? btrfs_dirty_inode+0x91/0xe0
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f1e8dc22baf
> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffd65479d50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f1e8dc22baf
> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
> Dec 16 14:10:02 localhost kernel: RBP: 00007ffd6547b8cc R08: 0000000000000001 R09: 00000000000001a4
> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd6547a1f0
> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffd6547a5c0
> Dec 16 14:10:02 localhost kernel:  </TASK>
> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422818 ppid:2421995 flags:0x00000002
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ffa2a678baf
> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffe77c03de0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ffa2a678baf
> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
> Dec 16 14:10:02 localhost kernel: RBP: 00007ffe77c058cc R08: 0000000000000001 R09: 00000000000001a4
> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe77c04280
> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffe77c04650
> Dec 16 14:10:02 localhost kernel:  </TASK>
> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422821 ppid:2420167 flags:0x00000002
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  ? _raw_spin_unlock_irqrestore+0x23/0x40
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
> Dec 16 14:10:02 localhost kernel:  ? security_file_ioctl+0x3c/0x60
> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7fafe34febaf
> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007fff79bee930 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fafe34febaf
> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
> Dec 16 14:10:02 localhost kernel: RBP: 00007fff79bef8f6 R08: 0000000000000001 R09: 00000000000001a4
> Dec 16 14:10:02 localhost kernel: R10: 0000000000000004 R11: 0000000000000246 R12: 00007fff79beedd0
> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000002 R15: 00007fff79bef1a0
> Dec 16 14:10:02 localhost kernel:  </TASK>
> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422911 ppid:2401445 flags:0x00000002
> Dec 16 14:10:02 localhost kernel: Call Trace:
> Dec 16 14:10:02 localhost kernel:  <TASK>
> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
> Dec 16 14:10:02 localhost kernel:  ? __btrfs_end_transaction+0xf8/0x240
> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ff280aeebaf
> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffdbfa595c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ff280aeebaf
> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
> Dec 16 14:10:02 localhost kernel: RBP: 00007ffdbfa5a8cc R08: 0000000000000001 R09: 00000000000001a4
> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdbfa59a60
> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffdbfa59e30
> Dec 16 14:10:02 localhost kernel:  </TASK>
See the ticket for more details.


[TLDR for the rest of this mail: I'm adding this report to the list of
tracked Linux kernel regressions; the text you find below is based on a
few templates paragraphs you might have encountered already in similar
form.]

BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it's doesn't fall through the cracks (I for now
assume it was introduced between v5.19..v6.0, even if I don't know for
sure ):

#regzbot introduced: v5.19..v6.0
https://bugzilla.kernel.org/show_bug.cgi?id=216961
#regzbot title: btrfs: severe IO scheduling starvation issues with btrfs
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (e.g. the buzgzilla ticket and maybe this mail as well, if
this thread sees some discussion). See page linked in footer for details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
  2023-01-26 11:17 [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs Linux kernel regression tracking (Thorsten Leemhuis)
@ 2023-02-16 17:43 ` Thorsten Leemhuis
  2023-02-16 22:39   ` Josef Bacik
  0 siblings, 1 reply; 6+ messages in thread
From: Thorsten Leemhuis @ 2023-02-16 17:43 UTC (permalink / raw)
  To: David Sterba
  Cc: Chris Mason, Josef Bacik, linux-btrfs,
	Linux kernel regressions list, LKML, Wyatt Childers, pmacedo

Hi btrfs maintainers and developers.

I might be missing something, but it looks like the report was ignored.
Is there a reason for that?

FWIW, meanwhile another user showed up in the report that apparently is
affected by the same issue:

>  pmacedo 2023-02-07 12:47:31 UTC
> 
> This matches my experience on Debian with kernel from their unstable
> repository (6.1.4-1). I've rolled back to the current stable kernel
> from Bullseye (5.10.0-20-amd64, which according to the changelog is
> based on 5.10.158-2) about a week ago after finding this bug and the
> long IO stalls have disappeared.
> 
> Looks like Debian had 6.0 in backports, so I'll give it a try and
> report back.
> 
> [reply] [−] Comment 5 pmacedo 2023-02-07 19:22:55 UTC
> 
> From my testing, this seems to be between 5.19 and 6.0 - on 6.0 I've
> seen mkdir take up to 40 seconds, while on 5.19 it doesn't seem to
> ever go above half a second.
> 
> My setup is raid 6 for data, raid1c3 for metadata; in my case the
> stalls seem to happen more frequently after a large number of
> deletions and btrfs-transaction will appear to be very busy according
> to iotop, with everything else just stalling.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 26.01.23 12:17, Linux kernel regression tracking (Thorsten Leemhuis)
wrote:
> Hi, this is your Linux kernel regression tracker.
> 
> I noticed a regression report in bugzilla.kernel.org. As many (most?)
> kernel developer don't keep an eye on it, I decided to forward it by
> mail. Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=216961 :
> 
>>  Wyatt Childers 2023-01-23 17:44:13 UTC
>>
>> There seems to be an issue with btrfs in Kernel 6 (first seen in at least 6.0.12-300.fc37.x86_64, possibly earlier) and still observed in 6.1 (6.1.5-200.fc37.x86_64).
>>
>> This has manifested in prolonged user space hangs. I first noticed this in internal tooling for my employer where simple IO operations like a trivial cp commands that typically take under 100ms (when run with many other concurrent IO operations of a similar ilk) are observing "scheduling starvation" being stalled in a disk sleep state for nearly a minute.
>>
>> This also seems to be manifesting in hangs to Kwin that correlate with btrfs usage and things as routine and trivial (even on high performance systems) as system updates https://bugs.kde.org/show_bug.cgi?id=463353
>>
>> I would also not be surprised if this behavior is responsible for things like this reddit thread: https://www.reddit.com/r/btrfs/comments/102or0t/btrfs_on_luks_whole_system_freezing_under_high_io/
>>
>> [reply] [−] Comment 1 Wyatt Childers 2023-01-23 17:57:47 UTC
>>
>> As additional context, here are some sample kernel traces (echo w > /proc/sysrq-trigger):
>>
>> Dec 16 14:10:02 localhost kernel: sysrq: Show Blocked State
>> Dec 16 14:10:02 localhost kernel: task:kworker/u64:6   state:D stack:    0 pid:2673507 ppid:     2 flags:0x00004000
>> Dec 16 14:10:02 localhost kernel: Workqueue: btrfs-worker btrfs_work_helper
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  ? __sbitmap_get_word+0x37/0x80
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  blk_mq_get_tag+0x10c/0x290
>> Dec 16 14:10:02 localhost kernel:  ? dequeue_task_stop+0x70/0x70
>> Dec 16 14:10:02 localhost kernel:  __blk_mq_alloc_requests+0x183/0x2b0
>> Dec 16 14:10:02 localhost kernel:  blk_mq_submit_bio+0x2b7/0x5b0
>> Dec 16 14:10:02 localhost kernel:  __submit_bio+0xf5/0x180
>> Dec 16 14:10:02 localhost kernel:  submit_bio_noacct_nocheck+0x20d/0x2a0
>> Dec 16 14:10:02 localhost kernel:  btrfs_submit_bio+0x17a/0x3d0
>> Dec 16 14:10:02 localhost kernel:  btrfs_work_helper+0x14b/0x380
>> Dec 16 14:10:02 localhost kernel:  process_one_work+0x1c7/0x380
>> Dec 16 14:10:02 localhost kernel:  worker_thread+0x4d/0x380
>> Dec 16 14:10:02 localhost kernel:  ? _raw_spin_lock_irqsave+0x23/0x50
>> Dec 16 14:10:02 localhost kernel:  ? rescuer_thread+0x380/0x380
>> Dec 16 14:10:02 localhost kernel:  kthread+0xe9/0x110
>> Dec 16 14:10:02 localhost kernel:  ? kthread_complete_and_exit+0x20/0x20
>> Dec 16 14:10:02 localhost kernel:  ret_from_fork+0x22/0x30
>> Dec 16 14:10:02 localhost kernel:  </TASK>
>> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422766 ppid:2410777 flags:0x00000002
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  ? btrfs_delayed_update_inode+0x102/0x1f0
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
>> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
>> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
>> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
>> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
>> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
>> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
>> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
>> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
>> Dec 16 14:10:02 localhost kernel:  ? security_file_ioctl+0x3c/0x60
>> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
>> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
>> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
>> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
>> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
>> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
>> Dec 16 14:10:02 localhost kernel:  ? do_user_addr_fault+0x1ef/0x690
>> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
>> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
>> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f07d2d7dbaf
>> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffc4ba5bf40 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f07d2d7dbaf
>> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
>> Dec 16 14:10:02 localhost kernel: RBP: 00007ffc4ba5d8cc R08: 0000000000000001 R09: 00000000000001a4
>> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc4ba5c3e0
>> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffc4ba5c7b0
>> Dec 16 14:10:02 localhost kernel:  </TASK>
>> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422815 ppid:2419287 flags:0x00000002
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  ? btrfs_dirty_inode+0x91/0xe0
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
>> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
>> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
>> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
>> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
>> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
>> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
>> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
>> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
>> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
>> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
>> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
>> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f1e8dc22baf
>> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffd65479d50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f1e8dc22baf
>> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
>> Dec 16 14:10:02 localhost kernel: RBP: 00007ffd6547b8cc R08: 0000000000000001 R09: 00000000000001a4
>> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd6547a1f0
>> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffd6547a5c0
>> Dec 16 14:10:02 localhost kernel:  </TASK>
>> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422818 ppid:2421995 flags:0x00000002
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
>> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
>> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
>> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
>> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
>> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
>> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
>> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
>> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
>> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
>> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
>> Dec 16 14:10:02 localhost kernel:  ? syscall_exit_to_user_mode+0x17/0x40
>> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
>> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ffa2a678baf
>> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffe77c03de0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ffa2a678baf
>> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
>> Dec 16 14:10:02 localhost kernel: RBP: 00007ffe77c058cc R08: 0000000000000001 R09: 00000000000001a4
>> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe77c04280
>> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffe77c04650
>> Dec 16 14:10:02 localhost kernel:  </TASK>
>> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422821 ppid:2420167 flags:0x00000002
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  ? _raw_spin_unlock_irqrestore+0x23/0x40
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
>> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
>> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
>> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
>> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
>> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
>> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
>> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
>> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
>> Dec 16 14:10:02 localhost kernel:  ? security_file_ioctl+0x3c/0x60
>> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
>> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
>> Dec 16 14:10:02 localhost kernel:  ? do_syscall_64+0x67/0x80
>> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7fafe34febaf
>> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007fff79bee930 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fafe34febaf
>> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
>> Dec 16 14:10:02 localhost kernel: RBP: 00007fff79bef8f6 R08: 0000000000000001 R09: 00000000000001a4
>> Dec 16 14:10:02 localhost kernel: R10: 0000000000000004 R11: 0000000000000246 R12: 00007fff79beedd0
>> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000002 R15: 00007fff79bef1a0
>> Dec 16 14:10:02 localhost kernel:  </TASK>
>> Dec 16 14:10:02 localhost kernel: task:cp              state:D stack:    0 pid:2422911 ppid:2401445 flags:0x00000002
>> Dec 16 14:10:02 localhost kernel: Call Trace:
>> Dec 16 14:10:02 localhost kernel:  <TASK>
>> Dec 16 14:10:02 localhost kernel:  __schedule+0x332/0x12c0
>> Dec 16 14:10:02 localhost kernel:  ? __btrfs_end_transaction+0xf8/0x240
>> Dec 16 14:10:02 localhost kernel:  schedule+0x5d/0xe0
>> Dec 16 14:10:02 localhost kernel:  io_schedule+0x42/0x70
>> Dec 16 14:10:02 localhost kernel:  folio_wait_bit_common+0x12d/0x3a0
>> Dec 16 14:10:02 localhost kernel:  ? filemap_alloc_folio+0xc0/0xc0
>> Dec 16 14:10:02 localhost kernel:  folio_wait_writeback+0x28/0x80
>> Dec 16 14:10:02 localhost kernel:  __filemap_fdatawait_range+0x7f/0x100
>> Dec 16 14:10:02 localhost kernel:  filemap_fdatawait_range+0xe/0x20
>> Dec 16 14:10:02 localhost kernel:  btrfs_wait_ordered_range+0x7a/0x120
>> Dec 16 14:10:02 localhost kernel:  btrfs_remap_file_range+0x139/0x550
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ovl_copyfile+0x15d/0x180 [overlay]
>> Dec 16 14:10:02 localhost kernel:  ovl_remap_file_range+0x6e/0xa0 [overlay]
>> Dec 16 14:10:02 localhost kernel:  do_clone_file_range+0xe9/0x230
>> Dec 16 14:10:02 localhost kernel:  vfs_clone_file_range+0x4d/0x140
>> Dec 16 14:10:02 localhost kernel:  ioctl_file_clone+0x49/0xb0
>> Dec 16 14:10:02 localhost kernel:  do_vfs_ioctl+0x77/0x950
>> Dec 16 14:10:02 localhost kernel:  __x64_sys_ioctl+0x6e/0xd0
>> Dec 16 14:10:02 localhost kernel:  do_syscall_64+0x5b/0x80
>> Dec 16 14:10:02 localhost kernel:  ? exc_page_fault+0x70/0x170
>> Dec 16 14:10:02 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ff280aeebaf
>> Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffdbfa595c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ff280aeebaf
>> Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
>> Dec 16 14:10:02 localhost kernel: RBP: 00007ffdbfa5a8cc R08: 0000000000000001 R09: 00000000000001a4
>> Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdbfa59a60
>> Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffdbfa59e30
>> Dec 16 14:10:02 localhost kernel:  </TASK>
> See the ticket for more details.
> 
> 
> [TLDR for the rest of this mail: I'm adding this report to the list of
> tracked Linux kernel regressions; the text you find below is based on a
> few templates paragraphs you might have encountered already in similar
> form.]
> 
> BTW, let me use this mail to also add the report to the list of tracked
> regressions to ensure it's doesn't fall through the cracks (I for now
> assume it was introduced between v5.19..v6.0, even if I don't know for
> sure ):
> 
> #regzbot introduced: v5.19..v6.0
> https://bugzilla.kernel.org/show_bug.cgi?id=216961
> #regzbot title: btrfs: severe IO scheduling starvation issues with btrfs
> #regzbot ignore-activity
> 
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply and tell me -- ideally
> while also telling regzbot about it, as explained by the page listed in
> the footer of this mail.
> 
> Developers: When fixing the issue, remember to add 'Link:' tags pointing
> to the report (e.g. the buzgzilla ticket and maybe this mail as well, if
> this thread sees some discussion). See page linked in footer for details.
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
  2023-02-16 17:43 ` Thorsten Leemhuis
@ 2023-02-16 22:39   ` Josef Bacik
  2023-02-17  5:17     ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2023-02-16 22:39 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: David Sterba, Chris Mason, linux-btrfs, LKML, Wyatt Childers, pmacedo

On Thu, Feb 16, 2023 at 12:43 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
>
> Hi btrfs maintainers and developers.
>
> I might be missing something, but it looks like the report was ignored.
> Is there a reason for that?
>

Mostly just being blind.  I've responded to the BZ, I'll investigate
through the BZ.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
  2023-02-16 22:39   ` Josef Bacik
@ 2023-02-17  5:17     ` Linux regression tracking (Thorsten Leemhuis)
  2023-02-28 19:40       ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-02-17  5:17 UTC (permalink / raw)
  To: Josef Bacik, Linux regressions mailing list
  Cc: David Sterba, Chris Mason, linux-btrfs, LKML, Wyatt Childers, pmacedo

On 16.02.23 23:39, Josef Bacik wrote:
> On Thu, Feb 16, 2023 at 12:43 PM Thorsten Leemhuis
> <regressions@leemhuis.info> wrote:
>>
>> Hi btrfs maintainers and developers.
>>
>> I might be missing something, but it looks like the report was ignored.
>> Is there a reason for that?
> 
> Mostly just being blind.

Happens.

>  I've responded to the BZ, I'll investigate
> through the BZ.

Great, many thx!

Ciao, THorsten

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
  2023-02-17  5:17     ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-02-28 19:40       ` David Sterba
  2023-03-01  7:34         ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2023-02-28 19:40 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Josef Bacik, David Sterba, Chris Mason, linux-btrfs, LKML,
	Wyatt Childers, pmacedo

On Fri, Feb 17, 2023 at 06:17:58AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.02.23 23:39, Josef Bacik wrote:
> > On Thu, Feb 16, 2023 at 12:43 PM Thorsten Leemhuis
> > <regressions@leemhuis.info> wrote:
> >>
> >> Hi btrfs maintainers and developers.
> >>
> >> I might be missing something, but it looks like the report was ignored.
> >> Is there a reason for that?
> > 
> > Mostly just being blind.
> 
> Happens.
> 
> >  I've responded to the BZ, I'll investigate
> > through the BZ.

According to the bug it's a problem in BFQ.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs
  2023-02-28 19:40       ` David Sterba
@ 2023-03-01  7:34         ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 6+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-03-01  7:34 UTC (permalink / raw)
  To: Paolo Valente, Jens Axboe
  Cc: Josef Bacik, David Sterba, Chris Mason, linux-btrfs, LKML,
	Wyatt Childers, pmacedo, Linux regressions mailing list, dsterba

On 28.02.23 20:40, David Sterba wrote:
> On Fri, Feb 17, 2023 at 06:17:58AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 16.02.23 23:39, Josef Bacik wrote:
>>> On Thu, Feb 16, 2023 at 12:43 PM Thorsten Leemhuis
>>> <regressions@leemhuis.info> wrote:
>>>>
>>>> I might be missing something, but it looks like the report was ignored.
>>>> Is there a reason for that?
>>> Mostly just being blind.
>> Happens.
>>
>>>  I've responded to the BZ, I'll investigate
>>> through the BZ.
> According to the bug it's a problem in BFQ.

Yeah, I noticed yesterday, after I looked into the issue again, as it
looked stalled. But as I already wrote in a comment in the ticket
yesterday:

Did anyone tell the bfq developers about this?

Doesn't look like it. Adding them to the list of recipients them now,
even if they might know about it. If not:

Paolo, Jens, there seems to be a regression in BFQ likely introduced
between 5.19 and 6.0 that people apparently notice often with Btrfs. For
details see:
https://bugzilla.kernel.org/show_bug.cgi?id=216961

Josef apparently has seen report about this as well where switching to
another io scheduler helped:
https://bugzilla.kernel.org/show_bug.cgi?id=216961#c8

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot title: bfq: severe IO scheduling starvation issues with btrfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-03-01  7:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-26 11:17 [Regression] Bug 216961 - Severe IO scheduling starvation issues with btrfs Linux kernel regression tracking (Thorsten Leemhuis)
2023-02-16 17:43 ` Thorsten Leemhuis
2023-02-16 22:39   ` Josef Bacik
2023-02-17  5:17     ` Linux regression tracking (Thorsten Leemhuis)
2023-02-28 19:40       ` David Sterba
2023-03-01  7:34         ` Linux regression tracking (Thorsten Leemhuis)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).