All of lore.kernel.org
 help / color / mirror / Atom feed
* seeing this stace trace on kernel 5.15
@ 2022-06-28  6:31 Nikhil Kshirsagar
  2022-06-28  7:23 ` Nikhil Kshirsagar
  2022-06-28 14:37 ` Coly Li
  0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-28  6:31 UTC (permalink / raw)
  To: Coly Li, linux-bcache

Hi Coly,

I just kicked off a 20gb random readwrite test using fio to a bcache device,

fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
--direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
--name=iops-test-job --eta-newline=1

# lsblk /dev/sdc /dev/nvme0n1
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sdc           8:32   0 279.4G  0 disk
└─sdc1        8:33   0    60G  0 part
  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
nvme0n1     259:0    0 372.6G  0 disk
└─nvme0n1p1 259:2    0    15G  0 part
  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount

and am seeing this trace in the dmesg,

[ 2475.034909] XFS (bcache0): Ending clean mount
[ 2475.036238] xfs filesystem being mounted at
/home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)

[ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
[ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
ppid:     2 flags:0x00004000
[ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
[ 2783.155259] Call Trace:
[ 2783.155261]  <TASK>
[ 2783.155263]  __schedule+0x23d/0x590
[ 2783.155269]  schedule+0x4e/0xb0
[ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2783.155294]  ? wait_woken+0x70/0x70
[ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
[ 2783.155331]  ? __queue_work+0x211/0x480
[ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
[ 2783.155362]  process_one_work+0x22b/0x3d0
[ 2783.155366]  worker_thread+0x53/0x410
[ 2783.155369]  ? process_one_work+0x3d0/0x3d0
[ 2783.155372]  kthread+0x12a/0x150
[ 2783.155376]  ? set_kthread_struct+0x50/0x50
[ 2783.155379]  ret_from_fork+0x22/0x30
[ 2783.155385]  </TASK>
[ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
[ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
ppid:     2 flags:0x00004000
[ 2784.133544] Workqueue: events update_writeback_rate [bcache]
[ 2784.133588] Call Trace:
[ 2784.133589]  <TASK>
[ 2784.133591]  __schedule+0x23d/0x590
[ 2784.133594]  schedule+0x4e/0xb0
[ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
[ 2784.133600]  down_read+0x43/0x90
[ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
[ 2784.133619]  process_one_work+0x22b/0x3d0
[ 2784.133623]  worker_thread+0x53/0x410
[ 2784.133626]  ? process_one_work+0x3d0/0x3d0
[ 2784.133630]  kthread+0x12a/0x150
[ 2784.133633]  ? set_kthread_struct+0x50/0x50
[ 2784.133636]  ret_from_fork+0x22/0x30
[ 2784.133640]  </TASK>
[ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
[ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
ppid:     2 flags:0x00004000
[ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
[ 2785.114720] Call Trace:
[ 2785.114721]  <TASK>
[ 2785.114723]  __schedule+0x23d/0x590
[ 2785.114726]  schedule+0x4e/0xb0
[ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2785.114753]  ? wait_woken+0x70/0x70
[ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
[ 2785.114789]  ? __queue_work+0x211/0x480
[ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
[ 2785.114818]  process_one_work+0x22b/0x3d0
[ 2785.114822]  worker_thread+0x53/0x410
[ 2785.114826]  ? process_one_work+0x3d0/0x3d0
[ 2785.114829]  kthread+0x12a/0x150
[ 2785.114832]  ? set_kthread_struct+0x50/0x50
[ 2785.114835]  ret_from_fork+0x22/0x30
[ 2785.114839]  </TASK>
[ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
[ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
ppid:     2 flags:0x00004000
[ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
[ 2786.091582] Call Trace:
[ 2786.091584]  <TASK>
[ 2786.091585]  __schedule+0x23d/0x590
[ 2786.091589]  schedule+0x4e/0xb0
[ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
[ 2786.091594]  down_read+0x43/0x90
[ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
[ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
[ 2786.091626]  __submit_bio+0x1a1/0x220
[ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
[ 2786.091634]  ? mutex_lock+0x13/0x40
[ 2786.091637]  submit_bio_noacct+0x4e/0x120
[ 2786.091640]  submit_bio+0x4a/0x130
[ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
[ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
[ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
[ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
[ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
[ 2786.092063]  process_one_work+0x22b/0x3d0
[ 2786.092067]  worker_thread+0x53/0x410
[ 2786.092069]  ? process_one_work+0x3d0/0x3d0
[ 2786.092072]  kthread+0x12a/0x150
[ 2786.092074]  ? set_kthread_struct+0x50/0x50
[ 2786.092077]  ret_from_fork+0x22/0x30
[ 2786.092081]  </TASK>
[ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
[ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
ppid:     2 flags:0x00004000
[ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
[ 2787.077098] Call Trace:
[ 2787.077099]  <TASK>
[ 2787.077101]  __schedule+0x23d/0x590
[ 2787.077104]  schedule+0x4e/0xb0
[ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2787.077116]  ? wait_woken+0x70/0x70
[ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
[ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
[ 2787.077191]  process_one_work+0x22b/0x3d0
[ 2787.077195]  worker_thread+0x53/0x410
[ 2787.077198]  ? process_one_work+0x3d0/0x3d0
[ 2787.077202]  kthread+0x12a/0x150
[ 2787.077205]  ? set_kthread_struct+0x50/0x50
[ 2787.077207]  ret_from_fork+0x22/0x30
[ 2787.077212]  </TASK>
[ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
125 seconds.
[ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
ppid:     2 flags:0x00004000
[ 2788.071080] Call Trace:
[ 2788.071081]  <TASK>
[ 2788.071082]  __schedule+0x23d/0x590
[ 2788.071086]  schedule+0x4e/0xb0
[ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
[ 2788.071105]  ? del_timer_sync+0x6c/0xb0
[ 2788.071109]  down_write+0x43/0x50
[ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
[ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
[ 2788.071158]  kthread+0x12a/0x150
[ 2788.071161]  ? set_kthread_struct+0x50/0x50
[ 2788.071164]  ret_from_fork+0x22/0x30
[ 2788.071168]  </TASK>
[ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
[ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2789.014084] task:fio             state:D stack:    0 pid: 3123
ppid:  2378 flags:0x00004002
[ 2789.014087] Call Trace:
[ 2789.014089]  <TASK>
[ 2789.014090]  __schedule+0x23d/0x590
[ 2789.014106]  schedule+0x4e/0xb0
[ 2789.014108]  io_schedule+0x46/0x70
[ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
[ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
[ 2789.014118]  wait_on_page_bit+0x3f/0x50
[ 2789.014120]  wait_on_page_writeback+0x26/0x80
[ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
[ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
[ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
[ 2789.014207]  vfs_fsync_range+0x49/0x80
[ 2789.014212]  ? __fget_light+0x32/0x80
[ 2789.014217]  __x64_sys_fsync+0x38/0x60
[ 2789.014220]  do_syscall_64+0x5c/0xc0
[ 2789.014223]  ? do_syscall_64+0x69/0xc0
[ 2789.014225]  ? do_syscall_64+0x69/0xc0
[ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
[ 2789.014230]  ? __x64_sys_write+0x19/0x20
[ 2789.014232]  ? do_syscall_64+0x69/0xc0
[ 2789.014234]  ? do_syscall_64+0x69/0xc0
[ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
[ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
[ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
[ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
[ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
[ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
[ 2789.014255]  </TASK>
[ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
[ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
ppid:     2 flags:0x00004000
[ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
[ 2910.130176] Call Trace:
[ 2910.130177]  <TASK>
[ 2910.130179]  __schedule+0x23d/0x590
[ 2910.130182]  schedule+0x4e/0xb0
[ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2910.130219]  ? wait_woken+0x70/0x70
[ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
[ 2910.130255]  ? __queue_work+0x211/0x480
[ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
[ 2910.130285]  process_one_work+0x22b/0x3d0
[ 2910.130288]  worker_thread+0x53/0x410
[ 2910.130292]  ? process_one_work+0x3d0/0x3d0
[ 2910.130295]  kthread+0x12a/0x150
[ 2910.130298]  ? set_kthread_struct+0x50/0x50
[ 2910.130301]  ret_from_fork+0x22/0x30
[ 2910.130305]  </TASK>
[ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
[ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
ppid:     2 flags:0x00004000
[ 2911.104617] Workqueue: events update_writeback_rate [bcache]
[ 2911.104647] Call Trace:
[ 2911.104648]  <TASK>
[ 2911.104649]  __schedule+0x23d/0x590
[ 2911.104652]  schedule+0x4e/0xb0
[ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
[ 2911.104657]  down_read+0x43/0x90
[ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
[ 2911.104676]  process_one_work+0x22b/0x3d0
[ 2911.104680]  worker_thread+0x53/0x410
[ 2911.104683]  ? process_one_work+0x3d0/0x3d0
[ 2911.104687]  kthread+0x12a/0x150
[ 2911.104690]  ? set_kthread_struct+0x50/0x50
[ 2911.104693]  ret_from_fork+0x22/0x30
[ 2911.104697]  </TASK>
[ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
[ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
ppid:     2 flags:0x00004000
[ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
[ 2912.087495] Call Trace:
[ 2912.087497]  <TASK>
[ 2912.087498]  __schedule+0x23d/0x590
[ 2912.087503]  schedule+0x4e/0xb0
[ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2912.087528]  ? wait_woken+0x70/0x70
[ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
[ 2912.087565]  ? __queue_work+0x211/0x480
[ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
[ 2912.087595]  process_one_work+0x22b/0x3d0
[ 2912.087599]  worker_thread+0x53/0x410
[ 2912.087602]  ? process_one_work+0x3d0/0x3d0
[ 2912.087606]  kthread+0x12a/0x150
[ 2912.087609]  ? set_kthread_struct+0x50/0x50
[ 2912.087612]  ret_from_fork+0x22/0x30
[ 2912.087617]  </TASK>

Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,

echo 0 > /sys/block/bcache0/bcache/sequential_cutoff

I had also echoed 0 into congested_read_threshold_us,
congested_write_threshold_us.

echo writeback > /sys/block/bcache0/bcache/cache_mode

Regards,
Nikhil.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-28  6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
@ 2022-06-28  7:23 ` Nikhil Kshirsagar
  2022-06-28 14:37 ` Coly Li
  1 sibling, 0 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-28  7:23 UTC (permalink / raw)
  To: Coly Li, linux-bcache

Able to reproduce this very easily each time I run the same fio test -

[  969.525374] INFO: task kworker/0:6:261 blocked for more than 120 seconds.
[  969.850975]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  970.117328] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  970.497751] task:kworker/0:6     state:D stack:    0 pid:  261
ppid:     2 flags:0x00004000
[  970.497768] Workqueue: events update_writeback_rate [bcache]
[  970.497806] Call Trace:
[  970.497808]  <TASK>
[  970.497810]  __schedule+0x23d/0x590
[  970.497826]  schedule+0x4e/0xb0
[  970.497828]  rwsem_down_read_slowpath+0x32e/0x380
[  970.497832]  down_read+0x43/0x90
[  970.497834]  update_writeback_rate+0x134/0x190 [bcache]
[  970.497850]  process_one_work+0x22b/0x3d0
[  970.497856]  worker_thread+0x53/0x410
[  970.497859]  ? process_one_work+0x3d0/0x3d0
[  970.497862]  kthread+0x12a/0x150
[  970.497865]  ? set_kthread_struct+0x50/0x50
[  970.497868]  ret_from_fork+0x22/0x30
[  970.497874]  </TASK>
[  970.497885] INFO: task kworker/22:2:326 blocked for more than 121 seconds.
[  970.831922]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  971.098073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  971.478693] task:kworker/22:2    state:D stack:    0 pid:  326
ppid:     2 flags:0x00004000
[  971.478696] Workqueue: bcache bch_data_insert_start [bcache]
[  971.478734] Call Trace:
[  971.478736]  <TASK>
[  971.478737]  __schedule+0x23d/0x590
[  971.478740]  schedule+0x4e/0xb0
[  971.478742]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[  971.478752]  ? wait_woken+0x70/0x70
[  971.478756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[  971.478779]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[  971.478790]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
[  971.478803]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[  971.478816]  ? closure_sub+0x94/0xb0 [bcache]
[  971.478829]  process_one_work+0x22b/0x3d0
[  971.478832]  worker_thread+0x53/0x410
[  971.478836]  ? process_one_work+0x3d0/0x3d0
[  971.478839]  kthread+0x12a/0x150
[  971.478842]  ? set_kthread_struct+0x50/0x50
[  971.478845]  ret_from_fork+0x22/0x30
[  971.478849]  </TASK>
[  971.478895] INFO: task bcache_writebac:2911 blocked for more than
122 seconds.
[  971.829737]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  972.096191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  972.476825] task:bcache_writebac state:D stack:    0 pid: 2911
ppid:     2 flags:0x00004000
[  972.476841] Call Trace:
[  972.476842]  <TASK>
[  972.476843]  __schedule+0x23d/0x590
[  972.476847]  schedule+0x4e/0xb0
[  972.476861]  rwsem_down_write_slowpath+0x220/0x3d0
[  972.476865]  ? del_timer_sync+0x6c/0xb0
[  972.476869]  down_write+0x43/0x50
[  972.476872]  bch_writeback_thread+0x78/0x320 [bcache]
[  972.476891]  ? read_dirty+0x5a0/0x5a0 [bcache]
[  972.476919]  kthread+0x12a/0x150
[  972.476922]  ? set_kthread_struct+0x50/0x50
[  972.476926]  ret_from_fork+0x22/0x30
[  972.476930]  </TASK>
[  972.476933] INFO: task xfsaild/bcache0:2955 blocked for more than
123 seconds.
[  972.825513]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  973.092507] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  973.473651] task:xfsaild/bcache0 state:D stack:    0 pid: 2955
ppid:     2 flags:0x00004000
[  973.473653] Call Trace:
[  973.473654]  <TASK>
[  973.473667]  __schedule+0x23d/0x590
[  973.473669]  ? cpumask_next_and+0x24/0x30
[  973.473674]  schedule+0x4e/0xb0
[  973.473687]  rwsem_down_read_slowpath+0x32e/0x380
[  973.473690]  down_read+0x43/0x90
[  973.473693]  cached_dev_write+0x7e/0x480 [bcache]
[  973.473707]  ? recalibrate_cpu_khz+0x10/0x10
[  973.473712]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
[  973.473725]  __submit_bio+0x1a1/0x220
[  973.473740]  ? kmem_cache_alloc+0x1ab/0x2e0
[  973.473744]  ? mempool_alloc_slab+0x17/0x20
[  973.473747]  __submit_bio_noacct+0x85/0x1f0
[  973.473750]  submit_bio_noacct+0x4e/0x120
[  973.473753]  ? bio_add_page+0x68/0x90
[  973.473757]  submit_bio+0x4a/0x130
[  973.473761]  xfs_buf_ioapply_map+0x205/0x290 [xfs]
[  973.473860]  _xfs_buf_ioapply+0xe2/0x1b0 [xfs]
[  973.473936]  ? wake_up_q+0x90/0x90
[  973.473940]  __xfs_buf_submit+0x6d/0x1d0 [xfs]
[  973.474016]  xfs_buf_delwri_submit_buffers+0xd9/0x200 [xfs]
[  973.474093]  xfs_buf_delwri_submit_nowait+0x10/0x20 [xfs]
[  973.474179]  xfsaild_push+0x185/0x890 [xfs]
[  973.474260]  ? del_timer_sync+0x6c/0xb0
[  973.474262]  xfsaild+0xf2/0x210 [xfs]
[  973.474343]  ? xfsaild_push+0x890/0x890 [xfs]
[  973.474437]  kthread+0x12a/0x150
[  973.474440]  ? set_kthread_struct+0x50/0x50
[  973.474443]  ret_from_fork+0x22/0x30
[  973.474447]  </TASK>
[  973.474483] INFO: task fio:2965 blocked for more than 124 seconds.
[  973.771249]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  974.038183] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  974.419303] task:fio             state:D stack:    0 pid: 2965
ppid:  2613 flags:0x00004002
[  974.419307] Call Trace:
[  974.419308]  <TASK>
[  974.419310]  __schedule+0x23d/0x590
[  974.419325]  schedule+0x4e/0xb0
[  974.419327]  io_schedule+0x46/0x70
[  974.419329]  wait_on_page_bit_common+0x10c/0x3d0
[  974.419332]  ? filemap_invalidate_unlock_two+0x40/0x40
[  974.419335]  wait_on_page_bit+0x3f/0x50
[  974.419337]  wait_on_page_writeback+0x26/0x80
[  974.419341]  __filemap_fdatawait_range+0x97/0x110
[  974.419344]  file_write_and_wait_range+0xcc/0xf0
[  974.419347]  xfs_file_fsync+0x5b/0x250 [xfs]
[  974.419439]  vfs_fsync_range+0x49/0x80
[  974.419443]  ? __fget_light+0x32/0x80
[  974.419448]  __x64_sys_fsync+0x38/0x60
[  974.419451]  do_syscall_64+0x5c/0xc0
[  974.419454]  ? fput+0x13/0x20
[  974.419457]  ? ksys_write+0xce/0xe0
[  974.419459]  ? exit_to_user_mode_prepare+0x37/0xb0
[  974.419463]  ? syscall_exit_to_user_mode+0x27/0x50
[  974.419466]  ? __x64_sys_write+0x19/0x20
[  974.419468]  ? do_syscall_64+0x69/0xc0
[  974.419470]  ? do_syscall_64+0x69/0xc0
[  974.419471]  ? do_syscall_64+0x69/0xc0
[  974.419473]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[  974.419477]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  974.419480] RIP: 0033:0x7f91f1c76a5b
[  974.419483] RSP: 002b:00007fff7fc22340 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[  974.419485] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f91f1c76a5b
[  974.419487] RDX: 0000000000000000 RSI: 0000564d2447bf40 RDI: 0000000000000006
[  974.419488] RBP: 0000000000001000 R08: 0000000000000000 R09: 0000564d2447bf40
[  974.419489] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f91e781f000
[  974.419491] R13: 0000000000000000 R14: 00007f91eefca2f0 R15: 0000000500000000
[  974.419493]  </TASK>
[  974.419534] INFO: task kworker/16:0:3008 blocked for more than 125 seconds.
[  974.758685]       Not tainted 5.15.0-40-generic #43-Ubuntu
[  975.025650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  975.407087] task:kworker/16:0    state:D stack:    0 pid: 3008
ppid:     2 flags:0x00004000
[  975.407103] Workqueue: bcache bch_data_insert_start [bcache]
[  975.407133] Call Trace:
[  975.407135]  <TASK>
[  975.407136]  __schedule+0x23d/0x590
[  975.407140]  schedule+0x4e/0xb0
[  975.407143]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[  975.407153]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[  975.407166]  ? wait_woken+0x70/0x70
[  975.407169]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[  975.407180]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[  975.407191]  ? __wake_up_common_lock+0x8a/0xc0
[  975.407193]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[  975.407207]  ? closure_sub+0x94/0xb0 [bcache]
[  975.407220]  process_one_work+0x22b/0x3d0
[  975.407224]  worker_thread+0x53/0x410
[  975.407228]  ? process_one_work+0x3d0/0x3d0
[  975.407231]  kthread+0x12a/0x150
[  975.407234]  ? set_kthread_struct+0x50/0x50
[  975.407237]  ret_from_fork+0x22/0x30
[  975.407242]  </TASK>


On Tue, 28 Jun 2022 at 12:01, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> Hi Coly,
>
> I just kicked off a 20gb random readwrite test using fio to a bcache device,
>
> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> --name=iops-test-job --eta-newline=1
>
> # lsblk /dev/sdc /dev/nvme0n1
> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
> sdc           8:32   0 279.4G  0 disk
> └─sdc1        8:33   0    60G  0 part
>   └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> nvme0n1     259:0    0 372.6G  0 disk
> └─nvme0n1p1 259:2    0    15G  0 part
>   └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
>
> and am seeing this trace in the dmesg,
>
> [ 2475.034909] XFS (bcache0): Ending clean mount
> [ 2475.036238] xfs filesystem being mounted at
> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
>
> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
> ppid:     2 flags:0x00004000
> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2783.155259] Call Trace:
> [ 2783.155261]  <TASK>
> [ 2783.155263]  __schedule+0x23d/0x590
> [ 2783.155269]  schedule+0x4e/0xb0
> [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2783.155294]  ? wait_woken+0x70/0x70
> [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2783.155331]  ? __queue_work+0x211/0x480
> [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2783.155362]  process_one_work+0x22b/0x3d0
> [ 2783.155366]  worker_thread+0x53/0x410
> [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
> [ 2783.155372]  kthread+0x12a/0x150
> [ 2783.155376]  ? set_kthread_struct+0x50/0x50
> [ 2783.155379]  ret_from_fork+0x22/0x30
> [ 2783.155385]  </TASK>
> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
> ppid:     2 flags:0x00004000
> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> [ 2784.133588] Call Trace:
> [ 2784.133589]  <TASK>
> [ 2784.133591]  __schedule+0x23d/0x590
> [ 2784.133594]  schedule+0x4e/0xb0
> [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2784.133600]  down_read+0x43/0x90
> [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
> [ 2784.133619]  process_one_work+0x22b/0x3d0
> [ 2784.133623]  worker_thread+0x53/0x410
> [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
> [ 2784.133630]  kthread+0x12a/0x150
> [ 2784.133633]  ? set_kthread_struct+0x50/0x50
> [ 2784.133636]  ret_from_fork+0x22/0x30
> [ 2784.133640]  </TASK>
> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
> ppid:     2 flags:0x00004000
> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2785.114720] Call Trace:
> [ 2785.114721]  <TASK>
> [ 2785.114723]  __schedule+0x23d/0x590
> [ 2785.114726]  schedule+0x4e/0xb0
> [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2785.114753]  ? wait_woken+0x70/0x70
> [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2785.114789]  ? __queue_work+0x211/0x480
> [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2785.114818]  process_one_work+0x22b/0x3d0
> [ 2785.114822]  worker_thread+0x53/0x410
> [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
> [ 2785.114829]  kthread+0x12a/0x150
> [ 2785.114832]  ? set_kthread_struct+0x50/0x50
> [ 2785.114835]  ret_from_fork+0x22/0x30
> [ 2785.114839]  </TASK>
> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
> ppid:     2 flags:0x00004000
> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> [ 2786.091582] Call Trace:
> [ 2786.091584]  <TASK>
> [ 2786.091585]  __schedule+0x23d/0x590
> [ 2786.091589]  schedule+0x4e/0xb0
> [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2786.091594]  down_read+0x43/0x90
> [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
> [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
> [ 2786.091626]  __submit_bio+0x1a1/0x220
> [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
> [ 2786.091634]  ? mutex_lock+0x13/0x40
> [ 2786.091637]  submit_bio_noacct+0x4e/0x120
> [ 2786.091640]  submit_bio+0x4a/0x130
> [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
> [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
> [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
> [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
> [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
> [ 2786.092063]  process_one_work+0x22b/0x3d0
> [ 2786.092067]  worker_thread+0x53/0x410
> [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
> [ 2786.092072]  kthread+0x12a/0x150
> [ 2786.092074]  ? set_kthread_struct+0x50/0x50
> [ 2786.092077]  ret_from_fork+0x22/0x30
> [ 2786.092081]  </TASK>
> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
> ppid:     2 flags:0x00004000
> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2787.077098] Call Trace:
> [ 2787.077099]  <TASK>
> [ 2787.077101]  __schedule+0x23d/0x590
> [ 2787.077104]  schedule+0x4e/0xb0
> [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2787.077116]  ? wait_woken+0x70/0x70
> [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
> [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2787.077191]  process_one_work+0x22b/0x3d0
> [ 2787.077195]  worker_thread+0x53/0x410
> [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
> [ 2787.077202]  kthread+0x12a/0x150
> [ 2787.077205]  ? set_kthread_struct+0x50/0x50
> [ 2787.077207]  ret_from_fork+0x22/0x30
> [ 2787.077212]  </TASK>
> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> 125 seconds.
> [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
> ppid:     2 flags:0x00004000
> [ 2788.071080] Call Trace:
> [ 2788.071081]  <TASK>
> [ 2788.071082]  __schedule+0x23d/0x590
> [ 2788.071086]  schedule+0x4e/0xb0
> [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
> [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
> [ 2788.071109]  down_write+0x43/0x50
> [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
> [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
> [ 2788.071158]  kthread+0x12a/0x150
> [ 2788.071161]  ? set_kthread_struct+0x50/0x50
> [ 2788.071164]  ret_from_fork+0x22/0x30
> [ 2788.071168]  </TASK>
> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
> ppid:  2378 flags:0x00004002
> [ 2789.014087] Call Trace:
> [ 2789.014089]  <TASK>
> [ 2789.014090]  __schedule+0x23d/0x590
> [ 2789.014106]  schedule+0x4e/0xb0
> [ 2789.014108]  io_schedule+0x46/0x70
> [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
> [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
> [ 2789.014118]  wait_on_page_bit+0x3f/0x50
> [ 2789.014120]  wait_on_page_writeback+0x26/0x80
> [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
> [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
> [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
> [ 2789.014207]  vfs_fsync_range+0x49/0x80
> [ 2789.014212]  ? __fget_light+0x32/0x80
> [ 2789.014217]  __x64_sys_fsync+0x38/0x60
> [ 2789.014220]  do_syscall_64+0x5c/0xc0
> [ 2789.014223]  ? do_syscall_64+0x69/0xc0
> [ 2789.014225]  ? do_syscall_64+0x69/0xc0
> [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
> [ 2789.014230]  ? __x64_sys_write+0x19/0x20
> [ 2789.014232]  ? do_syscall_64+0x69/0xc0
> [ 2789.014234]  ? do_syscall_64+0x69/0xc0
> [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000004a
> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> [ 2789.014255]  </TASK>
> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
> ppid:     2 flags:0x00004000
> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2910.130176] Call Trace:
> [ 2910.130177]  <TASK>
> [ 2910.130179]  __schedule+0x23d/0x590
> [ 2910.130182]  schedule+0x4e/0xb0
> [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2910.130219]  ? wait_woken+0x70/0x70
> [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2910.130255]  ? __queue_work+0x211/0x480
> [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2910.130285]  process_one_work+0x22b/0x3d0
> [ 2910.130288]  worker_thread+0x53/0x410
> [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
> [ 2910.130295]  kthread+0x12a/0x150
> [ 2910.130298]  ? set_kthread_struct+0x50/0x50
> [ 2910.130301]  ret_from_fork+0x22/0x30
> [ 2910.130305]  </TASK>
> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
> ppid:     2 flags:0x00004000
> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> [ 2911.104647] Call Trace:
> [ 2911.104648]  <TASK>
> [ 2911.104649]  __schedule+0x23d/0x590
> [ 2911.104652]  schedule+0x4e/0xb0
> [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2911.104657]  down_read+0x43/0x90
> [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
> [ 2911.104676]  process_one_work+0x22b/0x3d0
> [ 2911.104680]  worker_thread+0x53/0x410
> [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
> [ 2911.104687]  kthread+0x12a/0x150
> [ 2911.104690]  ? set_kthread_struct+0x50/0x50
> [ 2911.104693]  ret_from_fork+0x22/0x30
> [ 2911.104697]  </TASK>
> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
> ppid:     2 flags:0x00004000
> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2912.087495] Call Trace:
> [ 2912.087497]  <TASK>
> [ 2912.087498]  __schedule+0x23d/0x590
> [ 2912.087503]  schedule+0x4e/0xb0
> [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2912.087528]  ? wait_woken+0x70/0x70
> [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2912.087565]  ? __queue_work+0x211/0x480
> [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2912.087595]  process_one_work+0x22b/0x3d0
> [ 2912.087599]  worker_thread+0x53/0x410
> [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
> [ 2912.087606]  kthread+0x12a/0x150
> [ 2912.087609]  ? set_kthread_struct+0x50/0x50
> [ 2912.087612]  ret_from_fork+0x22/0x30
> [ 2912.087617]  </TASK>
>
> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>
> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>
> I had also echoed 0 into congested_read_threshold_us,
> congested_write_threshold_us.
>
> echo writeback > /sys/block/bcache0/bcache/cache_mode
>
> Regards,
> Nikhil.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-28  6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
  2022-06-28  7:23 ` Nikhil Kshirsagar
@ 2022-06-28 14:37 ` Coly Li
  2022-06-29  7:02   ` Nikhil Kshirsagar
  1 sibling, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-28 14:37 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> Hi Coly,
> 
> I just kicked off a 20gb random readwrite test using fio to a bcache device,
> 
> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> --name=iops-test-job --eta-newline=1
> 
> # lsblk /dev/sdc /dev/nvme0n1
> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
> sdc           8:32   0 279.4G  0 disk
> └─sdc1        8:33   0    60G  0 part
>  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> nvme0n1     259:0    0 372.6G  0 disk
> └─nvme0n1p1 259:2    0    15G  0 part
>  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> 
> and am seeing this trace in the dmesg,
> 
> [ 2475.034909] XFS (bcache0): Ending clean mount
> [ 2475.036238] xfs filesystem being mounted at
> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> 
> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
> ppid:     2 flags:0x00004000
> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2783.155259] Call Trace:
> [ 2783.155261]  <TASK>
> [ 2783.155263]  __schedule+0x23d/0x590
> [ 2783.155269]  schedule+0x4e/0xb0
> [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2783.155294]  ? wait_woken+0x70/0x70
> [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2783.155331]  ? __queue_work+0x211/0x480
> [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2783.155362]  process_one_work+0x22b/0x3d0
> [ 2783.155366]  worker_thread+0x53/0x410
> [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
> [ 2783.155372]  kthread+0x12a/0x150
> [ 2783.155376]  ? set_kthread_struct+0x50/0x50
> [ 2783.155379]  ret_from_fork+0x22/0x30
> [ 2783.155385]  </TASK>
> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
> ppid:     2 flags:0x00004000
> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> [ 2784.133588] Call Trace:
> [ 2784.133589]  <TASK>
> [ 2784.133591]  __schedule+0x23d/0x590
> [ 2784.133594]  schedule+0x4e/0xb0
> [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2784.133600]  down_read+0x43/0x90
> [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
> [ 2784.133619]  process_one_work+0x22b/0x3d0
> [ 2784.133623]  worker_thread+0x53/0x410
> [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
> [ 2784.133630]  kthread+0x12a/0x150
> [ 2784.133633]  ? set_kthread_struct+0x50/0x50
> [ 2784.133636]  ret_from_fork+0x22/0x30
> [ 2784.133640]  </TASK>
> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
> ppid:     2 flags:0x00004000
> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2785.114720] Call Trace:
> [ 2785.114721]  <TASK>
> [ 2785.114723]  __schedule+0x23d/0x590
> [ 2785.114726]  schedule+0x4e/0xb0
> [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2785.114753]  ? wait_woken+0x70/0x70
> [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2785.114789]  ? __queue_work+0x211/0x480
> [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2785.114818]  process_one_work+0x22b/0x3d0
> [ 2785.114822]  worker_thread+0x53/0x410
> [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
> [ 2785.114829]  kthread+0x12a/0x150
> [ 2785.114832]  ? set_kthread_struct+0x50/0x50
> [ 2785.114835]  ret_from_fork+0x22/0x30
> [ 2785.114839]  </TASK>
> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
> ppid:     2 flags:0x00004000
> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> [ 2786.091582] Call Trace:
> [ 2786.091584]  <TASK>
> [ 2786.091585]  __schedule+0x23d/0x590
> [ 2786.091589]  schedule+0x4e/0xb0
> [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2786.091594]  down_read+0x43/0x90
> [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
> [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
> [ 2786.091626]  __submit_bio+0x1a1/0x220
> [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
> [ 2786.091634]  ? mutex_lock+0x13/0x40
> [ 2786.091637]  submit_bio_noacct+0x4e/0x120
> [ 2786.091640]  submit_bio+0x4a/0x130
> [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
> [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
> [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
> [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
> [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
> [ 2786.092063]  process_one_work+0x22b/0x3d0
> [ 2786.092067]  worker_thread+0x53/0x410
> [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
> [ 2786.092072]  kthread+0x12a/0x150
> [ 2786.092074]  ? set_kthread_struct+0x50/0x50
> [ 2786.092077]  ret_from_fork+0x22/0x30
> [ 2786.092081]  </TASK>
> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
> ppid:     2 flags:0x00004000
> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2787.077098] Call Trace:
> [ 2787.077099]  <TASK>
> [ 2787.077101]  __schedule+0x23d/0x590
> [ 2787.077104]  schedule+0x4e/0xb0
> [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2787.077116]  ? wait_woken+0x70/0x70
> [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
> [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2787.077191]  process_one_work+0x22b/0x3d0
> [ 2787.077195]  worker_thread+0x53/0x410
> [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
> [ 2787.077202]  kthread+0x12a/0x150
> [ 2787.077205]  ? set_kthread_struct+0x50/0x50
> [ 2787.077207]  ret_from_fork+0x22/0x30
> [ 2787.077212]  </TASK>
> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> 125 seconds.
> [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
> ppid:     2 flags:0x00004000
> [ 2788.071080] Call Trace:
> [ 2788.071081]  <TASK>
> [ 2788.071082]  __schedule+0x23d/0x590
> [ 2788.071086]  schedule+0x4e/0xb0
> [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
> [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
> [ 2788.071109]  down_write+0x43/0x50
> [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
> [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
> [ 2788.071158]  kthread+0x12a/0x150
> [ 2788.071161]  ? set_kthread_struct+0x50/0x50
> [ 2788.071164]  ret_from_fork+0x22/0x30
> [ 2788.071168]  </TASK>
> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
> ppid:  2378 flags:0x00004002
> [ 2789.014087] Call Trace:
> [ 2789.014089]  <TASK>
> [ 2789.014090]  __schedule+0x23d/0x590
> [ 2789.014106]  schedule+0x4e/0xb0
> [ 2789.014108]  io_schedule+0x46/0x70
> [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
> [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
> [ 2789.014118]  wait_on_page_bit+0x3f/0x50
> [ 2789.014120]  wait_on_page_writeback+0x26/0x80
> [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
> [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
> [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
> [ 2789.014207]  vfs_fsync_range+0x49/0x80
> [ 2789.014212]  ? __fget_light+0x32/0x80
> [ 2789.014217]  __x64_sys_fsync+0x38/0x60
> [ 2789.014220]  do_syscall_64+0x5c/0xc0
> [ 2789.014223]  ? do_syscall_64+0x69/0xc0
> [ 2789.014225]  ? do_syscall_64+0x69/0xc0
> [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
> [ 2789.014230]  ? __x64_sys_write+0x19/0x20
> [ 2789.014232]  ? do_syscall_64+0x69/0xc0
> [ 2789.014234]  ? do_syscall_64+0x69/0xc0
> [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000004a
> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> [ 2789.014255]  </TASK>
> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
> ppid:     2 flags:0x00004000
> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2910.130176] Call Trace:
> [ 2910.130177]  <TASK>
> [ 2910.130179]  __schedule+0x23d/0x590
> [ 2910.130182]  schedule+0x4e/0xb0
> [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2910.130219]  ? wait_woken+0x70/0x70
> [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2910.130255]  ? __queue_work+0x211/0x480
> [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2910.130285]  process_one_work+0x22b/0x3d0
> [ 2910.130288]  worker_thread+0x53/0x410
> [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
> [ 2910.130295]  kthread+0x12a/0x150
> [ 2910.130298]  ? set_kthread_struct+0x50/0x50
> [ 2910.130301]  ret_from_fork+0x22/0x30
> [ 2910.130305]  </TASK>
> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
> ppid:     2 flags:0x00004000
> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> [ 2911.104647] Call Trace:
> [ 2911.104648]  <TASK>
> [ 2911.104649]  __schedule+0x23d/0x590
> [ 2911.104652]  schedule+0x4e/0xb0
> [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
> [ 2911.104657]  down_read+0x43/0x90
> [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
> [ 2911.104676]  process_one_work+0x22b/0x3d0
> [ 2911.104680]  worker_thread+0x53/0x410
> [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
> [ 2911.104687]  kthread+0x12a/0x150
> [ 2911.104690]  ? set_kthread_struct+0x50/0x50
> [ 2911.104693]  ret_from_fork+0x22/0x30
> [ 2911.104697]  </TASK>
> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
> ppid:     2 flags:0x00004000
> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2912.087495] Call Trace:
> [ 2912.087497]  <TASK>
> [ 2912.087498]  __schedule+0x23d/0x590
> [ 2912.087503]  schedule+0x4e/0xb0
> [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2912.087528]  ? wait_woken+0x70/0x70
> [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2912.087565]  ? __queue_work+0x211/0x480
> [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
> [ 2912.087595]  process_one_work+0x22b/0x3d0
> [ 2912.087599]  worker_thread+0x53/0x410
> [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
> [ 2912.087606]  kthread+0x12a/0x150
> [ 2912.087609]  ? set_kthread_struct+0x50/0x50
> [ 2912.087612]  ret_from_fork+0x22/0x30
> [ 2912.087617]  </TASK>
> 
> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> 
> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> 
> I had also echoed 0 into congested_read_threshold_us,
> congested_write_threshold_us.
> 
> echo writeback > /sys/block/bcache0/bcache/cache_mode

Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?

Coly Li


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-28 14:37 ` Coly Li
@ 2022-06-29  7:02   ` Nikhil Kshirsagar
  2022-06-29  7:50     ` Coly Li
  0 siblings, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29  7:02 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly,

I see the same bug on upstream kernel when I tried with
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/  (
cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)

Reads seem to trigger it, not writes. So this test triggered it -

fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
--ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
--rw=randread --group_reporting=1

https://pastebin.com/KyVSfnik has all the details.

Regards,
Nikhil.

On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I just kicked off a 20gb random readwrite test using fio to a bcache device,
> >
> > fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> > --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> > --name=iops-test-job --eta-newline=1
> >
> > # lsblk /dev/sdc /dev/nvme0n1
> > NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
> > sdc           8:32   0 279.4G  0 disk
> > └─sdc1        8:33   0    60G  0 part
> >  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> > nvme0n1     259:0    0 372.6G  0 disk
> > └─nvme0n1p1 259:2    0    15G  0 part
> >  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> >
> > and am seeing this trace in the dmesg,
> >
> > [ 2475.034909] XFS (bcache0): Ending clean mount
> > [ 2475.036238] xfs filesystem being mounted at
> > /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> >
> > [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> > [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
> > ppid:     2 flags:0x00004000
> > [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2783.155259] Call Trace:
> > [ 2783.155261]  <TASK>
> > [ 2783.155263]  __schedule+0x23d/0x590
> > [ 2783.155269]  schedule+0x4e/0xb0
> > [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2783.155294]  ? wait_woken+0x70/0x70
> > [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2783.155331]  ? __queue_work+0x211/0x480
> > [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
> > [ 2783.155362]  process_one_work+0x22b/0x3d0
> > [ 2783.155366]  worker_thread+0x53/0x410
> > [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
> > [ 2783.155372]  kthread+0x12a/0x150
> > [ 2783.155376]  ? set_kthread_struct+0x50/0x50
> > [ 2783.155379]  ret_from_fork+0x22/0x30
> > [ 2783.155385]  </TASK>
> > [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> > [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
> > ppid:     2 flags:0x00004000
> > [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> > [ 2784.133588] Call Trace:
> > [ 2784.133589]  <TASK>
> > [ 2784.133591]  __schedule+0x23d/0x590
> > [ 2784.133594]  schedule+0x4e/0xb0
> > [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
> > [ 2784.133600]  down_read+0x43/0x90
> > [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
> > [ 2784.133619]  process_one_work+0x22b/0x3d0
> > [ 2784.133623]  worker_thread+0x53/0x410
> > [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
> > [ 2784.133630]  kthread+0x12a/0x150
> > [ 2784.133633]  ? set_kthread_struct+0x50/0x50
> > [ 2784.133636]  ret_from_fork+0x22/0x30
> > [ 2784.133640]  </TASK>
> > [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> > [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
> > ppid:     2 flags:0x00004000
> > [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2785.114720] Call Trace:
> > [ 2785.114721]  <TASK>
> > [ 2785.114723]  __schedule+0x23d/0x590
> > [ 2785.114726]  schedule+0x4e/0xb0
> > [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2785.114753]  ? wait_woken+0x70/0x70
> > [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2785.114789]  ? __queue_work+0x211/0x480
> > [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
> > [ 2785.114818]  process_one_work+0x22b/0x3d0
> > [ 2785.114822]  worker_thread+0x53/0x410
> > [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
> > [ 2785.114829]  kthread+0x12a/0x150
> > [ 2785.114832]  ? set_kthread_struct+0x50/0x50
> > [ 2785.114835]  ret_from_fork+0x22/0x30
> > [ 2785.114839]  </TASK>
> > [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> > [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
> > ppid:     2 flags:0x00004000
> > [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> > [ 2786.091582] Call Trace:
> > [ 2786.091584]  <TASK>
> > [ 2786.091585]  __schedule+0x23d/0x590
> > [ 2786.091589]  schedule+0x4e/0xb0
> > [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
> > [ 2786.091594]  down_read+0x43/0x90
> > [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
> > [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
> > [ 2786.091626]  __submit_bio+0x1a1/0x220
> > [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
> > [ 2786.091634]  ? mutex_lock+0x13/0x40
> > [ 2786.091637]  submit_bio_noacct+0x4e/0x120
> > [ 2786.091640]  submit_bio+0x4a/0x130
> > [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
> > [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
> > [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
> > [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
> > [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
> > [ 2786.092063]  process_one_work+0x22b/0x3d0
> > [ 2786.092067]  worker_thread+0x53/0x410
> > [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
> > [ 2786.092072]  kthread+0x12a/0x150
> > [ 2786.092074]  ? set_kthread_struct+0x50/0x50
> > [ 2786.092077]  ret_from_fork+0x22/0x30
> > [ 2786.092081]  </TASK>
> > [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> > [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
> > ppid:     2 flags:0x00004000
> > [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2787.077098] Call Trace:
> > [ 2787.077099]  <TASK>
> > [ 2787.077101]  __schedule+0x23d/0x590
> > [ 2787.077104]  schedule+0x4e/0xb0
> > [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2787.077116]  ? wait_woken+0x70/0x70
> > [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
> > [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
> > [ 2787.077191]  process_one_work+0x22b/0x3d0
> > [ 2787.077195]  worker_thread+0x53/0x410
> > [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
> > [ 2787.077202]  kthread+0x12a/0x150
> > [ 2787.077205]  ? set_kthread_struct+0x50/0x50
> > [ 2787.077207]  ret_from_fork+0x22/0x30
> > [ 2787.077212]  </TASK>
> > [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> > 125 seconds.
> > [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
> > ppid:     2 flags:0x00004000
> > [ 2788.071080] Call Trace:
> > [ 2788.071081]  <TASK>
> > [ 2788.071082]  __schedule+0x23d/0x590
> > [ 2788.071086]  schedule+0x4e/0xb0
> > [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
> > [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
> > [ 2788.071109]  down_write+0x43/0x50
> > [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
> > [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
> > [ 2788.071158]  kthread+0x12a/0x150
> > [ 2788.071161]  ? set_kthread_struct+0x50/0x50
> > [ 2788.071164]  ret_from_fork+0x22/0x30
> > [ 2788.071168]  </TASK>
> > [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> > [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
> > ppid:  2378 flags:0x00004002
> > [ 2789.014087] Call Trace:
> > [ 2789.014089]  <TASK>
> > [ 2789.014090]  __schedule+0x23d/0x590
> > [ 2789.014106]  schedule+0x4e/0xb0
> > [ 2789.014108]  io_schedule+0x46/0x70
> > [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
> > [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
> > [ 2789.014118]  wait_on_page_bit+0x3f/0x50
> > [ 2789.014120]  wait_on_page_writeback+0x26/0x80
> > [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
> > [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
> > [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
> > [ 2789.014207]  vfs_fsync_range+0x49/0x80
> > [ 2789.014212]  ? __fget_light+0x32/0x80
> > [ 2789.014217]  __x64_sys_fsync+0x38/0x60
> > [ 2789.014220]  do_syscall_64+0x5c/0xc0
> > [ 2789.014223]  ? do_syscall_64+0x69/0xc0
> > [ 2789.014225]  ? do_syscall_64+0x69/0xc0
> > [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
> > [ 2789.014230]  ? __x64_sys_write+0x19/0x20
> > [ 2789.014232]  ? do_syscall_64+0x69/0xc0
> > [ 2789.014234]  ? do_syscall_64+0x69/0xc0
> > [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> > [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> > [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> > 000000000000004a
> > [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> > [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> > [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> > [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> > [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> > [ 2789.014255]  </TASK>
> > [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> > [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
> > ppid:     2 flags:0x00004000
> > [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2910.130176] Call Trace:
> > [ 2910.130177]  <TASK>
> > [ 2910.130179]  __schedule+0x23d/0x590
> > [ 2910.130182]  schedule+0x4e/0xb0
> > [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2910.130219]  ? wait_woken+0x70/0x70
> > [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2910.130255]  ? __queue_work+0x211/0x480
> > [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
> > [ 2910.130285]  process_one_work+0x22b/0x3d0
> > [ 2910.130288]  worker_thread+0x53/0x410
> > [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
> > [ 2910.130295]  kthread+0x12a/0x150
> > [ 2910.130298]  ? set_kthread_struct+0x50/0x50
> > [ 2910.130301]  ret_from_fork+0x22/0x30
> > [ 2910.130305]  </TASK>
> > [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> > [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
> > ppid:     2 flags:0x00004000
> > [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> > [ 2911.104647] Call Trace:
> > [ 2911.104648]  <TASK>
> > [ 2911.104649]  __schedule+0x23d/0x590
> > [ 2911.104652]  schedule+0x4e/0xb0
> > [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
> > [ 2911.104657]  down_read+0x43/0x90
> > [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
> > [ 2911.104676]  process_one_work+0x22b/0x3d0
> > [ 2911.104680]  worker_thread+0x53/0x410
> > [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
> > [ 2911.104687]  kthread+0x12a/0x150
> > [ 2911.104690]  ? set_kthread_struct+0x50/0x50
> > [ 2911.104693]  ret_from_fork+0x22/0x30
> > [ 2911.104697]  </TASK>
> > [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> > [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
> > ppid:     2 flags:0x00004000
> > [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2912.087495] Call Trace:
> > [ 2912.087497]  <TASK>
> > [ 2912.087498]  __schedule+0x23d/0x590
> > [ 2912.087503]  schedule+0x4e/0xb0
> > [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2912.087528]  ? wait_woken+0x70/0x70
> > [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2912.087565]  ? __queue_work+0x211/0x480
> > [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
> > [ 2912.087595]  process_one_work+0x22b/0x3d0
> > [ 2912.087599]  worker_thread+0x53/0x410
> > [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
> > [ 2912.087606]  kthread+0x12a/0x150
> > [ 2912.087609]  ? set_kthread_struct+0x50/0x50
> > [ 2912.087612]  ret_from_fork+0x22/0x30
> > [ 2912.087617]  </TASK>
> >
> > Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >
> > echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >
> > I had also echoed 0 into congested_read_threshold_us,
> > congested_write_threshold_us.
> >
> > echo writeback > /sys/block/bcache0/bcache/cache_mode
>
> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  7:02   ` Nikhil Kshirsagar
@ 2022-06-29  7:50     ` Coly Li
  2022-06-29  7:56       ` Nikhil Kshirsagar
  0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29  7:50 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> Hi Coly,
> 
> I see the same bug on upstream kernel when I tried with
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/  (
> cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)


Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.

> 
> Reads seem to trigger it, not writes. So this test triggered it -
> 
> fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> --rw=randread --group_reporting=1
> 
> https://pastebin.com/KyVSfnik has all the details.


OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?

Thanks.

Coly Li


> 
> Regards,
> Nikhil.
> 
> On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
>> 
>> 
>> 
>>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>> 
>>> Hi Coly,
>>> 
>>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
>>> 
>>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
>>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
>>> --name=iops-test-job --eta-newline=1
>>> 
>>> # lsblk /dev/sdc /dev/nvme0n1
>>> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
>>> sdc           8:32   0 279.4G  0 disk
>>> └─sdc1        8:33   0    60G  0 part
>>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
>>> nvme0n1     259:0    0 372.6G  0 disk
>>> └─nvme0n1p1 259:2    0    15G  0 part
>>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
>>> 
>>> and am seeing this trace in the dmesg,
>>> 
>>> [ 2475.034909] XFS (bcache0): Ending clean mount
>>> [ 2475.036238] xfs filesystem being mounted at
>>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
>>> 
>>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
>>> [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
>>> ppid:     2 flags:0x00004000
>>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2783.155259] Call Trace:
>>> [ 2783.155261]  <TASK>
>>> [ 2783.155263]  __schedule+0x23d/0x590
>>> [ 2783.155269]  schedule+0x4e/0xb0
>>> [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2783.155294]  ? wait_woken+0x70/0x70
>>> [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2783.155331]  ? __queue_work+0x211/0x480
>>> [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2783.155362]  process_one_work+0x22b/0x3d0
>>> [ 2783.155366]  worker_thread+0x53/0x410
>>> [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
>>> [ 2783.155372]  kthread+0x12a/0x150
>>> [ 2783.155376]  ? set_kthread_struct+0x50/0x50
>>> [ 2783.155379]  ret_from_fork+0x22/0x30
>>> [ 2783.155385]  </TASK>
>>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
>>> [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
>>> ppid:     2 flags:0x00004000
>>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
>>> [ 2784.133588] Call Trace:
>>> [ 2784.133589]  <TASK>
>>> [ 2784.133591]  __schedule+0x23d/0x590
>>> [ 2784.133594]  schedule+0x4e/0xb0
>>> [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2784.133600]  down_read+0x43/0x90
>>> [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
>>> [ 2784.133619]  process_one_work+0x22b/0x3d0
>>> [ 2784.133623]  worker_thread+0x53/0x410
>>> [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
>>> [ 2784.133630]  kthread+0x12a/0x150
>>> [ 2784.133633]  ? set_kthread_struct+0x50/0x50
>>> [ 2784.133636]  ret_from_fork+0x22/0x30
>>> [ 2784.133640]  </TASK>
>>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
>>> [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
>>> ppid:     2 flags:0x00004000
>>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2785.114720] Call Trace:
>>> [ 2785.114721]  <TASK>
>>> [ 2785.114723]  __schedule+0x23d/0x590
>>> [ 2785.114726]  schedule+0x4e/0xb0
>>> [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2785.114753]  ? wait_woken+0x70/0x70
>>> [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2785.114789]  ? __queue_work+0x211/0x480
>>> [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2785.114818]  process_one_work+0x22b/0x3d0
>>> [ 2785.114822]  worker_thread+0x53/0x410
>>> [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
>>> [ 2785.114829]  kthread+0x12a/0x150
>>> [ 2785.114832]  ? set_kthread_struct+0x50/0x50
>>> [ 2785.114835]  ret_from_fork+0x22/0x30
>>> [ 2785.114839]  </TASK>
>>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
>>> [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
>>> ppid:     2 flags:0x00004000
>>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
>>> [ 2786.091582] Call Trace:
>>> [ 2786.091584]  <TASK>
>>> [ 2786.091585]  __schedule+0x23d/0x590
>>> [ 2786.091589]  schedule+0x4e/0xb0
>>> [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2786.091594]  down_read+0x43/0x90
>>> [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
>>> [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
>>> [ 2786.091626]  __submit_bio+0x1a1/0x220
>>> [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
>>> [ 2786.091634]  ? mutex_lock+0x13/0x40
>>> [ 2786.091637]  submit_bio_noacct+0x4e/0x120
>>> [ 2786.091640]  submit_bio+0x4a/0x130
>>> [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
>>> [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
>>> [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
>>> [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
>>> [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
>>> [ 2786.092063]  process_one_work+0x22b/0x3d0
>>> [ 2786.092067]  worker_thread+0x53/0x410
>>> [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
>>> [ 2786.092072]  kthread+0x12a/0x150
>>> [ 2786.092074]  ? set_kthread_struct+0x50/0x50
>>> [ 2786.092077]  ret_from_fork+0x22/0x30
>>> [ 2786.092081]  </TASK>
>>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
>>> [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
>>> ppid:     2 flags:0x00004000
>>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2787.077098] Call Trace:
>>> [ 2787.077099]  <TASK>
>>> [ 2787.077101]  __schedule+0x23d/0x590
>>> [ 2787.077104]  schedule+0x4e/0xb0
>>> [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2787.077116]  ? wait_woken+0x70/0x70
>>> [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
>>> [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2787.077191]  process_one_work+0x22b/0x3d0
>>> [ 2787.077195]  worker_thread+0x53/0x410
>>> [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
>>> [ 2787.077202]  kthread+0x12a/0x150
>>> [ 2787.077205]  ? set_kthread_struct+0x50/0x50
>>> [ 2787.077207]  ret_from_fork+0x22/0x30
>>> [ 2787.077212]  </TASK>
>>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
>>> 125 seconds.
>>> [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
>>> ppid:     2 flags:0x00004000
>>> [ 2788.071080] Call Trace:
>>> [ 2788.071081]  <TASK>
>>> [ 2788.071082]  __schedule+0x23d/0x590
>>> [ 2788.071086]  schedule+0x4e/0xb0
>>> [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
>>> [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
>>> [ 2788.071109]  down_write+0x43/0x50
>>> [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
>>> [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
>>> [ 2788.071158]  kthread+0x12a/0x150
>>> [ 2788.071161]  ? set_kthread_struct+0x50/0x50
>>> [ 2788.071164]  ret_from_fork+0x22/0x30
>>> [ 2788.071168]  </TASK>
>>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
>>> [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
>>> ppid:  2378 flags:0x00004002
>>> [ 2789.014087] Call Trace:
>>> [ 2789.014089]  <TASK>
>>> [ 2789.014090]  __schedule+0x23d/0x590
>>> [ 2789.014106]  schedule+0x4e/0xb0
>>> [ 2789.014108]  io_schedule+0x46/0x70
>>> [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
>>> [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
>>> [ 2789.014118]  wait_on_page_bit+0x3f/0x50
>>> [ 2789.014120]  wait_on_page_writeback+0x26/0x80
>>> [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
>>> [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
>>> [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
>>> [ 2789.014207]  vfs_fsync_range+0x49/0x80
>>> [ 2789.014212]  ? __fget_light+0x32/0x80
>>> [ 2789.014217]  __x64_sys_fsync+0x38/0x60
>>> [ 2789.014220]  do_syscall_64+0x5c/0xc0
>>> [ 2789.014223]  ? do_syscall_64+0x69/0xc0
>>> [ 2789.014225]  ? do_syscall_64+0x69/0xc0
>>> [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
>>> [ 2789.014230]  ? __x64_sys_write+0x19/0x20
>>> [ 2789.014232]  ? do_syscall_64+0x69/0xc0
>>> [ 2789.014234]  ? do_syscall_64+0x69/0xc0
>>> [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
>>> [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
>>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
>>> 000000000000004a
>>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
>>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
>>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
>>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
>>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
>>> [ 2789.014255]  </TASK>
>>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
>>> [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
>>> ppid:     2 flags:0x00004000
>>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2910.130176] Call Trace:
>>> [ 2910.130177]  <TASK>
>>> [ 2910.130179]  __schedule+0x23d/0x590
>>> [ 2910.130182]  schedule+0x4e/0xb0
>>> [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2910.130219]  ? wait_woken+0x70/0x70
>>> [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2910.130255]  ? __queue_work+0x211/0x480
>>> [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2910.130285]  process_one_work+0x22b/0x3d0
>>> [ 2910.130288]  worker_thread+0x53/0x410
>>> [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
>>> [ 2910.130295]  kthread+0x12a/0x150
>>> [ 2910.130298]  ? set_kthread_struct+0x50/0x50
>>> [ 2910.130301]  ret_from_fork+0x22/0x30
>>> [ 2910.130305]  </TASK>
>>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
>>> [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
>>> ppid:     2 flags:0x00004000
>>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
>>> [ 2911.104647] Call Trace:
>>> [ 2911.104648]  <TASK>
>>> [ 2911.104649]  __schedule+0x23d/0x590
>>> [ 2911.104652]  schedule+0x4e/0xb0
>>> [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2911.104657]  down_read+0x43/0x90
>>> [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
>>> [ 2911.104676]  process_one_work+0x22b/0x3d0
>>> [ 2911.104680]  worker_thread+0x53/0x410
>>> [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
>>> [ 2911.104687]  kthread+0x12a/0x150
>>> [ 2911.104690]  ? set_kthread_struct+0x50/0x50
>>> [ 2911.104693]  ret_from_fork+0x22/0x30
>>> [ 2911.104697]  </TASK>
>>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
>>> [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
>>> ppid:     2 flags:0x00004000
>>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2912.087495] Call Trace:
>>> [ 2912.087497]  <TASK>
>>> [ 2912.087498]  __schedule+0x23d/0x590
>>> [ 2912.087503]  schedule+0x4e/0xb0
>>> [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2912.087528]  ? wait_woken+0x70/0x70
>>> [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2912.087565]  ? __queue_work+0x211/0x480
>>> [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2912.087595]  process_one_work+0x22b/0x3d0
>>> [ 2912.087599]  worker_thread+0x53/0x410
>>> [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
>>> [ 2912.087606]  kthread+0x12a/0x150
>>> [ 2912.087609]  ? set_kthread_struct+0x50/0x50
>>> [ 2912.087612]  ret_from_fork+0x22/0x30
>>> [ 2912.087617]  </TASK>
>>> 
>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>> 
>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>> 
>>> I had also echoed 0 into congested_read_threshold_us,
>>> congested_write_threshold_us.
>>> 
>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>> 
>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>> 
>> Coly Li
>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  7:50     ` Coly Li
@ 2022-06-29  7:56       ` Nikhil Kshirsagar
  2022-06-29  7:59         ` Coly Li
  2022-06-29  8:09         ` Nikhil Kshirsagar
  0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29  7:56 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?

On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I see the same bug on upstream kernel when I tried with
> > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/  (
> > cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
>
>
> Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
>
> >
> > Reads seem to trigger it, not writes. So this test triggered it -
> >
> > fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> > --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> > --rw=randread --group_reporting=1
> >
> > https://pastebin.com/KyVSfnik has all the details.
>
>
> OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
>
> Thanks.
>
> Coly Li
>
>
> >
> > Regards,
> > Nikhil.
> >
> > On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
> >>
> >>
> >>
> >>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >>>
> >>> Hi Coly,
> >>>
> >>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
> >>>
> >>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> >>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> >>> --name=iops-test-job --eta-newline=1
> >>>
> >>> # lsblk /dev/sdc /dev/nvme0n1
> >>> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
> >>> sdc           8:32   0 279.4G  0 disk
> >>> └─sdc1        8:33   0    60G  0 part
> >>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> >>> nvme0n1     259:0    0 372.6G  0 disk
> >>> └─nvme0n1p1 259:2    0    15G  0 part
> >>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> >>>
> >>> and am seeing this trace in the dmesg,
> >>>
> >>> [ 2475.034909] XFS (bcache0): Ending clean mount
> >>> [ 2475.036238] xfs filesystem being mounted at
> >>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> >>>
> >>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> >>> [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
> >>> ppid:     2 flags:0x00004000
> >>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2783.155259] Call Trace:
> >>> [ 2783.155261]  <TASK>
> >>> [ 2783.155263]  __schedule+0x23d/0x590
> >>> [ 2783.155269]  schedule+0x4e/0xb0
> >>> [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2783.155294]  ? wait_woken+0x70/0x70
> >>> [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2783.155331]  ? __queue_work+0x211/0x480
> >>> [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2783.155362]  process_one_work+0x22b/0x3d0
> >>> [ 2783.155366]  worker_thread+0x53/0x410
> >>> [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2783.155372]  kthread+0x12a/0x150
> >>> [ 2783.155376]  ? set_kthread_struct+0x50/0x50
> >>> [ 2783.155379]  ret_from_fork+0x22/0x30
> >>> [ 2783.155385]  </TASK>
> >>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> >>> [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
> >>> ppid:     2 flags:0x00004000
> >>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> >>> [ 2784.133588] Call Trace:
> >>> [ 2784.133589]  <TASK>
> >>> [ 2784.133591]  __schedule+0x23d/0x590
> >>> [ 2784.133594]  schedule+0x4e/0xb0
> >>> [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2784.133600]  down_read+0x43/0x90
> >>> [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
> >>> [ 2784.133619]  process_one_work+0x22b/0x3d0
> >>> [ 2784.133623]  worker_thread+0x53/0x410
> >>> [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2784.133630]  kthread+0x12a/0x150
> >>> [ 2784.133633]  ? set_kthread_struct+0x50/0x50
> >>> [ 2784.133636]  ret_from_fork+0x22/0x30
> >>> [ 2784.133640]  </TASK>
> >>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> >>> [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
> >>> ppid:     2 flags:0x00004000
> >>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2785.114720] Call Trace:
> >>> [ 2785.114721]  <TASK>
> >>> [ 2785.114723]  __schedule+0x23d/0x590
> >>> [ 2785.114726]  schedule+0x4e/0xb0
> >>> [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2785.114753]  ? wait_woken+0x70/0x70
> >>> [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2785.114789]  ? __queue_work+0x211/0x480
> >>> [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2785.114818]  process_one_work+0x22b/0x3d0
> >>> [ 2785.114822]  worker_thread+0x53/0x410
> >>> [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2785.114829]  kthread+0x12a/0x150
> >>> [ 2785.114832]  ? set_kthread_struct+0x50/0x50
> >>> [ 2785.114835]  ret_from_fork+0x22/0x30
> >>> [ 2785.114839]  </TASK>
> >>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> >>> [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
> >>> ppid:     2 flags:0x00004000
> >>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> >>> [ 2786.091582] Call Trace:
> >>> [ 2786.091584]  <TASK>
> >>> [ 2786.091585]  __schedule+0x23d/0x590
> >>> [ 2786.091589]  schedule+0x4e/0xb0
> >>> [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2786.091594]  down_read+0x43/0x90
> >>> [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
> >>> [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
> >>> [ 2786.091626]  __submit_bio+0x1a1/0x220
> >>> [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
> >>> [ 2786.091634]  ? mutex_lock+0x13/0x40
> >>> [ 2786.091637]  submit_bio_noacct+0x4e/0x120
> >>> [ 2786.091640]  submit_bio+0x4a/0x130
> >>> [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
> >>> [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
> >>> [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
> >>> [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
> >>> [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
> >>> [ 2786.092063]  process_one_work+0x22b/0x3d0
> >>> [ 2786.092067]  worker_thread+0x53/0x410
> >>> [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2786.092072]  kthread+0x12a/0x150
> >>> [ 2786.092074]  ? set_kthread_struct+0x50/0x50
> >>> [ 2786.092077]  ret_from_fork+0x22/0x30
> >>> [ 2786.092081]  </TASK>
> >>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> >>> [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
> >>> ppid:     2 flags:0x00004000
> >>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2787.077098] Call Trace:
> >>> [ 2787.077099]  <TASK>
> >>> [ 2787.077101]  __schedule+0x23d/0x590
> >>> [ 2787.077104]  schedule+0x4e/0xb0
> >>> [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2787.077116]  ? wait_woken+0x70/0x70
> >>> [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
> >>> [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2787.077191]  process_one_work+0x22b/0x3d0
> >>> [ 2787.077195]  worker_thread+0x53/0x410
> >>> [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2787.077202]  kthread+0x12a/0x150
> >>> [ 2787.077205]  ? set_kthread_struct+0x50/0x50
> >>> [ 2787.077207]  ret_from_fork+0x22/0x30
> >>> [ 2787.077212]  </TASK>
> >>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> >>> 125 seconds.
> >>> [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
> >>> ppid:     2 flags:0x00004000
> >>> [ 2788.071080] Call Trace:
> >>> [ 2788.071081]  <TASK>
> >>> [ 2788.071082]  __schedule+0x23d/0x590
> >>> [ 2788.071086]  schedule+0x4e/0xb0
> >>> [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
> >>> [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
> >>> [ 2788.071109]  down_write+0x43/0x50
> >>> [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
> >>> [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
> >>> [ 2788.071158]  kthread+0x12a/0x150
> >>> [ 2788.071161]  ? set_kthread_struct+0x50/0x50
> >>> [ 2788.071164]  ret_from_fork+0x22/0x30
> >>> [ 2788.071168]  </TASK>
> >>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> >>> [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
> >>> ppid:  2378 flags:0x00004002
> >>> [ 2789.014087] Call Trace:
> >>> [ 2789.014089]  <TASK>
> >>> [ 2789.014090]  __schedule+0x23d/0x590
> >>> [ 2789.014106]  schedule+0x4e/0xb0
> >>> [ 2789.014108]  io_schedule+0x46/0x70
> >>> [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
> >>> [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
> >>> [ 2789.014118]  wait_on_page_bit+0x3f/0x50
> >>> [ 2789.014120]  wait_on_page_writeback+0x26/0x80
> >>> [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
> >>> [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
> >>> [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
> >>> [ 2789.014207]  vfs_fsync_range+0x49/0x80
> >>> [ 2789.014212]  ? __fget_light+0x32/0x80
> >>> [ 2789.014217]  __x64_sys_fsync+0x38/0x60
> >>> [ 2789.014220]  do_syscall_64+0x5c/0xc0
> >>> [ 2789.014223]  ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014225]  ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
> >>> [ 2789.014230]  ? __x64_sys_write+0x19/0x20
> >>> [ 2789.014232]  ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014234]  ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> >>> [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> >>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> >>> 000000000000004a
> >>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> >>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> >>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> >>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> >>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> >>> [ 2789.014255]  </TASK>
> >>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> >>> [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
> >>> ppid:     2 flags:0x00004000
> >>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2910.130176] Call Trace:
> >>> [ 2910.130177]  <TASK>
> >>> [ 2910.130179]  __schedule+0x23d/0x590
> >>> [ 2910.130182]  schedule+0x4e/0xb0
> >>> [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2910.130219]  ? wait_woken+0x70/0x70
> >>> [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2910.130255]  ? __queue_work+0x211/0x480
> >>> [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2910.130285]  process_one_work+0x22b/0x3d0
> >>> [ 2910.130288]  worker_thread+0x53/0x410
> >>> [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2910.130295]  kthread+0x12a/0x150
> >>> [ 2910.130298]  ? set_kthread_struct+0x50/0x50
> >>> [ 2910.130301]  ret_from_fork+0x22/0x30
> >>> [ 2910.130305]  </TASK>
> >>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> >>> [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
> >>> ppid:     2 flags:0x00004000
> >>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> >>> [ 2911.104647] Call Trace:
> >>> [ 2911.104648]  <TASK>
> >>> [ 2911.104649]  __schedule+0x23d/0x590
> >>> [ 2911.104652]  schedule+0x4e/0xb0
> >>> [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2911.104657]  down_read+0x43/0x90
> >>> [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
> >>> [ 2911.104676]  process_one_work+0x22b/0x3d0
> >>> [ 2911.104680]  worker_thread+0x53/0x410
> >>> [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2911.104687]  kthread+0x12a/0x150
> >>> [ 2911.104690]  ? set_kthread_struct+0x50/0x50
> >>> [ 2911.104693]  ret_from_fork+0x22/0x30
> >>> [ 2911.104697]  </TASK>
> >>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> >>> [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
> >>> ppid:     2 flags:0x00004000
> >>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2912.087495] Call Trace:
> >>> [ 2912.087497]  <TASK>
> >>> [ 2912.087498]  __schedule+0x23d/0x590
> >>> [ 2912.087503]  schedule+0x4e/0xb0
> >>> [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2912.087528]  ? wait_woken+0x70/0x70
> >>> [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2912.087565]  ? __queue_work+0x211/0x480
> >>> [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2912.087595]  process_one_work+0x22b/0x3d0
> >>> [ 2912.087599]  worker_thread+0x53/0x410
> >>> [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
> >>> [ 2912.087606]  kthread+0x12a/0x150
> >>> [ 2912.087609]  ? set_kthread_struct+0x50/0x50
> >>> [ 2912.087612]  ret_from_fork+0x22/0x30
> >>> [ 2912.087617]  </TASK>
> >>>
> >>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>
> >>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>
> >>> I had also echoed 0 into congested_read_threshold_us,
> >>> congested_write_threshold_us.
> >>>
> >>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>
> >> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>
> >> Coly Li
> >>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  7:56       ` Nikhil Kshirsagar
@ 2022-06-29  7:59         ` Coly Li
  2022-06-29  8:09         ` Nikhil Kshirsagar
  1 sibling, 0 replies; 14+ messages in thread
From: Coly Li @ 2022-06-29  7:59 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月29日 15:56,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> 60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?


If you may trigger this on upstream kernel or stable kernel, please do it.


Coly Li




> 
> On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
>> 
>> 
>> 
>>> 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>> 
>>> Hi Coly,
>>> 
>>> I see the same bug on upstream kernel when I tried with
>>> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/  (
>>> cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
>> 
>> 
>> Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
>> 
>>> 
>>> Reads seem to trigger it, not writes. So this test triggered it -
>>> 
>>> fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
>>> --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
>>> --rw=randread --group_reporting=1
>>> 
>>> https://pastebin.com/KyVSfnik has all the details.
>> 
>> 
>> OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
>> 
>> Thanks.
>> 
>> Coly Li
>> 
>> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  7:56       ` Nikhil Kshirsagar
  2022-06-29  7:59         ` Coly Li
@ 2022-06-29  8:09         ` Nikhil Kshirsagar
  2022-06-29  8:17           ` Coly Li
  1 sibling, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29  8:09 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly,

Note I used partitions for the bcache as well as the hdd, not sure if
that's a factor.

the kernel is upstream kernel -

# uname -a
Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

lsblk shows the setup,

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sdd           8:48   0 279.4G  0 disk
└─sdd1        8:49   0    60G  0 part
  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
nvme0n1     259:0    0 372.6G  0 disk
└─nvme0n1p1 259:2    0    15G  0 part
  └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount

echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
echo 0 > /sys/fs/bcache/dbadc4cf-feed-42b6-9534-1eeff569a450/congested_read_threshold_us
echo 0 > /sys/fs/bcache/dbadc4cf-feed-42b6-9534-1eeff569a450/congested_write_threshold_us
echo writeback > /sys/block/bcache0/bcache/cache_mode

test that triggers the deadlock -
fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
--ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
--rw=randread --group_reporting=1


[ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than
120 seconds.
[ 4474.050921]       Not tainted 5.15.50-051550-generic #202206251445
[ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4474.731391] task:bcache_writebac state:D stack:    0 pid: 1835
ppid:     2 flags:0x00004000
[ 4474.731408] Call Trace:
[ 4474.731411]  <TASK>
[ 4474.731413]  __schedule+0x23d/0x5a0
[ 4474.731433]  schedule+0x4e/0xb0
[ 4474.731436]  rwsem_down_write_slowpath+0x220/0x3d0
[ 4474.731441]  down_write+0x43/0x50
[ 4474.731446]  bch_writeback_thread+0x78/0x320 [bcache]
[ 4474.731471]  ? read_dirty_submit+0x70/0x70 [bcache]
[ 4474.731487]  kthread+0x12a/0x150
[ 4474.731491]  ? set_kthread_struct+0x50/0x50
[ 4474.731494]  ret_from_fork+0x22/0x30
[ 4474.731499]  </TASK>
[ 4474.731735] INFO: task fio:16626 blocked for more than 121 seconds.
[ 4475.035858]       Not tainted 5.15.50-051550-generic #202206251445
[ 4475.335859] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4475.716526] task:fio             state:D stack:    0 pid:16626
ppid:  1715 flags:0x00004002
[ 4475.716531] Call Trace:
[ 4475.716546]  <TASK>
[ 4475.716549]  __schedule+0x23d/0x5a0
[ 4475.716555]  ? sysvec_apic_timer_interrupt+0x4e/0x90
[ 4475.716560]  schedule+0x4e/0xb0
[ 4475.716563]  rwsem_down_read_slowpath+0x32e/0x380
[ 4475.716567]  down_read+0x43/0x90
[ 4475.716571]  cached_dev_write+0x7e/0x480 [bcache]
[ 4475.716604]  cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4475.716618]  __submit_bio+0x1a1/0x220
[ 4475.716623]  __submit_bio_noacct+0x85/0x1f0
[ 4475.716626]  submit_bio_noacct+0x4e/0x120
[ 4475.716628]  submit_bio+0x4a/0x130
[ 4475.716631]  iomap_submit_ioend+0x53/0x80
[ 4475.716634]  iomap_writepages+0x35/0x40
[ 4475.716636]  xfs_vm_writepages+0x84/0xb0 [xfs]
[ 4475.716730]  do_writepages+0xda/0x200
[ 4475.716736]  filemap_fdatawrite_wbc+0x81/0xd0
[ 4475.716739]  file_write_and_wait_range+0xac/0xf0
[ 4475.716742]  xfs_file_fsync+0x5b/0x250 [xfs]
[ 4475.716821]  vfs_fsync_range+0x49/0x80
[ 4475.716826]  ? __fget_light+0x32/0x80
[ 4475.716829]  __x64_sys_fsync+0x38/0x60
[ 4475.716832]  do_syscall_64+0x5c/0xc0
[ 4475.716836]  ? ksys_write+0xce/0xe0
[ 4475.716838]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 4475.716843]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716847]  ? __x64_sys_write+0x19/0x20
[ 4475.716849]  ? do_syscall_64+0x69/0xc0
[ 4475.716851]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716855]  ? __x64_sys_write+0x19/0x20
[ 4475.716857]  ? do_syscall_64+0x69/0xc0
[ 4475.716859]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 4475.716862]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716865]  ? __x64_sys_write+0x19/0x20
[ 4475.716867]  ? do_syscall_64+0x69/0xc0
[ 4475.716869]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716873]  ? __x64_sys_write+0x19/0x20
[ 4475.716875]  ? do_syscall_64+0x69/0xc0
[ 4475.716877]  ? do_syscall_64+0x69/0xc0
[ 4475.716879]  ? asm_common_interrupt+0x8/0x40
[ 4475.716883]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 4475.716885] RIP: 0033:0x7fd3c498aa5b
[ 4475.716888] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 4475.716890] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b
[ 4475.716892] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006
[ 4475.716893] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40
[ 4475.716895] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000
[ 4475.716896] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000
[ 4475.716898]  </TASK>
[ 4475.716912] INFO: task kworker/26:87:16682 blocked for more than 122 seconds.
[ 4476.063391]       Not tainted 5.15.50-051550-generic #202206251445
[ 4476.363476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4476.743988] task:kworker/26:87   state:D stack:    0 pid:16682
ppid:     2 flags:0x00004000
[ 4476.743994] Workqueue: events update_writeback_rate [bcache]
[ 4476.744022] Call Trace:
[ 4476.744024]  <TASK>
[ 4476.744026]  __schedule+0x23d/0x5a0
[ 4476.744032]  schedule+0x4e/0xb0
[ 4476.744035]  rwsem_down_read_slowpath+0x32e/0x380
[ 4476.744039]  down_read+0x43/0x90
[ 4476.744042]  update_writeback_rate+0x134/0x190 [bcache]
[ 4476.744057]  process_one_work+0x22b/0x3d0
[ 4476.744063]  worker_thread+0x53/0x410
[ 4476.744066]  ? process_one_work+0x3d0/0x3d0
[ 4476.744070]  kthread+0x12a/0x150
[ 4476.744074]  ? set_kthread_struct+0x50/0x50
[ 4476.744077]  ret_from_fork+0x22/0x30
[ 4476.744082]  </TASK>
[ 4598.626220] INFO: task bcache_writebac:1835 blocked for more than
245 seconds.
[ 4598.978432]       Not tainted 5.15.50-051550-generic #202206251445
[ 4599.279900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4599.657293] task:bcache_writebac state:D stack:    0 pid: 1835
ppid:     2 flags:0x00004000
[ 4599.657298] Call Trace:
[ 4599.657300]  <TASK>
[ 4599.657302]  __schedule+0x23d/0x5a0
[ 4599.657325]  schedule+0x4e/0xb0
[ 4599.657328]  rwsem_down_write_slowpath+0x220/0x3d0
[ 4599.657333]  down_write+0x43/0x50
[ 4599.657338]  bch_writeback_thread+0x78/0x320 [bcache]
[ 4599.657374]  ? read_dirty_submit+0x70/0x70 [bcache]
[ 4599.657390]  kthread+0x12a/0x150
[ 4599.657393]  ? set_kthread_struct+0x50/0x50
[ 4599.657396]  ret_from_fork+0x22/0x30
[ 4599.657401]  </TASK>
[ 4599.657434] INFO: task kworker/2:63:14895 blocked for more than 124 seconds.
[ 4600.000498]       Not tainted 5.15.50-051550-generic #202206251445
[ 4600.301188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4600.681984] task:kworker/2:63    state:D stack:    0 pid:14895
ppid:     2 flags:0x00004000
[ 4600.682005] Workqueue: bcache bch_data_insert_start [bcache]
[ 4600.682031] Call Trace:
[ 4600.682033]  <TASK>
[ 4600.682035]  __schedule+0x23d/0x5a0
[ 4600.682040]  schedule+0x4e/0xb0
[ 4600.682043]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 4600.682066]  ? wait_woken+0x70/0x70
[ 4600.682071]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 4600.682082]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 4600.682098]  ? krealloc+0x9d/0xd0
[ 4600.682102]  ? __bch_keylist_realloc+0xb7/0x100 [bcache]
[ 4600.682113]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
[ 4600.682126]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 4600.682139]  ? closure_sub+0x94/0xb0 [bcache]
[ 4600.682152]  process_one_work+0x22b/0x3d0
[ 4600.682157]  worker_thread+0x53/0x410
[ 4600.682161]  ? process_one_work+0x3d0/0x3d0
[ 4600.682164]  kthread+0x12a/0x150
[ 4600.682168]  ? set_kthread_struct+0x50/0x50
[ 4600.682171]  ret_from_fork+0x22/0x30
[ 4600.682176]  </TASK>
[ 4600.682240] INFO: task kworker/25:123:16464 blocked for more than
125 seconds.
[ 4601.033116]       Not tainted 5.15.50-051550-generic #202206251445
[ 4601.333405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4601.713829] task:kworker/25:123  state:D stack:    0 pid:16464
ppid:     2 flags:0x00004000
[ 4601.713835] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
[ 4601.713975] Call Trace:
[ 4601.713977]  <TASK>
[ 4601.713979]  __schedule+0x23d/0x5a0
[ 4601.713986]  schedule+0x4e/0xb0
[ 4601.713988]  rwsem_down_read_slowpath+0x32e/0x380
[ 4601.713993]  down_read+0x43/0x90
[ 4601.713996]  cached_dev_write+0x7e/0x480 [bcache]
[ 4601.714018]  cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4601.714031]  __submit_bio+0x1a1/0x220
[ 4601.714035]  ? ttwu_do_activate+0x72/0xf0
[ 4601.714039]  __submit_bio_noacct+0x85/0x1f0
[ 4601.714042]  ? mutex_lock+0x13/0x40
[ 4601.714045]  submit_bio_noacct+0x4e/0x120
[ 4601.714047]  submit_bio+0x4a/0x130
[ 4601.714050]  xlog_write_iclog+0x254/0x300 [xfs]
[ 4601.714142]  xlog_sync+0x1ab/0x2c0 [xfs]
[ 4601.714225]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
[ 4601.714306]  xfs_log_force+0x186/0x250 [xfs]
[ 4601.714386]  xfs_log_worker+0x39/0x90 [xfs]
[ 4601.714466]  process_one_work+0x22b/0x3d0
[ 4601.714471]  worker_thread+0x53/0x410
[ 4601.714474]  ? process_one_work+0x3d0/0x3d0
[ 4601.714477]  kthread+0x12a/0x150
[ 4601.714481]  ? set_kthread_struct+0x50/0x50
[ 4601.714484]  ret_from_fork+0x22/0x30
[ 4601.714489]  </TASK>
[ 4601.714526] INFO: task kworker/22:89:16557 blocked for more than 126 seconds.
[ 4602.056844]       Not tainted 5.15.50-051550-generic #202206251445
[ 4602.357148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4602.733923] task:kworker/22:89   state:D stack:    0 pid:16557
ppid:     2 flags:0x00004000
[ 4602.733940] Workqueue: bcache bch_data_insert_start [bcache]
[ 4602.733966] Call Trace:
[ 4602.733968]  <TASK>
[ 4602.733970]  __schedule+0x23d/0x5a0
[ 4602.733988]  schedule+0x4e/0xb0
[ 4602.733991]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 4602.734002]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 4602.734014]  ? wait_woken+0x70/0x70
[ 4602.734019]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 4602.734031]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 4602.734043]  ? __wake_up_common_lock+0x8a/0xc0
[ 4602.734046]  bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 4602.734062]  ? closure_sub+0x94/0xb0 [bcache]
[ 4602.734083]  process_one_work+0x22b/0x3d0
[ 4602.734088]  worker_thread+0x53/0x410
[ 4602.734092]  ? process_one_work+0x3d0/0x3d0
[ 4602.734095]  kthread+0x12a/0x150
[ 4602.734099]  ? set_kthread_struct+0x50/0x50
[ 4602.734102]  ret_from_fork+0x22/0x30
[ 4602.734108]  </TASK>
[ 4602.734134] INFO: task fio:16626 blocked for more than 249 seconds.
[ 4603.034541]       Not tainted 5.15.50-051550-generic #202206251445
[ 4603.330400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4603.711013] task:fio             state:D stack:    0 pid:16626
ppid:  1715 flags:0x00004002
[ 4603.711030] Call Trace:
[ 4603.711032]  <TASK>
[ 4603.711046]  __schedule+0x23d/0x5a0
[ 4603.711054]  ? sysvec_apic_timer_interrupt+0x4e/0x90
[ 4603.711060]  schedule+0x4e/0xb0
[ 4603.711063]  rwsem_down_read_slowpath+0x32e/0x380
[ 4603.711067]  down_read+0x43/0x90
[ 4603.711072]  cached_dev_write+0x7e/0x480 [bcache]
[ 4603.711095]  cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4603.711108]  __submit_bio+0x1a1/0x220
[ 4603.711113]  __submit_bio_noacct+0x85/0x1f0
[ 4603.711116]  submit_bio_noacct+0x4e/0x120
[ 4603.711118]  submit_bio+0x4a/0x130
[ 4603.711121]  iomap_submit_ioend+0x53/0x80
[ 4603.711124]  iomap_writepages+0x35/0x40
[ 4603.711126]  xfs_vm_writepages+0x84/0xb0 [xfs]
[ 4603.711223]  do_writepages+0xda/0x200
[ 4603.711229]  filemap_fdatawrite_wbc+0x81/0xd0
[ 4603.711232]  file_write_and_wait_range+0xac/0xf0
[ 4603.711235]  xfs_file_fsync+0x5b/0x250 [xfs]
[ 4603.711312]  vfs_fsync_range+0x49/0x80
[ 4603.711317]  ? __fget_light+0x32/0x80
[ 4603.711321]  __x64_sys_fsync+0x38/0x60
[ 4603.711324]  do_syscall_64+0x5c/0xc0
[ 4603.711327]  ? ksys_write+0xce/0xe0
[ 4603.711330]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 4603.711335]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711338]  ? __x64_sys_write+0x19/0x20
[ 4603.711340]  ? do_syscall_64+0x69/0xc0
[ 4603.711342]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711345]  ? __x64_sys_write+0x19/0x20
[ 4603.711347]  ? do_syscall_64+0x69/0xc0
[ 4603.711349]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 4603.711352]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711355]  ? __x64_sys_write+0x19/0x20
[ 4603.711357]  ? do_syscall_64+0x69/0xc0
[ 4603.711359]  ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711362]  ? __x64_sys_write+0x19/0x20
[ 4603.711364]  ? do_syscall_64+0x69/0xc0
[ 4603.711366]  ? do_syscall_64+0x69/0xc0
[ 4603.711368]  ? asm_common_interrupt+0x8/0x40
[ 4603.711371]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 4603.711374] RIP: 0033:0x7fd3c498aa5b
[ 4603.711376] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 4603.711379] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b
[ 4603.711381] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006
[ 4603.711382] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40
[ 4603.711383] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000
[ 4603.711384] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000
[ 4603.711387]  </TASK>
[ 4603.711421] INFO: task kworker/26:87:16682 blocked for more than 250 seconds.
[ 4604.058441]       Not tainted 5.15.50-051550-generic #202206251445
[ 4604.359128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4604.739950] task:kworker/26:87   state:D stack:    0 pid:16682
ppid:     2 flags:0x00004000
[ 4604.739968] Workqueue: events update_writeback_rate [bcache]
[ 4604.739995] Call Trace:
[ 4604.739997]  <TASK>
[ 4604.740012]  __schedule+0x23d/0x5a0
[ 4604.740017]  schedule+0x4e/0xb0
[ 4604.740020]  rwsem_down_read_slowpath+0x32e/0x380
[ 4604.740024]  down_read+0x43/0x90
[ 4604.740027]  update_writeback_rate+0x134/0x190 [bcache]
[ 4604.740043]  process_one_work+0x22b/0x3d0
[ 4604.740049]  worker_thread+0x53/0x410
[ 4604.740052]  ? process_one_work+0x3d0/0x3d0
[ 4604.740055]  kthread+0x12a/0x150
[ 4604.740059]  ? set_kthread_struct+0x50/0x50
[ 4604.740062]  ret_from_fork+0x22/0x30
[ 4604.740067]  </TASK>
[ 4725.600602] INFO: task bcache_writebac:1835 blocked for more than
372 seconds.
[ 4725.952827]       Not tainted 5.15.50-051550-generic #202206251445
[ 4726.253763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4726.630937] task:bcache_writebac state:D stack:    0 pid: 1835
ppid:     2 flags:0x00004000
[ 4726.630954] Call Trace:
[ 4726.630956]  <TASK>
[ 4726.630959]  __schedule+0x23d/0x5a0
[ 4726.630966]  schedule+0x4e/0xb0
[ 4726.630970]  rwsem_down_write_slowpath+0x220/0x3d0
[ 4726.630976]  down_write+0x43/0x50
[ 4726.630980]  bch_writeback_thread+0x78/0x320 [bcache]
[ 4726.631005]  ? read_dirty_submit+0x70/0x70 [bcache]
[ 4726.631021]  kthread+0x12a/0x150
[ 4726.631025]  ? set_kthread_struct+0x50/0x50
[ 4726.631028]  ret_from_fork+0x22/0x30
[ 4726.631033]  </TASK>

Regards,
Nikhil.

On Wed, 29 Jun 2022 at 13:26, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> 60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?
>
> On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
> >
> >
> >
> > > 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >
> > > Hi Coly,
> > >
> > > I see the same bug on upstream kernel when I tried with
> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/  (
> > > cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
> >
> >
> > Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
> >
> > >
> > > Reads seem to trigger it, not writes. So this test triggered it -
> > >
> > > fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> > > --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> > > --rw=randread --group_reporting=1
> > >
> > > https://pastebin.com/KyVSfnik has all the details.
> >
> >
> > OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
> >
> > Thanks.
> >
> > Coly Li
> >
> >
> > >
> > > Regards,
> > > Nikhil.
> > >
> > > On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
> > >>
> > >>
> > >>
> > >>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >>>
> > >>> Hi Coly,
> > >>>
> > >>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
> > >>>
> > >>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> > >>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> > >>> --name=iops-test-job --eta-newline=1
> > >>>
> > >>> # lsblk /dev/sdc /dev/nvme0n1
> > >>> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
> > >>> sdc           8:32   0 279.4G  0 disk
> > >>> └─sdc1        8:33   0    60G  0 part
> > >>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> > >>> nvme0n1     259:0    0 372.6G  0 disk
> > >>> └─nvme0n1p1 259:2    0    15G  0 part
> > >>> └─bcache0 252:0    0    60G  0 disk /home/ubuntu/bcache_mount
> > >>>
> > >>> and am seeing this trace in the dmesg,
> > >>>
> > >>> [ 2475.034909] XFS (bcache0): Ending clean mount
> > >>> [ 2475.036238] xfs filesystem being mounted at
> > >>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> > >>>
> > >>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> > >>> [ 2782.507171]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2783.155206] task:kworker/2:1     state:D stack:    0 pid:  255
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2783.155259] Call Trace:
> > >>> [ 2783.155261]  <TASK>
> > >>> [ 2783.155263]  __schedule+0x23d/0x590
> > >>> [ 2783.155269]  schedule+0x4e/0xb0
> > >>> [ 2783.155271]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2783.155281]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2783.155294]  ? wait_woken+0x70/0x70
> > >>> [ 2783.155298]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2783.155309]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2783.155319]  ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2783.155331]  ? __queue_work+0x211/0x480
> > >>> [ 2783.155336]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2783.155349]  ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2783.155362]  process_one_work+0x22b/0x3d0
> > >>> [ 2783.155366]  worker_thread+0x53/0x410
> > >>> [ 2783.155369]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2783.155372]  kthread+0x12a/0x150
> > >>> [ 2783.155376]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2783.155379]  ret_from_fork+0x22/0x30
> > >>> [ 2783.155385]  </TASK>
> > >>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> > >>> [ 2783.485797]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2784.133541] task:kworker/3:1     state:D stack:    0 pid:  267
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> > >>> [ 2784.133588] Call Trace:
> > >>> [ 2784.133589]  <TASK>
> > >>> [ 2784.133591]  __schedule+0x23d/0x590
> > >>> [ 2784.133594]  schedule+0x4e/0xb0
> > >>> [ 2784.133596]  rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2784.133600]  down_read+0x43/0x90
> > >>> [ 2784.133602]  update_writeback_rate+0x134/0x190 [bcache]
> > >>> [ 2784.133619]  process_one_work+0x22b/0x3d0
> > >>> [ 2784.133623]  worker_thread+0x53/0x410
> > >>> [ 2784.133626]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2784.133630]  kthread+0x12a/0x150
> > >>> [ 2784.133633]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2784.133636]  ret_from_fork+0x22/0x30
> > >>> [ 2784.133640]  </TASK>
> > >>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> > >>> [ 2784.467880]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2785.114677] task:kworker/25:2    state:D stack:    0 pid:  326
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2785.114720] Call Trace:
> > >>> [ 2785.114721]  <TASK>
> > >>> [ 2785.114723]  __schedule+0x23d/0x590
> > >>> [ 2785.114726]  schedule+0x4e/0xb0
> > >>> [ 2785.114729]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2785.114740]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2785.114753]  ? wait_woken+0x70/0x70
> > >>> [ 2785.114756]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2785.114766]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2785.114777]  ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2785.114789]  ? __queue_work+0x211/0x480
> > >>> [ 2785.114793]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2785.114806]  ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2785.114818]  process_one_work+0x22b/0x3d0
> > >>> [ 2785.114822]  worker_thread+0x53/0x410
> > >>> [ 2785.114826]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2785.114829]  kthread+0x12a/0x150
> > >>> [ 2785.114832]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2785.114835]  ret_from_fork+0x22/0x30
> > >>> [ 2785.114839]  </TASK>
> > >>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> > >>> [ 2785.444697]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2786.091440] task:kworker/6:0     state:D stack:    0 pid: 3038
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> > >>> [ 2786.091582] Call Trace:
> > >>> [ 2786.091584]  <TASK>
> > >>> [ 2786.091585]  __schedule+0x23d/0x590
> > >>> [ 2786.091589]  schedule+0x4e/0xb0
> > >>> [ 2786.091591]  rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2786.091594]  down_read+0x43/0x90
> > >>> [ 2786.091597]  cached_dev_write+0x7e/0x480 [bcache]
> > >>> [ 2786.091613]  cached_dev_submit_bio+0x4ef/0x540 [bcache]
> > >>> [ 2786.091626]  __submit_bio+0x1a1/0x220
> > >>> [ 2786.091631]  __submit_bio_noacct+0x85/0x1f0
> > >>> [ 2786.091634]  ? mutex_lock+0x13/0x40
> > >>> [ 2786.091637]  submit_bio_noacct+0x4e/0x120
> > >>> [ 2786.091640]  submit_bio+0x4a/0x130
> > >>> [ 2786.091642]  xlog_write_iclog+0x254/0x300 [xfs]
> > >>> [ 2786.091725]  xlog_sync+0x1ab/0x2c0 [xfs]
> > >>> [ 2786.091807]  xlog_state_release_iclog+0x123/0x1d0 [xfs]
> > >>> [ 2786.091925]  xfs_log_force+0x186/0x250 [xfs]
> > >>> [ 2786.091994]  xfs_log_worker+0x39/0x90 [xfs]
> > >>> [ 2786.092063]  process_one_work+0x22b/0x3d0
> > >>> [ 2786.092067]  worker_thread+0x53/0x410
> > >>> [ 2786.092069]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2786.092072]  kthread+0x12a/0x150
> > >>> [ 2786.092074]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2786.092077]  ret_from_fork+0x22/0x30
> > >>> [ 2786.092081]  </TASK>
> > >>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> > >>> [ 2786.430313]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2787.077056] task:kworker/25:0    state:D stack:    0 pid: 3047
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2787.077098] Call Trace:
> > >>> [ 2787.077099]  <TASK>
> > >>> [ 2787.077101]  __schedule+0x23d/0x590
> > >>> [ 2787.077104]  schedule+0x4e/0xb0
> > >>> [ 2787.077106]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2787.077116]  ? wait_woken+0x70/0x70
> > >>> [ 2787.077119]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2787.077142]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2787.077153]  ? __bch_submit_bbio+0x97/0xb0 [bcache]
> > >>> [ 2787.077166]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2787.077179]  ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2787.077191]  process_one_work+0x22b/0x3d0
> > >>> [ 2787.077195]  worker_thread+0x53/0x410
> > >>> [ 2787.077198]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2787.077202]  kthread+0x12a/0x150
> > >>> [ 2787.077205]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2787.077207]  ret_from_fork+0x22/0x30
> > >>> [ 2787.077212]  </TASK>
> > >>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> > >>> 125 seconds.
> > >>> [ 2787.423890]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2788.071063] task:bcache_writebac state:D stack:    0 pid: 3080
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2788.071080] Call Trace:
> > >>> [ 2788.071081]  <TASK>
> > >>> [ 2788.071082]  __schedule+0x23d/0x590
> > >>> [ 2788.071086]  schedule+0x4e/0xb0
> > >>> [ 2788.071100]  rwsem_down_write_slowpath+0x220/0x3d0
> > >>> [ 2788.071105]  ? del_timer_sync+0x6c/0xb0
> > >>> [ 2788.071109]  down_write+0x43/0x50
> > >>> [ 2788.071112]  bch_writeback_thread+0x78/0x320 [bcache]
> > >>> [ 2788.071142]  ? read_dirty+0x5a0/0x5a0 [bcache]
> > >>> [ 2788.071158]  kthread+0x12a/0x150
> > >>> [ 2788.071161]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2788.071164]  ret_from_fork+0x22/0x30
> > >>> [ 2788.071168]  </TASK>
> > >>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> > >>> [ 2788.367175]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2789.014084] task:fio             state:D stack:    0 pid: 3123
> > >>> ppid:  2378 flags:0x00004002
> > >>> [ 2789.014087] Call Trace:
> > >>> [ 2789.014089]  <TASK>
> > >>> [ 2789.014090]  __schedule+0x23d/0x590
> > >>> [ 2789.014106]  schedule+0x4e/0xb0
> > >>> [ 2789.014108]  io_schedule+0x46/0x70
> > >>> [ 2789.014110]  wait_on_page_bit_common+0x10c/0x3d0
> > >>> [ 2789.014115]  ? filemap_invalidate_unlock_two+0x40/0x40
> > >>> [ 2789.014118]  wait_on_page_bit+0x3f/0x50
> > >>> [ 2789.014120]  wait_on_page_writeback+0x26/0x80
> > >>> [ 2789.014124]  __filemap_fdatawait_range+0x97/0x110
> > >>> [ 2789.014126]  file_write_and_wait_range+0xcc/0xf0
> > >>> [ 2789.014130]  xfs_file_fsync+0x5b/0x250 [xfs]
> > >>> [ 2789.014207]  vfs_fsync_range+0x49/0x80
> > >>> [ 2789.014212]  ? __fget_light+0x32/0x80
> > >>> [ 2789.014217]  __x64_sys_fsync+0x38/0x60
> > >>> [ 2789.014220]  do_syscall_64+0x5c/0xc0
> > >>> [ 2789.014223]  ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014225]  ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014226]  ? syscall_exit_to_user_mode+0x27/0x50
> > >>> [ 2789.014230]  ? __x64_sys_write+0x19/0x20
> > >>> [ 2789.014232]  ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014234]  ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014235]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> > >>> [ 2789.014239]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> > >>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> > >>> 000000000000004a
> > >>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> > >>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> > >>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> > >>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> > >>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> > >>> [ 2789.014255]  </TASK>
> > >>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> > >>> [ 2909.481856]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2910.130132] task:kworker/2:1     state:D stack:    0 pid:  255
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2910.130176] Call Trace:
> > >>> [ 2910.130177]  <TASK>
> > >>> [ 2910.130179]  __schedule+0x23d/0x590
> > >>> [ 2910.130182]  schedule+0x4e/0xb0
> > >>> [ 2910.130184]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2910.130194]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2910.130219]  ? wait_woken+0x70/0x70
> > >>> [ 2910.130222]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2910.130233]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2910.130243]  ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2910.130255]  ? __queue_work+0x211/0x480
> > >>> [ 2910.130259]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2910.130272]  ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2910.130285]  process_one_work+0x22b/0x3d0
> > >>> [ 2910.130288]  worker_thread+0x53/0x410
> > >>> [ 2910.130292]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2910.130295]  kthread+0x12a/0x150
> > >>> [ 2910.130298]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2910.130301]  ret_from_fork+0x22/0x30
> > >>> [ 2910.130305]  </TASK>
> > >>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> > >>> [ 2910.456520]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2911.104601] task:kworker/3:1     state:D stack:    0 pid:  267
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> > >>> [ 2911.104647] Call Trace:
> > >>> [ 2911.104648]  <TASK>
> > >>> [ 2911.104649]  __schedule+0x23d/0x590
> > >>> [ 2911.104652]  schedule+0x4e/0xb0
> > >>> [ 2911.104654]  rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2911.104657]  down_read+0x43/0x90
> > >>> [ 2911.104660]  update_writeback_rate+0x134/0x190 [bcache]
> > >>> [ 2911.104676]  process_one_work+0x22b/0x3d0
> > >>> [ 2911.104680]  worker_thread+0x53/0x410
> > >>> [ 2911.104683]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2911.104687]  kthread+0x12a/0x150
> > >>> [ 2911.104690]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2911.104693]  ret_from_fork+0x22/0x30
> > >>> [ 2911.104697]  </TASK>
> > >>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> > >>> [ 2911.439086]       Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2912.087471] task:kworker/25:2    state:D stack:    0 pid:  326
> > >>> ppid:     2 flags:0x00004000
> > >>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2912.087495] Call Trace:
> > >>> [ 2912.087497]  <TASK>
> > >>> [ 2912.087498]  __schedule+0x23d/0x590
> > >>> [ 2912.087503]  schedule+0x4e/0xb0
> > >>> [ 2912.087505]  bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2912.087516]  ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2912.087528]  ? wait_woken+0x70/0x70
> > >>> [ 2912.087532]  __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2912.087543]  bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2912.087553]  ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2912.087565]  ? __queue_work+0x211/0x480
> > >>> [ 2912.087570]  bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2912.087583]  ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2912.087595]  process_one_work+0x22b/0x3d0
> > >>> [ 2912.087599]  worker_thread+0x53/0x410
> > >>> [ 2912.087602]  ? process_one_work+0x3d0/0x3d0
> > >>> [ 2912.087606]  kthread+0x12a/0x150
> > >>> [ 2912.087609]  ? set_kthread_struct+0x50/0x50
> > >>> [ 2912.087612]  ret_from_fork+0x22/0x30
> > >>> [ 2912.087617]  </TASK>
> > >>>
> > >>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> > >>>
> > >>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> > >>>
> > >>> I had also echoed 0 into congested_read_threshold_us,
> > >>> congested_write_threshold_us.
> > >>>
> > >>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> > >>
> > >> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> > >>
> > >> Coly Li
> > >>
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  8:09         ` Nikhil Kshirsagar
@ 2022-06-29  8:17           ` Coly Li
  2022-06-29 10:33             ` Nikhil Kshirsagar
  0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29  8:17 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> Hi Coly,
> 
> Note I used partitions for the bcache as well as the hdd, not sure if
> that's a factor.
> 
> the kernel is upstream kernel -
> 
> # uname -a
> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux


Hi Nikhil,

I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.

The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.

Thanks.

Coly Li


>>>>>> 

[snipped]
>>>>>> 
>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>> 
>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>> 
>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>> congested_write_threshold_us.
>>>>>> 
>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>> 
>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>> 
>>>>> Coly Li
>>>>> 
>>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29  8:17           ` Coly Li
@ 2022-06-29 10:33             ` Nikhil Kshirsagar
  2022-06-29 12:30               ` Nikhil Kshirsagar
  2022-06-29 13:00               ` Coly Li
  0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 10:33 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly,

please use this as a reference, this is the source code to refer to
for the kernel i was able to reproduce this on
(https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)


commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Sat Jun 25 15:18:40 2022 +0200

    Linux 5.15.50

    Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

I am able to reproduce this bug on older kernels too, like 5.4.0-121.
I will also test the latest upstream kernel soon.

Regards,
Nikhil.

On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > Note I used partitions for the bcache as well as the hdd, not sure if
> > that's a factor.
> >
> > the kernel is upstream kernel -
> >
> > # uname -a
> > Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> > 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Hi Nikhil,
>
> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>
> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>
> Thanks.
>
> Coly Li
>
>
> >>>>>>
>
> [snipped]
> >>>>>>
> >>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>>>>
> >>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>>>>
> >>>>>> I had also echoed 0 into congested_read_threshold_us,
> >>>>>> congested_write_threshold_us.
> >>>>>>
> >>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>>>>
> >>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>>>>
> >>>>> Coly Li
> >>>>>
> >>>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29 10:33             ` Nikhil Kshirsagar
@ 2022-06-29 12:30               ` Nikhil Kshirsagar
  2022-06-29 13:03                 ` Coly Li
  2022-06-29 13:00               ` Coly Li
  1 sibling, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 12:30 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly,

I am not able to see this issue on  5.18.0-051800-generic
(https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
guessing its been fixed since 5.15.50 sometime. If you are aware of
which commit may be the fix please let me know. I will keep testing on
5.18.0-051800-generic to see if I see it ever.

Regards,
Nikhil.

On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> Hi Coly,
>
> please use this as a reference, this is the source code to refer to
> for the kernel i was able to reproduce this on
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
>
>
> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Date:   Sat Jun 25 15:18:40 2022 +0200
>
>     Linux 5.15.50
>
>     Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
>     Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>     Tested-by: Shuah Khan <skhan@linuxfoundation.org>
>     Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
>     Tested-by: Jon Hunter <jonathanh@nvidia.com>
>     Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
>     Tested-by: Ron Economos <re@w6rz.net>
>     Tested-by: Guenter Roeck <linux@roeck-us.net>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> I will also test the latest upstream kernel soon.
>
> Regards,
> Nikhil.
>
> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
> >
> >
> >
> > > 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >
> > > Hi Coly,
> > >
> > > Note I used partitions for the bcache as well as the hdd, not sure if
> > > that's a factor.
> > >
> > > the kernel is upstream kernel -
> > >
> > > # uname -a
> > > Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> > > 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> >
> >
> > Hi Nikhil,
> >
> > I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
> >
> > The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
> >
> > Thanks.
> >
> > Coly Li
> >
> >
> > >>>>>>
> >
> > [snipped]
> > >>>>>>
> > >>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> > >>>>>>
> > >>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> > >>>>>>
> > >>>>>> I had also echoed 0 into congested_read_threshold_us,
> > >>>>>> congested_write_threshold_us.
> > >>>>>>
> > >>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> > >>>>>
> > >>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> > >>>>>
> > >>>>> Coly Li
> > >>>>>
> > >>>
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29 10:33             ` Nikhil Kshirsagar
  2022-06-29 12:30               ` Nikhil Kshirsagar
@ 2022-06-29 13:00               ` Coly Li
  1 sibling, 0 replies; 14+ messages in thread
From: Coly Li @ 2022-06-29 13:00 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月29日 18:33,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> Hi Coly,
> 
> please use this as a reference, this is the source code to refer to
> for the kernel i was able to reproduce this on
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
> 
> 

Oh, I didn’t know this. Thanks for the hint.


> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Date:   Sat Jun 25 15:18:40 2022 +0200
> 
>    Linux 5.15.50
> 
>    Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
>    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
>    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
>    Tested-by: Jon Hunter <jonathanh@nvidia.com>
>    Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
>    Tested-by: Ron Economos <re@w6rz.net>
>    Tested-by: Guenter Roeck <linux@roeck-us.net>
>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> I will also test the latest upstream kernel soon.
> 


Thanks.

Coly Li


> Regards,
> Nikhil.
> 
> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>> 
>> 
>> 
>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>> 
>>> Hi Coly,
>>> 
>>> Note I used partitions for the bcache as well as the hdd, not sure if
>>> that's a factor.
>>> 
>>> the kernel is upstream kernel -
>>> 
>>> # uname -a
>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>> 
>> 
>> Hi Nikhil,
>> 
>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>> 
>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>> 
>> Thanks.
>> 
>> Coly Li
>> 
>> 
>>>>>>>> 
>> 
>> [snipped]
>>>>>>>> 
>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>>>> 
>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>>>> 
>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>>>> congested_write_threshold_us.
>>>>>>>> 
>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>>>> 
>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>>>> 
>>>>>>> Coly Li
>>>>>>> 
>>>>> 
>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29 12:30               ` Nikhil Kshirsagar
@ 2022-06-29 13:03                 ` Coly Li
  2022-07-07  9:22                   ` Nikhil Kshirsagar
  0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29 13:03 UTC (permalink / raw)
  To: Nikhil Kshirsagar; +Cc: linux-bcache



> 2022年6月29日 20:30,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> 
> Hi Coly,
> 
> I am not able to see this issue on  5.18.0-051800-generic
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
> guessing its been fixed since 5.15.50 sometime. If you are aware of
> which commit may be the fix please let me know. I will keep testing on
> 5.18.0-051800-generic to see if I see it ever.


Hi Nikhil,

Thanks for the update. Recently there are several fixes go into stable 5.15.51 (but still one patch I just submitted to Greg yesterday, maybe it will show up in future 5.15.52).

Maybe you can try stable tree 5.15.51 with appying one more patch,
commit 7d6b902ea0e0 (“bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()”)

BTW, the journal no-space deadlock fix is in 5.15.51 now.

Coly Li

> 
> On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>> 
>> Hi Coly,
>> 
>> please use this as a reference, this is the source code to refer to
>> for the kernel i was able to reproduce this on
>> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
>> 
>> 
>> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
>> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Date:   Sat Jun 25 15:18:40 2022 +0200
>> 
>>    Linux 5.15.50
>> 
>>    Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
>>    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>>    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
>>    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
>>    Tested-by: Jon Hunter <jonathanh@nvidia.com>
>>    Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
>>    Tested-by: Ron Economos <re@w6rz.net>
>>    Tested-by: Guenter Roeck <linux@roeck-us.net>
>>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> 
>> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
>> I will also test the latest upstream kernel soon.
>> 
>> Regards,
>> Nikhil.
>> 
>> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>>> 
>>> 
>>> 
>>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>>> 
>>>> Hi Coly,
>>>> 
>>>> Note I used partitions for the bcache as well as the hdd, not sure if
>>>> that's a factor.
>>>> 
>>>> the kernel is upstream kernel -
>>>> 
>>>> # uname -a
>>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
>>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>>> 
>>> 
>>> Hi Nikhil,
>>> 
>>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>>> 
>>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>>> 
>>> Thanks.
>>> 
>>> Coly Li
>>> 
>>> 
>>>>>>>>> 
>>> 
>>> [snipped]
>>>>>>>>> 
>>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>>>>> 
>>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>>>>> 
>>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>>>>> congested_write_threshold_us.
>>>>>>>>> 
>>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>>>>> 
>>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>>>>> 
>>>>>>>> Coly Li
>>>>>>>> 
>>>>>> 
>>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: seeing this stace trace on kernel 5.15
  2022-06-29 13:03                 ` Coly Li
@ 2022-07-07  9:22                   ` Nikhil Kshirsagar
  0 siblings, 0 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-07-07  9:22 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly,

Yes, 7d6b902ea0e0 seems to be the fix for it. I cannot reproduce this
deadlock when I build this patch into 5.15.50.

Thanks for pointing me to this patch!

Regards,
Nikhil.


On Wed, 29 Jun 2022 at 18:33, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 20:30,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I am not able to see this issue on  5.18.0-051800-generic
> > (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
> > guessing its been fixed since 5.15.50 sometime. If you are aware of
> > which commit may be the fix please let me know. I will keep testing on
> > 5.18.0-051800-generic to see if I see it ever.
>
>
> Hi Nikhil,
>
> Thanks for the update. Recently there are several fixes go into stable 5.15.51 (but still one patch I just submitted to Greg yesterday, maybe it will show up in future 5.15.52).
>
> Maybe you can try stable tree 5.15.51 with appying one more patch,
> commit 7d6b902ea0e0 (“bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()”)
>
> BTW, the journal no-space deadlock fix is in 5.15.51 now.
>
> Coly Li
>
> >
> > On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> >>
> >> Hi Coly,
> >>
> >> please use this as a reference, this is the source code to refer to
> >> for the kernel i was able to reproduce this on
> >> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
> >>
> >>
> >> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> >> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Date:   Sat Jun 25 15:18:40 2022 +0200
> >>
> >>    Linux 5.15.50
> >>
> >>    Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
> >>    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> >>    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> >>    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
> >>    Tested-by: Jon Hunter <jonathanh@nvidia.com>
> >>    Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> >>    Tested-by: Ron Economos <re@w6rz.net>
> >>    Tested-by: Guenter Roeck <linux@roeck-us.net>
> >>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >>
> >> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> >> I will also test the latest upstream kernel soon.
> >>
> >> Regards,
> >> Nikhil.
> >>
> >> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
> >>>
> >>>
> >>>
> >>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >>>>
> >>>> Hi Coly,
> >>>>
> >>>> Note I used partitions for the bcache as well as the hdd, not sure if
> >>>> that's a factor.
> >>>>
> >>>> the kernel is upstream kernel -
> >>>>
> >>>> # uname -a
> >>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> >>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> >>>
> >>>
> >>> Hi Nikhil,
> >>>
> >>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
> >>>
> >>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
> >>>
> >>> Thanks.
> >>>
> >>> Coly Li
> >>>
> >>>
> >>>>>>>>>
> >>>
> >>> [snipped]
> >>>>>>>>>
> >>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>>>>>>>
> >>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>>>>>>>
> >>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
> >>>>>>>>> congested_write_threshold_us.
> >>>>>>>>>
> >>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>>>>>>>
> >>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>>>>>>>
> >>>>>>>> Coly Li
> >>>>>>>>
> >>>>>>
> >>>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-07-07  9:22 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-28  6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
2022-06-28  7:23 ` Nikhil Kshirsagar
2022-06-28 14:37 ` Coly Li
2022-06-29  7:02   ` Nikhil Kshirsagar
2022-06-29  7:50     ` Coly Li
2022-06-29  7:56       ` Nikhil Kshirsagar
2022-06-29  7:59         ` Coly Li
2022-06-29  8:09         ` Nikhil Kshirsagar
2022-06-29  8:17           ` Coly Li
2022-06-29 10:33             ` Nikhil Kshirsagar
2022-06-29 12:30               ` Nikhil Kshirsagar
2022-06-29 13:03                 ` Coly Li
2022-07-07  9:22                   ` Nikhil Kshirsagar
2022-06-29 13:00               ` Coly Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.