* seeing this stace trace on kernel 5.15
@ 2022-06-28 6:31 Nikhil Kshirsagar
2022-06-28 7:23 ` Nikhil Kshirsagar
2022-06-28 14:37 ` Coly Li
0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-28 6:31 UTC (permalink / raw)
To: Coly Li, linux-bcache
Hi Coly,
I just kicked off a 20gb random readwrite test using fio to a bcache device,
fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
--direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
--name=iops-test-job --eta-newline=1
# lsblk /dev/sdc /dev/nvme0n1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdc 8:32 0 279.4G 0 disk
└─sdc1 8:33 0 60G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
nvme0n1 259:0 0 372.6G 0 disk
└─nvme0n1p1 259:2 0 15G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
and am seeing this trace in the dmesg,
[ 2475.034909] XFS (bcache0): Ending clean mount
[ 2475.036238] xfs filesystem being mounted at
/home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
[ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
[ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
ppid: 2 flags:0x00004000
[ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
[ 2783.155259] Call Trace:
[ 2783.155261] <TASK>
[ 2783.155263] __schedule+0x23d/0x590
[ 2783.155269] schedule+0x4e/0xb0
[ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2783.155294] ? wait_woken+0x70/0x70
[ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
[ 2783.155331] ? __queue_work+0x211/0x480
[ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
[ 2783.155362] process_one_work+0x22b/0x3d0
[ 2783.155366] worker_thread+0x53/0x410
[ 2783.155369] ? process_one_work+0x3d0/0x3d0
[ 2783.155372] kthread+0x12a/0x150
[ 2783.155376] ? set_kthread_struct+0x50/0x50
[ 2783.155379] ret_from_fork+0x22/0x30
[ 2783.155385] </TASK>
[ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
[ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
ppid: 2 flags:0x00004000
[ 2784.133544] Workqueue: events update_writeback_rate [bcache]
[ 2784.133588] Call Trace:
[ 2784.133589] <TASK>
[ 2784.133591] __schedule+0x23d/0x590
[ 2784.133594] schedule+0x4e/0xb0
[ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
[ 2784.133600] down_read+0x43/0x90
[ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
[ 2784.133619] process_one_work+0x22b/0x3d0
[ 2784.133623] worker_thread+0x53/0x410
[ 2784.133626] ? process_one_work+0x3d0/0x3d0
[ 2784.133630] kthread+0x12a/0x150
[ 2784.133633] ? set_kthread_struct+0x50/0x50
[ 2784.133636] ret_from_fork+0x22/0x30
[ 2784.133640] </TASK>
[ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
[ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
ppid: 2 flags:0x00004000
[ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
[ 2785.114720] Call Trace:
[ 2785.114721] <TASK>
[ 2785.114723] __schedule+0x23d/0x590
[ 2785.114726] schedule+0x4e/0xb0
[ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2785.114753] ? wait_woken+0x70/0x70
[ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
[ 2785.114789] ? __queue_work+0x211/0x480
[ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
[ 2785.114818] process_one_work+0x22b/0x3d0
[ 2785.114822] worker_thread+0x53/0x410
[ 2785.114826] ? process_one_work+0x3d0/0x3d0
[ 2785.114829] kthread+0x12a/0x150
[ 2785.114832] ? set_kthread_struct+0x50/0x50
[ 2785.114835] ret_from_fork+0x22/0x30
[ 2785.114839] </TASK>
[ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
[ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
ppid: 2 flags:0x00004000
[ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
[ 2786.091582] Call Trace:
[ 2786.091584] <TASK>
[ 2786.091585] __schedule+0x23d/0x590
[ 2786.091589] schedule+0x4e/0xb0
[ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
[ 2786.091594] down_read+0x43/0x90
[ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
[ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
[ 2786.091626] __submit_bio+0x1a1/0x220
[ 2786.091631] __submit_bio_noacct+0x85/0x1f0
[ 2786.091634] ? mutex_lock+0x13/0x40
[ 2786.091637] submit_bio_noacct+0x4e/0x120
[ 2786.091640] submit_bio+0x4a/0x130
[ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
[ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
[ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
[ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
[ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
[ 2786.092063] process_one_work+0x22b/0x3d0
[ 2786.092067] worker_thread+0x53/0x410
[ 2786.092069] ? process_one_work+0x3d0/0x3d0
[ 2786.092072] kthread+0x12a/0x150
[ 2786.092074] ? set_kthread_struct+0x50/0x50
[ 2786.092077] ret_from_fork+0x22/0x30
[ 2786.092081] </TASK>
[ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
[ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
ppid: 2 flags:0x00004000
[ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
[ 2787.077098] Call Trace:
[ 2787.077099] <TASK>
[ 2787.077101] __schedule+0x23d/0x590
[ 2787.077104] schedule+0x4e/0xb0
[ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2787.077116] ? wait_woken+0x70/0x70
[ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
[ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
[ 2787.077191] process_one_work+0x22b/0x3d0
[ 2787.077195] worker_thread+0x53/0x410
[ 2787.077198] ? process_one_work+0x3d0/0x3d0
[ 2787.077202] kthread+0x12a/0x150
[ 2787.077205] ? set_kthread_struct+0x50/0x50
[ 2787.077207] ret_from_fork+0x22/0x30
[ 2787.077212] </TASK>
[ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
125 seconds.
[ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
ppid: 2 flags:0x00004000
[ 2788.071080] Call Trace:
[ 2788.071081] <TASK>
[ 2788.071082] __schedule+0x23d/0x590
[ 2788.071086] schedule+0x4e/0xb0
[ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
[ 2788.071105] ? del_timer_sync+0x6c/0xb0
[ 2788.071109] down_write+0x43/0x50
[ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
[ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
[ 2788.071158] kthread+0x12a/0x150
[ 2788.071161] ? set_kthread_struct+0x50/0x50
[ 2788.071164] ret_from_fork+0x22/0x30
[ 2788.071168] </TASK>
[ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
[ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2789.014084] task:fio state:D stack: 0 pid: 3123
ppid: 2378 flags:0x00004002
[ 2789.014087] Call Trace:
[ 2789.014089] <TASK>
[ 2789.014090] __schedule+0x23d/0x590
[ 2789.014106] schedule+0x4e/0xb0
[ 2789.014108] io_schedule+0x46/0x70
[ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
[ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
[ 2789.014118] wait_on_page_bit+0x3f/0x50
[ 2789.014120] wait_on_page_writeback+0x26/0x80
[ 2789.014124] __filemap_fdatawait_range+0x97/0x110
[ 2789.014126] file_write_and_wait_range+0xcc/0xf0
[ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
[ 2789.014207] vfs_fsync_range+0x49/0x80
[ 2789.014212] ? __fget_light+0x32/0x80
[ 2789.014217] __x64_sys_fsync+0x38/0x60
[ 2789.014220] do_syscall_64+0x5c/0xc0
[ 2789.014223] ? do_syscall_64+0x69/0xc0
[ 2789.014225] ? do_syscall_64+0x69/0xc0
[ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
[ 2789.014230] ? __x64_sys_write+0x19/0x20
[ 2789.014232] ? do_syscall_64+0x69/0xc0
[ 2789.014234] ? do_syscall_64+0x69/0xc0
[ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
[ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
[ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
[ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
[ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
[ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
[ 2789.014255] </TASK>
[ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
[ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
ppid: 2 flags:0x00004000
[ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
[ 2910.130176] Call Trace:
[ 2910.130177] <TASK>
[ 2910.130179] __schedule+0x23d/0x590
[ 2910.130182] schedule+0x4e/0xb0
[ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2910.130219] ? wait_woken+0x70/0x70
[ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
[ 2910.130255] ? __queue_work+0x211/0x480
[ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
[ 2910.130285] process_one_work+0x22b/0x3d0
[ 2910.130288] worker_thread+0x53/0x410
[ 2910.130292] ? process_one_work+0x3d0/0x3d0
[ 2910.130295] kthread+0x12a/0x150
[ 2910.130298] ? set_kthread_struct+0x50/0x50
[ 2910.130301] ret_from_fork+0x22/0x30
[ 2910.130305] </TASK>
[ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
[ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
ppid: 2 flags:0x00004000
[ 2911.104617] Workqueue: events update_writeback_rate [bcache]
[ 2911.104647] Call Trace:
[ 2911.104648] <TASK>
[ 2911.104649] __schedule+0x23d/0x590
[ 2911.104652] schedule+0x4e/0xb0
[ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
[ 2911.104657] down_read+0x43/0x90
[ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
[ 2911.104676] process_one_work+0x22b/0x3d0
[ 2911.104680] worker_thread+0x53/0x410
[ 2911.104683] ? process_one_work+0x3d0/0x3d0
[ 2911.104687] kthread+0x12a/0x150
[ 2911.104690] ? set_kthread_struct+0x50/0x50
[ 2911.104693] ret_from_fork+0x22/0x30
[ 2911.104697] </TASK>
[ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
[ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
ppid: 2 flags:0x00004000
[ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
[ 2912.087495] Call Trace:
[ 2912.087497] <TASK>
[ 2912.087498] __schedule+0x23d/0x590
[ 2912.087503] schedule+0x4e/0xb0
[ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 2912.087528] ? wait_woken+0x70/0x70
[ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
[ 2912.087565] ? __queue_work+0x211/0x480
[ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
[ 2912.087595] process_one_work+0x22b/0x3d0
[ 2912.087599] worker_thread+0x53/0x410
[ 2912.087602] ? process_one_work+0x3d0/0x3d0
[ 2912.087606] kthread+0x12a/0x150
[ 2912.087609] ? set_kthread_struct+0x50/0x50
[ 2912.087612] ret_from_fork+0x22/0x30
[ 2912.087617] </TASK>
Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
I had also echoed 0 into congested_read_threshold_us,
congested_write_threshold_us.
echo writeback > /sys/block/bcache0/bcache/cache_mode
Regards,
Nikhil.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-28 6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
@ 2022-06-28 7:23 ` Nikhil Kshirsagar
2022-06-28 14:37 ` Coly Li
1 sibling, 0 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-28 7:23 UTC (permalink / raw)
To: Coly Li, linux-bcache
Able to reproduce this very easily each time I run the same fio test -
[ 969.525374] INFO: task kworker/0:6:261 blocked for more than 120 seconds.
[ 969.850975] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 970.117328] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 970.497751] task:kworker/0:6 state:D stack: 0 pid: 261
ppid: 2 flags:0x00004000
[ 970.497768] Workqueue: events update_writeback_rate [bcache]
[ 970.497806] Call Trace:
[ 970.497808] <TASK>
[ 970.497810] __schedule+0x23d/0x590
[ 970.497826] schedule+0x4e/0xb0
[ 970.497828] rwsem_down_read_slowpath+0x32e/0x380
[ 970.497832] down_read+0x43/0x90
[ 970.497834] update_writeback_rate+0x134/0x190 [bcache]
[ 970.497850] process_one_work+0x22b/0x3d0
[ 970.497856] worker_thread+0x53/0x410
[ 970.497859] ? process_one_work+0x3d0/0x3d0
[ 970.497862] kthread+0x12a/0x150
[ 970.497865] ? set_kthread_struct+0x50/0x50
[ 970.497868] ret_from_fork+0x22/0x30
[ 970.497874] </TASK>
[ 970.497885] INFO: task kworker/22:2:326 blocked for more than 121 seconds.
[ 970.831922] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 971.098073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 971.478693] task:kworker/22:2 state:D stack: 0 pid: 326
ppid: 2 flags:0x00004000
[ 971.478696] Workqueue: bcache bch_data_insert_start [bcache]
[ 971.478734] Call Trace:
[ 971.478736] <TASK>
[ 971.478737] __schedule+0x23d/0x590
[ 971.478740] schedule+0x4e/0xb0
[ 971.478742] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 971.478752] ? wait_woken+0x70/0x70
[ 971.478756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 971.478779] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 971.478790] ? __bch_submit_bbio+0x97/0xb0 [bcache]
[ 971.478803] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 971.478816] ? closure_sub+0x94/0xb0 [bcache]
[ 971.478829] process_one_work+0x22b/0x3d0
[ 971.478832] worker_thread+0x53/0x410
[ 971.478836] ? process_one_work+0x3d0/0x3d0
[ 971.478839] kthread+0x12a/0x150
[ 971.478842] ? set_kthread_struct+0x50/0x50
[ 971.478845] ret_from_fork+0x22/0x30
[ 971.478849] </TASK>
[ 971.478895] INFO: task bcache_writebac:2911 blocked for more than
122 seconds.
[ 971.829737] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 972.096191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 972.476825] task:bcache_writebac state:D stack: 0 pid: 2911
ppid: 2 flags:0x00004000
[ 972.476841] Call Trace:
[ 972.476842] <TASK>
[ 972.476843] __schedule+0x23d/0x590
[ 972.476847] schedule+0x4e/0xb0
[ 972.476861] rwsem_down_write_slowpath+0x220/0x3d0
[ 972.476865] ? del_timer_sync+0x6c/0xb0
[ 972.476869] down_write+0x43/0x50
[ 972.476872] bch_writeback_thread+0x78/0x320 [bcache]
[ 972.476891] ? read_dirty+0x5a0/0x5a0 [bcache]
[ 972.476919] kthread+0x12a/0x150
[ 972.476922] ? set_kthread_struct+0x50/0x50
[ 972.476926] ret_from_fork+0x22/0x30
[ 972.476930] </TASK>
[ 972.476933] INFO: task xfsaild/bcache0:2955 blocked for more than
123 seconds.
[ 972.825513] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 973.092507] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 973.473651] task:xfsaild/bcache0 state:D stack: 0 pid: 2955
ppid: 2 flags:0x00004000
[ 973.473653] Call Trace:
[ 973.473654] <TASK>
[ 973.473667] __schedule+0x23d/0x590
[ 973.473669] ? cpumask_next_and+0x24/0x30
[ 973.473674] schedule+0x4e/0xb0
[ 973.473687] rwsem_down_read_slowpath+0x32e/0x380
[ 973.473690] down_read+0x43/0x90
[ 973.473693] cached_dev_write+0x7e/0x480 [bcache]
[ 973.473707] ? recalibrate_cpu_khz+0x10/0x10
[ 973.473712] cached_dev_submit_bio+0x4ef/0x540 [bcache]
[ 973.473725] __submit_bio+0x1a1/0x220
[ 973.473740] ? kmem_cache_alloc+0x1ab/0x2e0
[ 973.473744] ? mempool_alloc_slab+0x17/0x20
[ 973.473747] __submit_bio_noacct+0x85/0x1f0
[ 973.473750] submit_bio_noacct+0x4e/0x120
[ 973.473753] ? bio_add_page+0x68/0x90
[ 973.473757] submit_bio+0x4a/0x130
[ 973.473761] xfs_buf_ioapply_map+0x205/0x290 [xfs]
[ 973.473860] _xfs_buf_ioapply+0xe2/0x1b0 [xfs]
[ 973.473936] ? wake_up_q+0x90/0x90
[ 973.473940] __xfs_buf_submit+0x6d/0x1d0 [xfs]
[ 973.474016] xfs_buf_delwri_submit_buffers+0xd9/0x200 [xfs]
[ 973.474093] xfs_buf_delwri_submit_nowait+0x10/0x20 [xfs]
[ 973.474179] xfsaild_push+0x185/0x890 [xfs]
[ 973.474260] ? del_timer_sync+0x6c/0xb0
[ 973.474262] xfsaild+0xf2/0x210 [xfs]
[ 973.474343] ? xfsaild_push+0x890/0x890 [xfs]
[ 973.474437] kthread+0x12a/0x150
[ 973.474440] ? set_kthread_struct+0x50/0x50
[ 973.474443] ret_from_fork+0x22/0x30
[ 973.474447] </TASK>
[ 973.474483] INFO: task fio:2965 blocked for more than 124 seconds.
[ 973.771249] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 974.038183] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 974.419303] task:fio state:D stack: 0 pid: 2965
ppid: 2613 flags:0x00004002
[ 974.419307] Call Trace:
[ 974.419308] <TASK>
[ 974.419310] __schedule+0x23d/0x590
[ 974.419325] schedule+0x4e/0xb0
[ 974.419327] io_schedule+0x46/0x70
[ 974.419329] wait_on_page_bit_common+0x10c/0x3d0
[ 974.419332] ? filemap_invalidate_unlock_two+0x40/0x40
[ 974.419335] wait_on_page_bit+0x3f/0x50
[ 974.419337] wait_on_page_writeback+0x26/0x80
[ 974.419341] __filemap_fdatawait_range+0x97/0x110
[ 974.419344] file_write_and_wait_range+0xcc/0xf0
[ 974.419347] xfs_file_fsync+0x5b/0x250 [xfs]
[ 974.419439] vfs_fsync_range+0x49/0x80
[ 974.419443] ? __fget_light+0x32/0x80
[ 974.419448] __x64_sys_fsync+0x38/0x60
[ 974.419451] do_syscall_64+0x5c/0xc0
[ 974.419454] ? fput+0x13/0x20
[ 974.419457] ? ksys_write+0xce/0xe0
[ 974.419459] ? exit_to_user_mode_prepare+0x37/0xb0
[ 974.419463] ? syscall_exit_to_user_mode+0x27/0x50
[ 974.419466] ? __x64_sys_write+0x19/0x20
[ 974.419468] ? do_syscall_64+0x69/0xc0
[ 974.419470] ? do_syscall_64+0x69/0xc0
[ 974.419471] ? do_syscall_64+0x69/0xc0
[ 974.419473] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 974.419477] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 974.419480] RIP: 0033:0x7f91f1c76a5b
[ 974.419483] RSP: 002b:00007fff7fc22340 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 974.419485] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f91f1c76a5b
[ 974.419487] RDX: 0000000000000000 RSI: 0000564d2447bf40 RDI: 0000000000000006
[ 974.419488] RBP: 0000000000001000 R08: 0000000000000000 R09: 0000564d2447bf40
[ 974.419489] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f91e781f000
[ 974.419491] R13: 0000000000000000 R14: 00007f91eefca2f0 R15: 0000000500000000
[ 974.419493] </TASK>
[ 974.419534] INFO: task kworker/16:0:3008 blocked for more than 125 seconds.
[ 974.758685] Not tainted 5.15.0-40-generic #43-Ubuntu
[ 975.025650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 975.407087] task:kworker/16:0 state:D stack: 0 pid: 3008
ppid: 2 flags:0x00004000
[ 975.407103] Workqueue: bcache bch_data_insert_start [bcache]
[ 975.407133] Call Trace:
[ 975.407135] <TASK>
[ 975.407136] __schedule+0x23d/0x590
[ 975.407140] schedule+0x4e/0xb0
[ 975.407143] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 975.407153] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 975.407166] ? wait_woken+0x70/0x70
[ 975.407169] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 975.407180] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 975.407191] ? __wake_up_common_lock+0x8a/0xc0
[ 975.407193] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 975.407207] ? closure_sub+0x94/0xb0 [bcache]
[ 975.407220] process_one_work+0x22b/0x3d0
[ 975.407224] worker_thread+0x53/0x410
[ 975.407228] ? process_one_work+0x3d0/0x3d0
[ 975.407231] kthread+0x12a/0x150
[ 975.407234] ? set_kthread_struct+0x50/0x50
[ 975.407237] ret_from_fork+0x22/0x30
[ 975.407242] </TASK>
On Tue, 28 Jun 2022 at 12:01, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> Hi Coly,
>
> I just kicked off a 20gb random readwrite test using fio to a bcache device,
>
> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> --name=iops-test-job --eta-newline=1
>
> # lsblk /dev/sdc /dev/nvme0n1
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> sdc 8:32 0 279.4G 0 disk
> └─sdc1 8:33 0 60G 0 part
> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> nvme0n1 259:0 0 372.6G 0 disk
> └─nvme0n1p1 259:2 0 15G 0 part
> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
>
> and am seeing this trace in the dmesg,
>
> [ 2475.034909] XFS (bcache0): Ending clean mount
> [ 2475.036238] xfs filesystem being mounted at
> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
>
> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
> ppid: 2 flags:0x00004000
> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2783.155259] Call Trace:
> [ 2783.155261] <TASK>
> [ 2783.155263] __schedule+0x23d/0x590
> [ 2783.155269] schedule+0x4e/0xb0
> [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2783.155294] ? wait_woken+0x70/0x70
> [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2783.155331] ? __queue_work+0x211/0x480
> [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
> [ 2783.155362] process_one_work+0x22b/0x3d0
> [ 2783.155366] worker_thread+0x53/0x410
> [ 2783.155369] ? process_one_work+0x3d0/0x3d0
> [ 2783.155372] kthread+0x12a/0x150
> [ 2783.155376] ? set_kthread_struct+0x50/0x50
> [ 2783.155379] ret_from_fork+0x22/0x30
> [ 2783.155385] </TASK>
> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
> ppid: 2 flags:0x00004000
> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> [ 2784.133588] Call Trace:
> [ 2784.133589] <TASK>
> [ 2784.133591] __schedule+0x23d/0x590
> [ 2784.133594] schedule+0x4e/0xb0
> [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
> [ 2784.133600] down_read+0x43/0x90
> [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
> [ 2784.133619] process_one_work+0x22b/0x3d0
> [ 2784.133623] worker_thread+0x53/0x410
> [ 2784.133626] ? process_one_work+0x3d0/0x3d0
> [ 2784.133630] kthread+0x12a/0x150
> [ 2784.133633] ? set_kthread_struct+0x50/0x50
> [ 2784.133636] ret_from_fork+0x22/0x30
> [ 2784.133640] </TASK>
> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
> ppid: 2 flags:0x00004000
> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2785.114720] Call Trace:
> [ 2785.114721] <TASK>
> [ 2785.114723] __schedule+0x23d/0x590
> [ 2785.114726] schedule+0x4e/0xb0
> [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2785.114753] ? wait_woken+0x70/0x70
> [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2785.114789] ? __queue_work+0x211/0x480
> [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
> [ 2785.114818] process_one_work+0x22b/0x3d0
> [ 2785.114822] worker_thread+0x53/0x410
> [ 2785.114826] ? process_one_work+0x3d0/0x3d0
> [ 2785.114829] kthread+0x12a/0x150
> [ 2785.114832] ? set_kthread_struct+0x50/0x50
> [ 2785.114835] ret_from_fork+0x22/0x30
> [ 2785.114839] </TASK>
> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
> ppid: 2 flags:0x00004000
> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> [ 2786.091582] Call Trace:
> [ 2786.091584] <TASK>
> [ 2786.091585] __schedule+0x23d/0x590
> [ 2786.091589] schedule+0x4e/0xb0
> [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
> [ 2786.091594] down_read+0x43/0x90
> [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
> [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
> [ 2786.091626] __submit_bio+0x1a1/0x220
> [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
> [ 2786.091634] ? mutex_lock+0x13/0x40
> [ 2786.091637] submit_bio_noacct+0x4e/0x120
> [ 2786.091640] submit_bio+0x4a/0x130
> [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
> [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
> [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
> [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
> [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
> [ 2786.092063] process_one_work+0x22b/0x3d0
> [ 2786.092067] worker_thread+0x53/0x410
> [ 2786.092069] ? process_one_work+0x3d0/0x3d0
> [ 2786.092072] kthread+0x12a/0x150
> [ 2786.092074] ? set_kthread_struct+0x50/0x50
> [ 2786.092077] ret_from_fork+0x22/0x30
> [ 2786.092081] </TASK>
> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
> ppid: 2 flags:0x00004000
> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2787.077098] Call Trace:
> [ 2787.077099] <TASK>
> [ 2787.077101] __schedule+0x23d/0x590
> [ 2787.077104] schedule+0x4e/0xb0
> [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2787.077116] ? wait_woken+0x70/0x70
> [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
> [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
> [ 2787.077191] process_one_work+0x22b/0x3d0
> [ 2787.077195] worker_thread+0x53/0x410
> [ 2787.077198] ? process_one_work+0x3d0/0x3d0
> [ 2787.077202] kthread+0x12a/0x150
> [ 2787.077205] ? set_kthread_struct+0x50/0x50
> [ 2787.077207] ret_from_fork+0x22/0x30
> [ 2787.077212] </TASK>
> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> 125 seconds.
> [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
> ppid: 2 flags:0x00004000
> [ 2788.071080] Call Trace:
> [ 2788.071081] <TASK>
> [ 2788.071082] __schedule+0x23d/0x590
> [ 2788.071086] schedule+0x4e/0xb0
> [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
> [ 2788.071105] ? del_timer_sync+0x6c/0xb0
> [ 2788.071109] down_write+0x43/0x50
> [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
> [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
> [ 2788.071158] kthread+0x12a/0x150
> [ 2788.071161] ? set_kthread_struct+0x50/0x50
> [ 2788.071164] ret_from_fork+0x22/0x30
> [ 2788.071168] </TASK>
> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2789.014084] task:fio state:D stack: 0 pid: 3123
> ppid: 2378 flags:0x00004002
> [ 2789.014087] Call Trace:
> [ 2789.014089] <TASK>
> [ 2789.014090] __schedule+0x23d/0x590
> [ 2789.014106] schedule+0x4e/0xb0
> [ 2789.014108] io_schedule+0x46/0x70
> [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
> [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
> [ 2789.014118] wait_on_page_bit+0x3f/0x50
> [ 2789.014120] wait_on_page_writeback+0x26/0x80
> [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
> [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
> [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
> [ 2789.014207] vfs_fsync_range+0x49/0x80
> [ 2789.014212] ? __fget_light+0x32/0x80
> [ 2789.014217] __x64_sys_fsync+0x38/0x60
> [ 2789.014220] do_syscall_64+0x5c/0xc0
> [ 2789.014223] ? do_syscall_64+0x69/0xc0
> [ 2789.014225] ? do_syscall_64+0x69/0xc0
> [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
> [ 2789.014230] ? __x64_sys_write+0x19/0x20
> [ 2789.014232] ? do_syscall_64+0x69/0xc0
> [ 2789.014234] ? do_syscall_64+0x69/0xc0
> [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000004a
> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> [ 2789.014255] </TASK>
> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
> ppid: 2 flags:0x00004000
> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2910.130176] Call Trace:
> [ 2910.130177] <TASK>
> [ 2910.130179] __schedule+0x23d/0x590
> [ 2910.130182] schedule+0x4e/0xb0
> [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2910.130219] ? wait_woken+0x70/0x70
> [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2910.130255] ? __queue_work+0x211/0x480
> [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
> [ 2910.130285] process_one_work+0x22b/0x3d0
> [ 2910.130288] worker_thread+0x53/0x410
> [ 2910.130292] ? process_one_work+0x3d0/0x3d0
> [ 2910.130295] kthread+0x12a/0x150
> [ 2910.130298] ? set_kthread_struct+0x50/0x50
> [ 2910.130301] ret_from_fork+0x22/0x30
> [ 2910.130305] </TASK>
> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
> ppid: 2 flags:0x00004000
> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> [ 2911.104647] Call Trace:
> [ 2911.104648] <TASK>
> [ 2911.104649] __schedule+0x23d/0x590
> [ 2911.104652] schedule+0x4e/0xb0
> [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
> [ 2911.104657] down_read+0x43/0x90
> [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
> [ 2911.104676] process_one_work+0x22b/0x3d0
> [ 2911.104680] worker_thread+0x53/0x410
> [ 2911.104683] ? process_one_work+0x3d0/0x3d0
> [ 2911.104687] kthread+0x12a/0x150
> [ 2911.104690] ? set_kthread_struct+0x50/0x50
> [ 2911.104693] ret_from_fork+0x22/0x30
> [ 2911.104697] </TASK>
> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
> ppid: 2 flags:0x00004000
> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2912.087495] Call Trace:
> [ 2912.087497] <TASK>
> [ 2912.087498] __schedule+0x23d/0x590
> [ 2912.087503] schedule+0x4e/0xb0
> [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2912.087528] ? wait_woken+0x70/0x70
> [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2912.087565] ? __queue_work+0x211/0x480
> [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
> [ 2912.087595] process_one_work+0x22b/0x3d0
> [ 2912.087599] worker_thread+0x53/0x410
> [ 2912.087602] ? process_one_work+0x3d0/0x3d0
> [ 2912.087606] kthread+0x12a/0x150
> [ 2912.087609] ? set_kthread_struct+0x50/0x50
> [ 2912.087612] ret_from_fork+0x22/0x30
> [ 2912.087617] </TASK>
>
> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>
> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>
> I had also echoed 0 into congested_read_threshold_us,
> congested_write_threshold_us.
>
> echo writeback > /sys/block/bcache0/bcache/cache_mode
>
> Regards,
> Nikhil.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-28 6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
2022-06-28 7:23 ` Nikhil Kshirsagar
@ 2022-06-28 14:37 ` Coly Li
2022-06-29 7:02 ` Nikhil Kshirsagar
1 sibling, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-28 14:37 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> Hi Coly,
>
> I just kicked off a 20gb random readwrite test using fio to a bcache device,
>
> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> --name=iops-test-job --eta-newline=1
>
> # lsblk /dev/sdc /dev/nvme0n1
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> sdc 8:32 0 279.4G 0 disk
> └─sdc1 8:33 0 60G 0 part
> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> nvme0n1 259:0 0 372.6G 0 disk
> └─nvme0n1p1 259:2 0 15G 0 part
> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
>
> and am seeing this trace in the dmesg,
>
> [ 2475.034909] XFS (bcache0): Ending clean mount
> [ 2475.036238] xfs filesystem being mounted at
> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
>
> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
> ppid: 2 flags:0x00004000
> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2783.155259] Call Trace:
> [ 2783.155261] <TASK>
> [ 2783.155263] __schedule+0x23d/0x590
> [ 2783.155269] schedule+0x4e/0xb0
> [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2783.155294] ? wait_woken+0x70/0x70
> [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2783.155331] ? __queue_work+0x211/0x480
> [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
> [ 2783.155362] process_one_work+0x22b/0x3d0
> [ 2783.155366] worker_thread+0x53/0x410
> [ 2783.155369] ? process_one_work+0x3d0/0x3d0
> [ 2783.155372] kthread+0x12a/0x150
> [ 2783.155376] ? set_kthread_struct+0x50/0x50
> [ 2783.155379] ret_from_fork+0x22/0x30
> [ 2783.155385] </TASK>
> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
> ppid: 2 flags:0x00004000
> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> [ 2784.133588] Call Trace:
> [ 2784.133589] <TASK>
> [ 2784.133591] __schedule+0x23d/0x590
> [ 2784.133594] schedule+0x4e/0xb0
> [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
> [ 2784.133600] down_read+0x43/0x90
> [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
> [ 2784.133619] process_one_work+0x22b/0x3d0
> [ 2784.133623] worker_thread+0x53/0x410
> [ 2784.133626] ? process_one_work+0x3d0/0x3d0
> [ 2784.133630] kthread+0x12a/0x150
> [ 2784.133633] ? set_kthread_struct+0x50/0x50
> [ 2784.133636] ret_from_fork+0x22/0x30
> [ 2784.133640] </TASK>
> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
> ppid: 2 flags:0x00004000
> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2785.114720] Call Trace:
> [ 2785.114721] <TASK>
> [ 2785.114723] __schedule+0x23d/0x590
> [ 2785.114726] schedule+0x4e/0xb0
> [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2785.114753] ? wait_woken+0x70/0x70
> [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2785.114789] ? __queue_work+0x211/0x480
> [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
> [ 2785.114818] process_one_work+0x22b/0x3d0
> [ 2785.114822] worker_thread+0x53/0x410
> [ 2785.114826] ? process_one_work+0x3d0/0x3d0
> [ 2785.114829] kthread+0x12a/0x150
> [ 2785.114832] ? set_kthread_struct+0x50/0x50
> [ 2785.114835] ret_from_fork+0x22/0x30
> [ 2785.114839] </TASK>
> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
> ppid: 2 flags:0x00004000
> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> [ 2786.091582] Call Trace:
> [ 2786.091584] <TASK>
> [ 2786.091585] __schedule+0x23d/0x590
> [ 2786.091589] schedule+0x4e/0xb0
> [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
> [ 2786.091594] down_read+0x43/0x90
> [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
> [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
> [ 2786.091626] __submit_bio+0x1a1/0x220
> [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
> [ 2786.091634] ? mutex_lock+0x13/0x40
> [ 2786.091637] submit_bio_noacct+0x4e/0x120
> [ 2786.091640] submit_bio+0x4a/0x130
> [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
> [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
> [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
> [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
> [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
> [ 2786.092063] process_one_work+0x22b/0x3d0
> [ 2786.092067] worker_thread+0x53/0x410
> [ 2786.092069] ? process_one_work+0x3d0/0x3d0
> [ 2786.092072] kthread+0x12a/0x150
> [ 2786.092074] ? set_kthread_struct+0x50/0x50
> [ 2786.092077] ret_from_fork+0x22/0x30
> [ 2786.092081] </TASK>
> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
> ppid: 2 flags:0x00004000
> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2787.077098] Call Trace:
> [ 2787.077099] <TASK>
> [ 2787.077101] __schedule+0x23d/0x590
> [ 2787.077104] schedule+0x4e/0xb0
> [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2787.077116] ? wait_woken+0x70/0x70
> [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
> [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
> [ 2787.077191] process_one_work+0x22b/0x3d0
> [ 2787.077195] worker_thread+0x53/0x410
> [ 2787.077198] ? process_one_work+0x3d0/0x3d0
> [ 2787.077202] kthread+0x12a/0x150
> [ 2787.077205] ? set_kthread_struct+0x50/0x50
> [ 2787.077207] ret_from_fork+0x22/0x30
> [ 2787.077212] </TASK>
> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> 125 seconds.
> [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
> ppid: 2 flags:0x00004000
> [ 2788.071080] Call Trace:
> [ 2788.071081] <TASK>
> [ 2788.071082] __schedule+0x23d/0x590
> [ 2788.071086] schedule+0x4e/0xb0
> [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
> [ 2788.071105] ? del_timer_sync+0x6c/0xb0
> [ 2788.071109] down_write+0x43/0x50
> [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
> [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
> [ 2788.071158] kthread+0x12a/0x150
> [ 2788.071161] ? set_kthread_struct+0x50/0x50
> [ 2788.071164] ret_from_fork+0x22/0x30
> [ 2788.071168] </TASK>
> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2789.014084] task:fio state:D stack: 0 pid: 3123
> ppid: 2378 flags:0x00004002
> [ 2789.014087] Call Trace:
> [ 2789.014089] <TASK>
> [ 2789.014090] __schedule+0x23d/0x590
> [ 2789.014106] schedule+0x4e/0xb0
> [ 2789.014108] io_schedule+0x46/0x70
> [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
> [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
> [ 2789.014118] wait_on_page_bit+0x3f/0x50
> [ 2789.014120] wait_on_page_writeback+0x26/0x80
> [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
> [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
> [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
> [ 2789.014207] vfs_fsync_range+0x49/0x80
> [ 2789.014212] ? __fget_light+0x32/0x80
> [ 2789.014217] __x64_sys_fsync+0x38/0x60
> [ 2789.014220] do_syscall_64+0x5c/0xc0
> [ 2789.014223] ? do_syscall_64+0x69/0xc0
> [ 2789.014225] ? do_syscall_64+0x69/0xc0
> [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
> [ 2789.014230] ? __x64_sys_write+0x19/0x20
> [ 2789.014232] ? do_syscall_64+0x69/0xc0
> [ 2789.014234] ? do_syscall_64+0x69/0xc0
> [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000004a
> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> [ 2789.014255] </TASK>
> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
> ppid: 2 flags:0x00004000
> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2910.130176] Call Trace:
> [ 2910.130177] <TASK>
> [ 2910.130179] __schedule+0x23d/0x590
> [ 2910.130182] schedule+0x4e/0xb0
> [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2910.130219] ? wait_woken+0x70/0x70
> [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2910.130255] ? __queue_work+0x211/0x480
> [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
> [ 2910.130285] process_one_work+0x22b/0x3d0
> [ 2910.130288] worker_thread+0x53/0x410
> [ 2910.130292] ? process_one_work+0x3d0/0x3d0
> [ 2910.130295] kthread+0x12a/0x150
> [ 2910.130298] ? set_kthread_struct+0x50/0x50
> [ 2910.130301] ret_from_fork+0x22/0x30
> [ 2910.130305] </TASK>
> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
> ppid: 2 flags:0x00004000
> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> [ 2911.104647] Call Trace:
> [ 2911.104648] <TASK>
> [ 2911.104649] __schedule+0x23d/0x590
> [ 2911.104652] schedule+0x4e/0xb0
> [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
> [ 2911.104657] down_read+0x43/0x90
> [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
> [ 2911.104676] process_one_work+0x22b/0x3d0
> [ 2911.104680] worker_thread+0x53/0x410
> [ 2911.104683] ? process_one_work+0x3d0/0x3d0
> [ 2911.104687] kthread+0x12a/0x150
> [ 2911.104690] ? set_kthread_struct+0x50/0x50
> [ 2911.104693] ret_from_fork+0x22/0x30
> [ 2911.104697] </TASK>
> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
> ppid: 2 flags:0x00004000
> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> [ 2912.087495] Call Trace:
> [ 2912.087497] <TASK>
> [ 2912.087498] __schedule+0x23d/0x590
> [ 2912.087503] schedule+0x4e/0xb0
> [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> [ 2912.087528] ? wait_woken+0x70/0x70
> [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
> [ 2912.087565] ? __queue_work+0x211/0x480
> [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
> [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
> [ 2912.087595] process_one_work+0x22b/0x3d0
> [ 2912.087599] worker_thread+0x53/0x410
> [ 2912.087602] ? process_one_work+0x3d0/0x3d0
> [ 2912.087606] kthread+0x12a/0x150
> [ 2912.087609] ? set_kthread_struct+0x50/0x50
> [ 2912.087612] ret_from_fork+0x22/0x30
> [ 2912.087617] </TASK>
>
> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>
> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>
> I had also echoed 0 into congested_read_threshold_us,
> congested_write_threshold_us.
>
> echo writeback > /sys/block/bcache0/bcache/cache_mode
Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
Coly Li
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-28 14:37 ` Coly Li
@ 2022-06-29 7:02 ` Nikhil Kshirsagar
2022-06-29 7:50 ` Coly Li
0 siblings, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 7:02 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coly,
I see the same bug on upstream kernel when I tried with
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/ (
cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
Reads seem to trigger it, not writes. So this test triggered it -
fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
--ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
--rw=randread --group_reporting=1
https://pastebin.com/KyVSfnik has all the details.
Regards,
Nikhil.
On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I just kicked off a 20gb random readwrite test using fio to a bcache device,
> >
> > fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> > --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> > --name=iops-test-job --eta-newline=1
> >
> > # lsblk /dev/sdc /dev/nvme0n1
> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> > sdc 8:32 0 279.4G 0 disk
> > └─sdc1 8:33 0 60G 0 part
> > └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> > nvme0n1 259:0 0 372.6G 0 disk
> > └─nvme0n1p1 259:2 0 15G 0 part
> > └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> >
> > and am seeing this trace in the dmesg,
> >
> > [ 2475.034909] XFS (bcache0): Ending clean mount
> > [ 2475.036238] xfs filesystem being mounted at
> > /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> >
> > [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> > [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
> > ppid: 2 flags:0x00004000
> > [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2783.155259] Call Trace:
> > [ 2783.155261] <TASK>
> > [ 2783.155263] __schedule+0x23d/0x590
> > [ 2783.155269] schedule+0x4e/0xb0
> > [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2783.155294] ? wait_woken+0x70/0x70
> > [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2783.155331] ? __queue_work+0x211/0x480
> > [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
> > [ 2783.155362] process_one_work+0x22b/0x3d0
> > [ 2783.155366] worker_thread+0x53/0x410
> > [ 2783.155369] ? process_one_work+0x3d0/0x3d0
> > [ 2783.155372] kthread+0x12a/0x150
> > [ 2783.155376] ? set_kthread_struct+0x50/0x50
> > [ 2783.155379] ret_from_fork+0x22/0x30
> > [ 2783.155385] </TASK>
> > [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> > [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
> > ppid: 2 flags:0x00004000
> > [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> > [ 2784.133588] Call Trace:
> > [ 2784.133589] <TASK>
> > [ 2784.133591] __schedule+0x23d/0x590
> > [ 2784.133594] schedule+0x4e/0xb0
> > [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
> > [ 2784.133600] down_read+0x43/0x90
> > [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
> > [ 2784.133619] process_one_work+0x22b/0x3d0
> > [ 2784.133623] worker_thread+0x53/0x410
> > [ 2784.133626] ? process_one_work+0x3d0/0x3d0
> > [ 2784.133630] kthread+0x12a/0x150
> > [ 2784.133633] ? set_kthread_struct+0x50/0x50
> > [ 2784.133636] ret_from_fork+0x22/0x30
> > [ 2784.133640] </TASK>
> > [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> > [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
> > ppid: 2 flags:0x00004000
> > [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2785.114720] Call Trace:
> > [ 2785.114721] <TASK>
> > [ 2785.114723] __schedule+0x23d/0x590
> > [ 2785.114726] schedule+0x4e/0xb0
> > [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2785.114753] ? wait_woken+0x70/0x70
> > [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2785.114789] ? __queue_work+0x211/0x480
> > [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
> > [ 2785.114818] process_one_work+0x22b/0x3d0
> > [ 2785.114822] worker_thread+0x53/0x410
> > [ 2785.114826] ? process_one_work+0x3d0/0x3d0
> > [ 2785.114829] kthread+0x12a/0x150
> > [ 2785.114832] ? set_kthread_struct+0x50/0x50
> > [ 2785.114835] ret_from_fork+0x22/0x30
> > [ 2785.114839] </TASK>
> > [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> > [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
> > ppid: 2 flags:0x00004000
> > [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> > [ 2786.091582] Call Trace:
> > [ 2786.091584] <TASK>
> > [ 2786.091585] __schedule+0x23d/0x590
> > [ 2786.091589] schedule+0x4e/0xb0
> > [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
> > [ 2786.091594] down_read+0x43/0x90
> > [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
> > [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
> > [ 2786.091626] __submit_bio+0x1a1/0x220
> > [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
> > [ 2786.091634] ? mutex_lock+0x13/0x40
> > [ 2786.091637] submit_bio_noacct+0x4e/0x120
> > [ 2786.091640] submit_bio+0x4a/0x130
> > [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
> > [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
> > [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
> > [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
> > [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
> > [ 2786.092063] process_one_work+0x22b/0x3d0
> > [ 2786.092067] worker_thread+0x53/0x410
> > [ 2786.092069] ? process_one_work+0x3d0/0x3d0
> > [ 2786.092072] kthread+0x12a/0x150
> > [ 2786.092074] ? set_kthread_struct+0x50/0x50
> > [ 2786.092077] ret_from_fork+0x22/0x30
> > [ 2786.092081] </TASK>
> > [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> > [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
> > ppid: 2 flags:0x00004000
> > [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2787.077098] Call Trace:
> > [ 2787.077099] <TASK>
> > [ 2787.077101] __schedule+0x23d/0x590
> > [ 2787.077104] schedule+0x4e/0xb0
> > [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2787.077116] ? wait_woken+0x70/0x70
> > [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
> > [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
> > [ 2787.077191] process_one_work+0x22b/0x3d0
> > [ 2787.077195] worker_thread+0x53/0x410
> > [ 2787.077198] ? process_one_work+0x3d0/0x3d0
> > [ 2787.077202] kthread+0x12a/0x150
> > [ 2787.077205] ? set_kthread_struct+0x50/0x50
> > [ 2787.077207] ret_from_fork+0x22/0x30
> > [ 2787.077212] </TASK>
> > [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> > 125 seconds.
> > [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
> > ppid: 2 flags:0x00004000
> > [ 2788.071080] Call Trace:
> > [ 2788.071081] <TASK>
> > [ 2788.071082] __schedule+0x23d/0x590
> > [ 2788.071086] schedule+0x4e/0xb0
> > [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
> > [ 2788.071105] ? del_timer_sync+0x6c/0xb0
> > [ 2788.071109] down_write+0x43/0x50
> > [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
> > [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
> > [ 2788.071158] kthread+0x12a/0x150
> > [ 2788.071161] ? set_kthread_struct+0x50/0x50
> > [ 2788.071164] ret_from_fork+0x22/0x30
> > [ 2788.071168] </TASK>
> > [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> > [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2789.014084] task:fio state:D stack: 0 pid: 3123
> > ppid: 2378 flags:0x00004002
> > [ 2789.014087] Call Trace:
> > [ 2789.014089] <TASK>
> > [ 2789.014090] __schedule+0x23d/0x590
> > [ 2789.014106] schedule+0x4e/0xb0
> > [ 2789.014108] io_schedule+0x46/0x70
> > [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
> > [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
> > [ 2789.014118] wait_on_page_bit+0x3f/0x50
> > [ 2789.014120] wait_on_page_writeback+0x26/0x80
> > [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
> > [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
> > [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
> > [ 2789.014207] vfs_fsync_range+0x49/0x80
> > [ 2789.014212] ? __fget_light+0x32/0x80
> > [ 2789.014217] __x64_sys_fsync+0x38/0x60
> > [ 2789.014220] do_syscall_64+0x5c/0xc0
> > [ 2789.014223] ? do_syscall_64+0x69/0xc0
> > [ 2789.014225] ? do_syscall_64+0x69/0xc0
> > [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
> > [ 2789.014230] ? __x64_sys_write+0x19/0x20
> > [ 2789.014232] ? do_syscall_64+0x69/0xc0
> > [ 2789.014234] ? do_syscall_64+0x69/0xc0
> > [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> > [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> > [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> > 000000000000004a
> > [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> > [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> > [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> > [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> > [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> > [ 2789.014255] </TASK>
> > [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> > [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
> > ppid: 2 flags:0x00004000
> > [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2910.130176] Call Trace:
> > [ 2910.130177] <TASK>
> > [ 2910.130179] __schedule+0x23d/0x590
> > [ 2910.130182] schedule+0x4e/0xb0
> > [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2910.130219] ? wait_woken+0x70/0x70
> > [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2910.130255] ? __queue_work+0x211/0x480
> > [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
> > [ 2910.130285] process_one_work+0x22b/0x3d0
> > [ 2910.130288] worker_thread+0x53/0x410
> > [ 2910.130292] ? process_one_work+0x3d0/0x3d0
> > [ 2910.130295] kthread+0x12a/0x150
> > [ 2910.130298] ? set_kthread_struct+0x50/0x50
> > [ 2910.130301] ret_from_fork+0x22/0x30
> > [ 2910.130305] </TASK>
> > [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> > [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
> > ppid: 2 flags:0x00004000
> > [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> > [ 2911.104647] Call Trace:
> > [ 2911.104648] <TASK>
> > [ 2911.104649] __schedule+0x23d/0x590
> > [ 2911.104652] schedule+0x4e/0xb0
> > [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
> > [ 2911.104657] down_read+0x43/0x90
> > [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
> > [ 2911.104676] process_one_work+0x22b/0x3d0
> > [ 2911.104680] worker_thread+0x53/0x410
> > [ 2911.104683] ? process_one_work+0x3d0/0x3d0
> > [ 2911.104687] kthread+0x12a/0x150
> > [ 2911.104690] ? set_kthread_struct+0x50/0x50
> > [ 2911.104693] ret_from_fork+0x22/0x30
> > [ 2911.104697] </TASK>
> > [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> > [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
> > [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
> > ppid: 2 flags:0x00004000
> > [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> > [ 2912.087495] Call Trace:
> > [ 2912.087497] <TASK>
> > [ 2912.087498] __schedule+0x23d/0x590
> > [ 2912.087503] schedule+0x4e/0xb0
> > [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > [ 2912.087528] ? wait_woken+0x70/0x70
> > [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
> > [ 2912.087565] ? __queue_work+0x211/0x480
> > [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
> > [ 2912.087595] process_one_work+0x22b/0x3d0
> > [ 2912.087599] worker_thread+0x53/0x410
> > [ 2912.087602] ? process_one_work+0x3d0/0x3d0
> > [ 2912.087606] kthread+0x12a/0x150
> > [ 2912.087609] ? set_kthread_struct+0x50/0x50
> > [ 2912.087612] ret_from_fork+0x22/0x30
> > [ 2912.087617] </TASK>
> >
> > Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >
> > echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >
> > I had also echoed 0 into congested_read_threshold_us,
> > congested_write_threshold_us.
> >
> > echo writeback > /sys/block/bcache0/bcache/cache_mode
>
> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>
> Coly Li
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 7:02 ` Nikhil Kshirsagar
@ 2022-06-29 7:50 ` Coly Li
2022-06-29 7:56 ` Nikhil Kshirsagar
0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29 7:50 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> Hi Coly,
>
> I see the same bug on upstream kernel when I tried with
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/ (
> cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
>
> Reads seem to trigger it, not writes. So this test triggered it -
>
> fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> --rw=randread --group_reporting=1
>
> https://pastebin.com/KyVSfnik has all the details.
OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
Thanks.
Coly Li
>
> Regards,
> Nikhil.
>
> On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
>>
>>
>>
>>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>>
>>> Hi Coly,
>>>
>>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
>>>
>>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
>>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
>>> --name=iops-test-job --eta-newline=1
>>>
>>> # lsblk /dev/sdc /dev/nvme0n1
>>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
>>> sdc 8:32 0 279.4G 0 disk
>>> └─sdc1 8:33 0 60G 0 part
>>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
>>> nvme0n1 259:0 0 372.6G 0 disk
>>> └─nvme0n1p1 259:2 0 15G 0 part
>>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
>>>
>>> and am seeing this trace in the dmesg,
>>>
>>> [ 2475.034909] XFS (bcache0): Ending clean mount
>>> [ 2475.036238] xfs filesystem being mounted at
>>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
>>>
>>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
>>> [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
>>> ppid: 2 flags:0x00004000
>>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2783.155259] Call Trace:
>>> [ 2783.155261] <TASK>
>>> [ 2783.155263] __schedule+0x23d/0x590
>>> [ 2783.155269] schedule+0x4e/0xb0
>>> [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2783.155294] ? wait_woken+0x70/0x70
>>> [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2783.155331] ? __queue_work+0x211/0x480
>>> [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2783.155362] process_one_work+0x22b/0x3d0
>>> [ 2783.155366] worker_thread+0x53/0x410
>>> [ 2783.155369] ? process_one_work+0x3d0/0x3d0
>>> [ 2783.155372] kthread+0x12a/0x150
>>> [ 2783.155376] ? set_kthread_struct+0x50/0x50
>>> [ 2783.155379] ret_from_fork+0x22/0x30
>>> [ 2783.155385] </TASK>
>>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
>>> [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
>>> ppid: 2 flags:0x00004000
>>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
>>> [ 2784.133588] Call Trace:
>>> [ 2784.133589] <TASK>
>>> [ 2784.133591] __schedule+0x23d/0x590
>>> [ 2784.133594] schedule+0x4e/0xb0
>>> [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2784.133600] down_read+0x43/0x90
>>> [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
>>> [ 2784.133619] process_one_work+0x22b/0x3d0
>>> [ 2784.133623] worker_thread+0x53/0x410
>>> [ 2784.133626] ? process_one_work+0x3d0/0x3d0
>>> [ 2784.133630] kthread+0x12a/0x150
>>> [ 2784.133633] ? set_kthread_struct+0x50/0x50
>>> [ 2784.133636] ret_from_fork+0x22/0x30
>>> [ 2784.133640] </TASK>
>>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
>>> [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
>>> ppid: 2 flags:0x00004000
>>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2785.114720] Call Trace:
>>> [ 2785.114721] <TASK>
>>> [ 2785.114723] __schedule+0x23d/0x590
>>> [ 2785.114726] schedule+0x4e/0xb0
>>> [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2785.114753] ? wait_woken+0x70/0x70
>>> [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2785.114789] ? __queue_work+0x211/0x480
>>> [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2785.114818] process_one_work+0x22b/0x3d0
>>> [ 2785.114822] worker_thread+0x53/0x410
>>> [ 2785.114826] ? process_one_work+0x3d0/0x3d0
>>> [ 2785.114829] kthread+0x12a/0x150
>>> [ 2785.114832] ? set_kthread_struct+0x50/0x50
>>> [ 2785.114835] ret_from_fork+0x22/0x30
>>> [ 2785.114839] </TASK>
>>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
>>> [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
>>> ppid: 2 flags:0x00004000
>>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
>>> [ 2786.091582] Call Trace:
>>> [ 2786.091584] <TASK>
>>> [ 2786.091585] __schedule+0x23d/0x590
>>> [ 2786.091589] schedule+0x4e/0xb0
>>> [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2786.091594] down_read+0x43/0x90
>>> [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
>>> [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
>>> [ 2786.091626] __submit_bio+0x1a1/0x220
>>> [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
>>> [ 2786.091634] ? mutex_lock+0x13/0x40
>>> [ 2786.091637] submit_bio_noacct+0x4e/0x120
>>> [ 2786.091640] submit_bio+0x4a/0x130
>>> [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
>>> [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
>>> [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
>>> [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
>>> [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
>>> [ 2786.092063] process_one_work+0x22b/0x3d0
>>> [ 2786.092067] worker_thread+0x53/0x410
>>> [ 2786.092069] ? process_one_work+0x3d0/0x3d0
>>> [ 2786.092072] kthread+0x12a/0x150
>>> [ 2786.092074] ? set_kthread_struct+0x50/0x50
>>> [ 2786.092077] ret_from_fork+0x22/0x30
>>> [ 2786.092081] </TASK>
>>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
>>> [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
>>> ppid: 2 flags:0x00004000
>>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2787.077098] Call Trace:
>>> [ 2787.077099] <TASK>
>>> [ 2787.077101] __schedule+0x23d/0x590
>>> [ 2787.077104] schedule+0x4e/0xb0
>>> [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2787.077116] ? wait_woken+0x70/0x70
>>> [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
>>> [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2787.077191] process_one_work+0x22b/0x3d0
>>> [ 2787.077195] worker_thread+0x53/0x410
>>> [ 2787.077198] ? process_one_work+0x3d0/0x3d0
>>> [ 2787.077202] kthread+0x12a/0x150
>>> [ 2787.077205] ? set_kthread_struct+0x50/0x50
>>> [ 2787.077207] ret_from_fork+0x22/0x30
>>> [ 2787.077212] </TASK>
>>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
>>> 125 seconds.
>>> [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
>>> ppid: 2 flags:0x00004000
>>> [ 2788.071080] Call Trace:
>>> [ 2788.071081] <TASK>
>>> [ 2788.071082] __schedule+0x23d/0x590
>>> [ 2788.071086] schedule+0x4e/0xb0
>>> [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
>>> [ 2788.071105] ? del_timer_sync+0x6c/0xb0
>>> [ 2788.071109] down_write+0x43/0x50
>>> [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
>>> [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
>>> [ 2788.071158] kthread+0x12a/0x150
>>> [ 2788.071161] ? set_kthread_struct+0x50/0x50
>>> [ 2788.071164] ret_from_fork+0x22/0x30
>>> [ 2788.071168] </TASK>
>>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
>>> [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2789.014084] task:fio state:D stack: 0 pid: 3123
>>> ppid: 2378 flags:0x00004002
>>> [ 2789.014087] Call Trace:
>>> [ 2789.014089] <TASK>
>>> [ 2789.014090] __schedule+0x23d/0x590
>>> [ 2789.014106] schedule+0x4e/0xb0
>>> [ 2789.014108] io_schedule+0x46/0x70
>>> [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
>>> [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
>>> [ 2789.014118] wait_on_page_bit+0x3f/0x50
>>> [ 2789.014120] wait_on_page_writeback+0x26/0x80
>>> [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
>>> [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
>>> [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
>>> [ 2789.014207] vfs_fsync_range+0x49/0x80
>>> [ 2789.014212] ? __fget_light+0x32/0x80
>>> [ 2789.014217] __x64_sys_fsync+0x38/0x60
>>> [ 2789.014220] do_syscall_64+0x5c/0xc0
>>> [ 2789.014223] ? do_syscall_64+0x69/0xc0
>>> [ 2789.014225] ? do_syscall_64+0x69/0xc0
>>> [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
>>> [ 2789.014230] ? __x64_sys_write+0x19/0x20
>>> [ 2789.014232] ? do_syscall_64+0x69/0xc0
>>> [ 2789.014234] ? do_syscall_64+0x69/0xc0
>>> [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
>>> [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
>>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
>>> 000000000000004a
>>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
>>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
>>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
>>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
>>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
>>> [ 2789.014255] </TASK>
>>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
>>> [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
>>> ppid: 2 flags:0x00004000
>>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2910.130176] Call Trace:
>>> [ 2910.130177] <TASK>
>>> [ 2910.130179] __schedule+0x23d/0x590
>>> [ 2910.130182] schedule+0x4e/0xb0
>>> [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2910.130219] ? wait_woken+0x70/0x70
>>> [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2910.130255] ? __queue_work+0x211/0x480
>>> [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2910.130285] process_one_work+0x22b/0x3d0
>>> [ 2910.130288] worker_thread+0x53/0x410
>>> [ 2910.130292] ? process_one_work+0x3d0/0x3d0
>>> [ 2910.130295] kthread+0x12a/0x150
>>> [ 2910.130298] ? set_kthread_struct+0x50/0x50
>>> [ 2910.130301] ret_from_fork+0x22/0x30
>>> [ 2910.130305] </TASK>
>>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
>>> [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
>>> ppid: 2 flags:0x00004000
>>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
>>> [ 2911.104647] Call Trace:
>>> [ 2911.104648] <TASK>
>>> [ 2911.104649] __schedule+0x23d/0x590
>>> [ 2911.104652] schedule+0x4e/0xb0
>>> [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
>>> [ 2911.104657] down_read+0x43/0x90
>>> [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
>>> [ 2911.104676] process_one_work+0x22b/0x3d0
>>> [ 2911.104680] worker_thread+0x53/0x410
>>> [ 2911.104683] ? process_one_work+0x3d0/0x3d0
>>> [ 2911.104687] kthread+0x12a/0x150
>>> [ 2911.104690] ? set_kthread_struct+0x50/0x50
>>> [ 2911.104693] ret_from_fork+0x22/0x30
>>> [ 2911.104697] </TASK>
>>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
>>> [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
>>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
>>> ppid: 2 flags:0x00004000
>>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
>>> [ 2912.087495] Call Trace:
>>> [ 2912.087497] <TASK>
>>> [ 2912.087498] __schedule+0x23d/0x590
>>> [ 2912.087503] schedule+0x4e/0xb0
>>> [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
>>> [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
>>> [ 2912.087528] ? wait_woken+0x70/0x70
>>> [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
>>> [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
>>> [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
>>> [ 2912.087565] ? __queue_work+0x211/0x480
>>> [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
>>> [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
>>> [ 2912.087595] process_one_work+0x22b/0x3d0
>>> [ 2912.087599] worker_thread+0x53/0x410
>>> [ 2912.087602] ? process_one_work+0x3d0/0x3d0
>>> [ 2912.087606] kthread+0x12a/0x150
>>> [ 2912.087609] ? set_kthread_struct+0x50/0x50
>>> [ 2912.087612] ret_from_fork+0x22/0x30
>>> [ 2912.087617] </TASK>
>>>
>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>
>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>
>>> I had also echoed 0 into congested_read_threshold_us,
>>> congested_write_threshold_us.
>>>
>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>
>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>
>> Coly Li
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 7:50 ` Coly Li
@ 2022-06-29 7:56 ` Nikhil Kshirsagar
2022-06-29 7:59 ` Coly Li
2022-06-29 8:09 ` Nikhil Kshirsagar
0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 7:56 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?
On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I see the same bug on upstream kernel when I tried with
> > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/ (
> > cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
>
>
> Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
>
> >
> > Reads seem to trigger it, not writes. So this test triggered it -
> >
> > fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> > --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> > --rw=randread --group_reporting=1
> >
> > https://pastebin.com/KyVSfnik has all the details.
>
>
> OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
>
> Thanks.
>
> Coly Li
>
>
> >
> > Regards,
> > Nikhil.
> >
> > On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
> >>
> >>
> >>
> >>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >>>
> >>> Hi Coly,
> >>>
> >>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
> >>>
> >>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> >>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> >>> --name=iops-test-job --eta-newline=1
> >>>
> >>> # lsblk /dev/sdc /dev/nvme0n1
> >>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> >>> sdc 8:32 0 279.4G 0 disk
> >>> └─sdc1 8:33 0 60G 0 part
> >>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> >>> nvme0n1 259:0 0 372.6G 0 disk
> >>> └─nvme0n1p1 259:2 0 15G 0 part
> >>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> >>>
> >>> and am seeing this trace in the dmesg,
> >>>
> >>> [ 2475.034909] XFS (bcache0): Ending clean mount
> >>> [ 2475.036238] xfs filesystem being mounted at
> >>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> >>>
> >>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> >>> [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
> >>> ppid: 2 flags:0x00004000
> >>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2783.155259] Call Trace:
> >>> [ 2783.155261] <TASK>
> >>> [ 2783.155263] __schedule+0x23d/0x590
> >>> [ 2783.155269] schedule+0x4e/0xb0
> >>> [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2783.155294] ? wait_woken+0x70/0x70
> >>> [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2783.155331] ? __queue_work+0x211/0x480
> >>> [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2783.155362] process_one_work+0x22b/0x3d0
> >>> [ 2783.155366] worker_thread+0x53/0x410
> >>> [ 2783.155369] ? process_one_work+0x3d0/0x3d0
> >>> [ 2783.155372] kthread+0x12a/0x150
> >>> [ 2783.155376] ? set_kthread_struct+0x50/0x50
> >>> [ 2783.155379] ret_from_fork+0x22/0x30
> >>> [ 2783.155385] </TASK>
> >>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> >>> [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
> >>> ppid: 2 flags:0x00004000
> >>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> >>> [ 2784.133588] Call Trace:
> >>> [ 2784.133589] <TASK>
> >>> [ 2784.133591] __schedule+0x23d/0x590
> >>> [ 2784.133594] schedule+0x4e/0xb0
> >>> [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2784.133600] down_read+0x43/0x90
> >>> [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
> >>> [ 2784.133619] process_one_work+0x22b/0x3d0
> >>> [ 2784.133623] worker_thread+0x53/0x410
> >>> [ 2784.133626] ? process_one_work+0x3d0/0x3d0
> >>> [ 2784.133630] kthread+0x12a/0x150
> >>> [ 2784.133633] ? set_kthread_struct+0x50/0x50
> >>> [ 2784.133636] ret_from_fork+0x22/0x30
> >>> [ 2784.133640] </TASK>
> >>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> >>> [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
> >>> ppid: 2 flags:0x00004000
> >>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2785.114720] Call Trace:
> >>> [ 2785.114721] <TASK>
> >>> [ 2785.114723] __schedule+0x23d/0x590
> >>> [ 2785.114726] schedule+0x4e/0xb0
> >>> [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2785.114753] ? wait_woken+0x70/0x70
> >>> [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2785.114789] ? __queue_work+0x211/0x480
> >>> [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2785.114818] process_one_work+0x22b/0x3d0
> >>> [ 2785.114822] worker_thread+0x53/0x410
> >>> [ 2785.114826] ? process_one_work+0x3d0/0x3d0
> >>> [ 2785.114829] kthread+0x12a/0x150
> >>> [ 2785.114832] ? set_kthread_struct+0x50/0x50
> >>> [ 2785.114835] ret_from_fork+0x22/0x30
> >>> [ 2785.114839] </TASK>
> >>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> >>> [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
> >>> ppid: 2 flags:0x00004000
> >>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> >>> [ 2786.091582] Call Trace:
> >>> [ 2786.091584] <TASK>
> >>> [ 2786.091585] __schedule+0x23d/0x590
> >>> [ 2786.091589] schedule+0x4e/0xb0
> >>> [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2786.091594] down_read+0x43/0x90
> >>> [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
> >>> [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
> >>> [ 2786.091626] __submit_bio+0x1a1/0x220
> >>> [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
> >>> [ 2786.091634] ? mutex_lock+0x13/0x40
> >>> [ 2786.091637] submit_bio_noacct+0x4e/0x120
> >>> [ 2786.091640] submit_bio+0x4a/0x130
> >>> [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
> >>> [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
> >>> [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
> >>> [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
> >>> [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
> >>> [ 2786.092063] process_one_work+0x22b/0x3d0
> >>> [ 2786.092067] worker_thread+0x53/0x410
> >>> [ 2786.092069] ? process_one_work+0x3d0/0x3d0
> >>> [ 2786.092072] kthread+0x12a/0x150
> >>> [ 2786.092074] ? set_kthread_struct+0x50/0x50
> >>> [ 2786.092077] ret_from_fork+0x22/0x30
> >>> [ 2786.092081] </TASK>
> >>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> >>> [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
> >>> ppid: 2 flags:0x00004000
> >>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2787.077098] Call Trace:
> >>> [ 2787.077099] <TASK>
> >>> [ 2787.077101] __schedule+0x23d/0x590
> >>> [ 2787.077104] schedule+0x4e/0xb0
> >>> [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2787.077116] ? wait_woken+0x70/0x70
> >>> [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
> >>> [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2787.077191] process_one_work+0x22b/0x3d0
> >>> [ 2787.077195] worker_thread+0x53/0x410
> >>> [ 2787.077198] ? process_one_work+0x3d0/0x3d0
> >>> [ 2787.077202] kthread+0x12a/0x150
> >>> [ 2787.077205] ? set_kthread_struct+0x50/0x50
> >>> [ 2787.077207] ret_from_fork+0x22/0x30
> >>> [ 2787.077212] </TASK>
> >>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> >>> 125 seconds.
> >>> [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
> >>> ppid: 2 flags:0x00004000
> >>> [ 2788.071080] Call Trace:
> >>> [ 2788.071081] <TASK>
> >>> [ 2788.071082] __schedule+0x23d/0x590
> >>> [ 2788.071086] schedule+0x4e/0xb0
> >>> [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
> >>> [ 2788.071105] ? del_timer_sync+0x6c/0xb0
> >>> [ 2788.071109] down_write+0x43/0x50
> >>> [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
> >>> [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
> >>> [ 2788.071158] kthread+0x12a/0x150
> >>> [ 2788.071161] ? set_kthread_struct+0x50/0x50
> >>> [ 2788.071164] ret_from_fork+0x22/0x30
> >>> [ 2788.071168] </TASK>
> >>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> >>> [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2789.014084] task:fio state:D stack: 0 pid: 3123
> >>> ppid: 2378 flags:0x00004002
> >>> [ 2789.014087] Call Trace:
> >>> [ 2789.014089] <TASK>
> >>> [ 2789.014090] __schedule+0x23d/0x590
> >>> [ 2789.014106] schedule+0x4e/0xb0
> >>> [ 2789.014108] io_schedule+0x46/0x70
> >>> [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
> >>> [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
> >>> [ 2789.014118] wait_on_page_bit+0x3f/0x50
> >>> [ 2789.014120] wait_on_page_writeback+0x26/0x80
> >>> [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
> >>> [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
> >>> [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
> >>> [ 2789.014207] vfs_fsync_range+0x49/0x80
> >>> [ 2789.014212] ? __fget_light+0x32/0x80
> >>> [ 2789.014217] __x64_sys_fsync+0x38/0x60
> >>> [ 2789.014220] do_syscall_64+0x5c/0xc0
> >>> [ 2789.014223] ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014225] ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
> >>> [ 2789.014230] ? __x64_sys_write+0x19/0x20
> >>> [ 2789.014232] ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014234] ? do_syscall_64+0x69/0xc0
> >>> [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> >>> [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
> >>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> >>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> >>> 000000000000004a
> >>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> >>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> >>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> >>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> >>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> >>> [ 2789.014255] </TASK>
> >>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> >>> [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
> >>> ppid: 2 flags:0x00004000
> >>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2910.130176] Call Trace:
> >>> [ 2910.130177] <TASK>
> >>> [ 2910.130179] __schedule+0x23d/0x590
> >>> [ 2910.130182] schedule+0x4e/0xb0
> >>> [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2910.130219] ? wait_woken+0x70/0x70
> >>> [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2910.130255] ? __queue_work+0x211/0x480
> >>> [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2910.130285] process_one_work+0x22b/0x3d0
> >>> [ 2910.130288] worker_thread+0x53/0x410
> >>> [ 2910.130292] ? process_one_work+0x3d0/0x3d0
> >>> [ 2910.130295] kthread+0x12a/0x150
> >>> [ 2910.130298] ? set_kthread_struct+0x50/0x50
> >>> [ 2910.130301] ret_from_fork+0x22/0x30
> >>> [ 2910.130305] </TASK>
> >>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> >>> [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
> >>> ppid: 2 flags:0x00004000
> >>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> >>> [ 2911.104647] Call Trace:
> >>> [ 2911.104648] <TASK>
> >>> [ 2911.104649] __schedule+0x23d/0x590
> >>> [ 2911.104652] schedule+0x4e/0xb0
> >>> [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
> >>> [ 2911.104657] down_read+0x43/0x90
> >>> [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
> >>> [ 2911.104676] process_one_work+0x22b/0x3d0
> >>> [ 2911.104680] worker_thread+0x53/0x410
> >>> [ 2911.104683] ? process_one_work+0x3d0/0x3d0
> >>> [ 2911.104687] kthread+0x12a/0x150
> >>> [ 2911.104690] ? set_kthread_struct+0x50/0x50
> >>> [ 2911.104693] ret_from_fork+0x22/0x30
> >>> [ 2911.104697] </TASK>
> >>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> >>> [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
> >>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>> disables this message.
> >>> [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
> >>> ppid: 2 flags:0x00004000
> >>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> >>> [ 2912.087495] Call Trace:
> >>> [ 2912.087497] <TASK>
> >>> [ 2912.087498] __schedule+0x23d/0x590
> >>> [ 2912.087503] schedule+0x4e/0xb0
> >>> [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> >>> [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> >>> [ 2912.087528] ? wait_woken+0x70/0x70
> >>> [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> >>> [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> >>> [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
> >>> [ 2912.087565] ? __queue_work+0x211/0x480
> >>> [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
> >>> [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
> >>> [ 2912.087595] process_one_work+0x22b/0x3d0
> >>> [ 2912.087599] worker_thread+0x53/0x410
> >>> [ 2912.087602] ? process_one_work+0x3d0/0x3d0
> >>> [ 2912.087606] kthread+0x12a/0x150
> >>> [ 2912.087609] ? set_kthread_struct+0x50/0x50
> >>> [ 2912.087612] ret_from_fork+0x22/0x30
> >>> [ 2912.087617] </TASK>
> >>>
> >>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>
> >>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>
> >>> I had also echoed 0 into congested_read_threshold_us,
> >>> congested_write_threshold_us.
> >>>
> >>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>
> >> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>
> >> Coly Li
> >>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 7:56 ` Nikhil Kshirsagar
@ 2022-06-29 7:59 ` Coly Li
2022-06-29 8:09 ` Nikhil Kshirsagar
1 sibling, 0 replies; 14+ messages in thread
From: Coly Li @ 2022-06-29 7:59 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月29日 15:56,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> 60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?
If you may trigger this on upstream kernel or stable kernel, please do it.
Coly Li
>
> On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
>>
>>
>>
>>> 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>>
>>> Hi Coly,
>>>
>>> I see the same bug on upstream kernel when I tried with
>>> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/ (
>>> cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
>>
>>
>> Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
>>
>>>
>>> Reads seem to trigger it, not writes. So this test triggered it -
>>>
>>> fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
>>> --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
>>> --rw=randread --group_reporting=1
>>>
>>> https://pastebin.com/KyVSfnik has all the details.
>>
>>
>> OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
>>
>> Thanks.
>>
>> Coly Li
>>
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 7:56 ` Nikhil Kshirsagar
2022-06-29 7:59 ` Coly Li
@ 2022-06-29 8:09 ` Nikhil Kshirsagar
2022-06-29 8:17 ` Coly Li
1 sibling, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 8:09 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coly,
Note I used partitions for the bcache as well as the hdd, not sure if
that's a factor.
the kernel is upstream kernel -
# uname -a
Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
lsblk shows the setup,
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdd 8:48 0 279.4G 0 disk
└─sdd1 8:49 0 60G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
nvme0n1 259:0 0 372.6G 0 disk
└─nvme0n1p1 259:2 0 15G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
echo 0 > /sys/fs/bcache/dbadc4cf-feed-42b6-9534-1eeff569a450/congested_read_threshold_us
echo 0 > /sys/fs/bcache/dbadc4cf-feed-42b6-9534-1eeff569a450/congested_write_threshold_us
echo writeback > /sys/block/bcache0/bcache/cache_mode
test that triggers the deadlock -
fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
--ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
--rw=randread --group_reporting=1
[ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than
120 seconds.
[ 4474.050921] Not tainted 5.15.50-051550-generic #202206251445
[ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4474.731391] task:bcache_writebac state:D stack: 0 pid: 1835
ppid: 2 flags:0x00004000
[ 4474.731408] Call Trace:
[ 4474.731411] <TASK>
[ 4474.731413] __schedule+0x23d/0x5a0
[ 4474.731433] schedule+0x4e/0xb0
[ 4474.731436] rwsem_down_write_slowpath+0x220/0x3d0
[ 4474.731441] down_write+0x43/0x50
[ 4474.731446] bch_writeback_thread+0x78/0x320 [bcache]
[ 4474.731471] ? read_dirty_submit+0x70/0x70 [bcache]
[ 4474.731487] kthread+0x12a/0x150
[ 4474.731491] ? set_kthread_struct+0x50/0x50
[ 4474.731494] ret_from_fork+0x22/0x30
[ 4474.731499] </TASK>
[ 4474.731735] INFO: task fio:16626 blocked for more than 121 seconds.
[ 4475.035858] Not tainted 5.15.50-051550-generic #202206251445
[ 4475.335859] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4475.716526] task:fio state:D stack: 0 pid:16626
ppid: 1715 flags:0x00004002
[ 4475.716531] Call Trace:
[ 4475.716546] <TASK>
[ 4475.716549] __schedule+0x23d/0x5a0
[ 4475.716555] ? sysvec_apic_timer_interrupt+0x4e/0x90
[ 4475.716560] schedule+0x4e/0xb0
[ 4475.716563] rwsem_down_read_slowpath+0x32e/0x380
[ 4475.716567] down_read+0x43/0x90
[ 4475.716571] cached_dev_write+0x7e/0x480 [bcache]
[ 4475.716604] cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4475.716618] __submit_bio+0x1a1/0x220
[ 4475.716623] __submit_bio_noacct+0x85/0x1f0
[ 4475.716626] submit_bio_noacct+0x4e/0x120
[ 4475.716628] submit_bio+0x4a/0x130
[ 4475.716631] iomap_submit_ioend+0x53/0x80
[ 4475.716634] iomap_writepages+0x35/0x40
[ 4475.716636] xfs_vm_writepages+0x84/0xb0 [xfs]
[ 4475.716730] do_writepages+0xda/0x200
[ 4475.716736] filemap_fdatawrite_wbc+0x81/0xd0
[ 4475.716739] file_write_and_wait_range+0xac/0xf0
[ 4475.716742] xfs_file_fsync+0x5b/0x250 [xfs]
[ 4475.716821] vfs_fsync_range+0x49/0x80
[ 4475.716826] ? __fget_light+0x32/0x80
[ 4475.716829] __x64_sys_fsync+0x38/0x60
[ 4475.716832] do_syscall_64+0x5c/0xc0
[ 4475.716836] ? ksys_write+0xce/0xe0
[ 4475.716838] ? exit_to_user_mode_prepare+0x37/0xb0
[ 4475.716843] ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716847] ? __x64_sys_write+0x19/0x20
[ 4475.716849] ? do_syscall_64+0x69/0xc0
[ 4475.716851] ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716855] ? __x64_sys_write+0x19/0x20
[ 4475.716857] ? do_syscall_64+0x69/0xc0
[ 4475.716859] ? exit_to_user_mode_prepare+0x37/0xb0
[ 4475.716862] ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716865] ? __x64_sys_write+0x19/0x20
[ 4475.716867] ? do_syscall_64+0x69/0xc0
[ 4475.716869] ? syscall_exit_to_user_mode+0x27/0x50
[ 4475.716873] ? __x64_sys_write+0x19/0x20
[ 4475.716875] ? do_syscall_64+0x69/0xc0
[ 4475.716877] ? do_syscall_64+0x69/0xc0
[ 4475.716879] ? asm_common_interrupt+0x8/0x40
[ 4475.716883] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 4475.716885] RIP: 0033:0x7fd3c498aa5b
[ 4475.716888] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 4475.716890] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b
[ 4475.716892] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006
[ 4475.716893] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40
[ 4475.716895] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000
[ 4475.716896] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000
[ 4475.716898] </TASK>
[ 4475.716912] INFO: task kworker/26:87:16682 blocked for more than 122 seconds.
[ 4476.063391] Not tainted 5.15.50-051550-generic #202206251445
[ 4476.363476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4476.743988] task:kworker/26:87 state:D stack: 0 pid:16682
ppid: 2 flags:0x00004000
[ 4476.743994] Workqueue: events update_writeback_rate [bcache]
[ 4476.744022] Call Trace:
[ 4476.744024] <TASK>
[ 4476.744026] __schedule+0x23d/0x5a0
[ 4476.744032] schedule+0x4e/0xb0
[ 4476.744035] rwsem_down_read_slowpath+0x32e/0x380
[ 4476.744039] down_read+0x43/0x90
[ 4476.744042] update_writeback_rate+0x134/0x190 [bcache]
[ 4476.744057] process_one_work+0x22b/0x3d0
[ 4476.744063] worker_thread+0x53/0x410
[ 4476.744066] ? process_one_work+0x3d0/0x3d0
[ 4476.744070] kthread+0x12a/0x150
[ 4476.744074] ? set_kthread_struct+0x50/0x50
[ 4476.744077] ret_from_fork+0x22/0x30
[ 4476.744082] </TASK>
[ 4598.626220] INFO: task bcache_writebac:1835 blocked for more than
245 seconds.
[ 4598.978432] Not tainted 5.15.50-051550-generic #202206251445
[ 4599.279900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4599.657293] task:bcache_writebac state:D stack: 0 pid: 1835
ppid: 2 flags:0x00004000
[ 4599.657298] Call Trace:
[ 4599.657300] <TASK>
[ 4599.657302] __schedule+0x23d/0x5a0
[ 4599.657325] schedule+0x4e/0xb0
[ 4599.657328] rwsem_down_write_slowpath+0x220/0x3d0
[ 4599.657333] down_write+0x43/0x50
[ 4599.657338] bch_writeback_thread+0x78/0x320 [bcache]
[ 4599.657374] ? read_dirty_submit+0x70/0x70 [bcache]
[ 4599.657390] kthread+0x12a/0x150
[ 4599.657393] ? set_kthread_struct+0x50/0x50
[ 4599.657396] ret_from_fork+0x22/0x30
[ 4599.657401] </TASK>
[ 4599.657434] INFO: task kworker/2:63:14895 blocked for more than 124 seconds.
[ 4600.000498] Not tainted 5.15.50-051550-generic #202206251445
[ 4600.301188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4600.681984] task:kworker/2:63 state:D stack: 0 pid:14895
ppid: 2 flags:0x00004000
[ 4600.682005] Workqueue: bcache bch_data_insert_start [bcache]
[ 4600.682031] Call Trace:
[ 4600.682033] <TASK>
[ 4600.682035] __schedule+0x23d/0x5a0
[ 4600.682040] schedule+0x4e/0xb0
[ 4600.682043] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 4600.682066] ? wait_woken+0x70/0x70
[ 4600.682071] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 4600.682082] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 4600.682098] ? krealloc+0x9d/0xd0
[ 4600.682102] ? __bch_keylist_realloc+0xb7/0x100 [bcache]
[ 4600.682113] ? __bch_submit_bbio+0x97/0xb0 [bcache]
[ 4600.682126] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 4600.682139] ? closure_sub+0x94/0xb0 [bcache]
[ 4600.682152] process_one_work+0x22b/0x3d0
[ 4600.682157] worker_thread+0x53/0x410
[ 4600.682161] ? process_one_work+0x3d0/0x3d0
[ 4600.682164] kthread+0x12a/0x150
[ 4600.682168] ? set_kthread_struct+0x50/0x50
[ 4600.682171] ret_from_fork+0x22/0x30
[ 4600.682176] </TASK>
[ 4600.682240] INFO: task kworker/25:123:16464 blocked for more than
125 seconds.
[ 4601.033116] Not tainted 5.15.50-051550-generic #202206251445
[ 4601.333405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4601.713829] task:kworker/25:123 state:D stack: 0 pid:16464
ppid: 2 flags:0x00004000
[ 4601.713835] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
[ 4601.713975] Call Trace:
[ 4601.713977] <TASK>
[ 4601.713979] __schedule+0x23d/0x5a0
[ 4601.713986] schedule+0x4e/0xb0
[ 4601.713988] rwsem_down_read_slowpath+0x32e/0x380
[ 4601.713993] down_read+0x43/0x90
[ 4601.713996] cached_dev_write+0x7e/0x480 [bcache]
[ 4601.714018] cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4601.714031] __submit_bio+0x1a1/0x220
[ 4601.714035] ? ttwu_do_activate+0x72/0xf0
[ 4601.714039] __submit_bio_noacct+0x85/0x1f0
[ 4601.714042] ? mutex_lock+0x13/0x40
[ 4601.714045] submit_bio_noacct+0x4e/0x120
[ 4601.714047] submit_bio+0x4a/0x130
[ 4601.714050] xlog_write_iclog+0x254/0x300 [xfs]
[ 4601.714142] xlog_sync+0x1ab/0x2c0 [xfs]
[ 4601.714225] xlog_state_release_iclog+0x123/0x1d0 [xfs]
[ 4601.714306] xfs_log_force+0x186/0x250 [xfs]
[ 4601.714386] xfs_log_worker+0x39/0x90 [xfs]
[ 4601.714466] process_one_work+0x22b/0x3d0
[ 4601.714471] worker_thread+0x53/0x410
[ 4601.714474] ? process_one_work+0x3d0/0x3d0
[ 4601.714477] kthread+0x12a/0x150
[ 4601.714481] ? set_kthread_struct+0x50/0x50
[ 4601.714484] ret_from_fork+0x22/0x30
[ 4601.714489] </TASK>
[ 4601.714526] INFO: task kworker/22:89:16557 blocked for more than 126 seconds.
[ 4602.056844] Not tainted 5.15.50-051550-generic #202206251445
[ 4602.357148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4602.733923] task:kworker/22:89 state:D stack: 0 pid:16557
ppid: 2 flags:0x00004000
[ 4602.733940] Workqueue: bcache bch_data_insert_start [bcache]
[ 4602.733966] Call Trace:
[ 4602.733968] <TASK>
[ 4602.733970] __schedule+0x23d/0x5a0
[ 4602.733988] schedule+0x4e/0xb0
[ 4602.733991] bch_bucket_alloc+0x21a/0x5b0 [bcache]
[ 4602.734002] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
[ 4602.734014] ? wait_woken+0x70/0x70
[ 4602.734019] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
[ 4602.734031] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
[ 4602.734043] ? __wake_up_common_lock+0x8a/0xc0
[ 4602.734046] bch_data_insert_start+0x15e/0x3a0 [bcache]
[ 4602.734062] ? closure_sub+0x94/0xb0 [bcache]
[ 4602.734083] process_one_work+0x22b/0x3d0
[ 4602.734088] worker_thread+0x53/0x410
[ 4602.734092] ? process_one_work+0x3d0/0x3d0
[ 4602.734095] kthread+0x12a/0x150
[ 4602.734099] ? set_kthread_struct+0x50/0x50
[ 4602.734102] ret_from_fork+0x22/0x30
[ 4602.734108] </TASK>
[ 4602.734134] INFO: task fio:16626 blocked for more than 249 seconds.
[ 4603.034541] Not tainted 5.15.50-051550-generic #202206251445
[ 4603.330400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4603.711013] task:fio state:D stack: 0 pid:16626
ppid: 1715 flags:0x00004002
[ 4603.711030] Call Trace:
[ 4603.711032] <TASK>
[ 4603.711046] __schedule+0x23d/0x5a0
[ 4603.711054] ? sysvec_apic_timer_interrupt+0x4e/0x90
[ 4603.711060] schedule+0x4e/0xb0
[ 4603.711063] rwsem_down_read_slowpath+0x32e/0x380
[ 4603.711067] down_read+0x43/0x90
[ 4603.711072] cached_dev_write+0x7e/0x480 [bcache]
[ 4603.711095] cached_dev_submit_bio+0x502/0x550 [bcache]
[ 4603.711108] __submit_bio+0x1a1/0x220
[ 4603.711113] __submit_bio_noacct+0x85/0x1f0
[ 4603.711116] submit_bio_noacct+0x4e/0x120
[ 4603.711118] submit_bio+0x4a/0x130
[ 4603.711121] iomap_submit_ioend+0x53/0x80
[ 4603.711124] iomap_writepages+0x35/0x40
[ 4603.711126] xfs_vm_writepages+0x84/0xb0 [xfs]
[ 4603.711223] do_writepages+0xda/0x200
[ 4603.711229] filemap_fdatawrite_wbc+0x81/0xd0
[ 4603.711232] file_write_and_wait_range+0xac/0xf0
[ 4603.711235] xfs_file_fsync+0x5b/0x250 [xfs]
[ 4603.711312] vfs_fsync_range+0x49/0x80
[ 4603.711317] ? __fget_light+0x32/0x80
[ 4603.711321] __x64_sys_fsync+0x38/0x60
[ 4603.711324] do_syscall_64+0x5c/0xc0
[ 4603.711327] ? ksys_write+0xce/0xe0
[ 4603.711330] ? exit_to_user_mode_prepare+0x37/0xb0
[ 4603.711335] ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711338] ? __x64_sys_write+0x19/0x20
[ 4603.711340] ? do_syscall_64+0x69/0xc0
[ 4603.711342] ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711345] ? __x64_sys_write+0x19/0x20
[ 4603.711347] ? do_syscall_64+0x69/0xc0
[ 4603.711349] ? exit_to_user_mode_prepare+0x37/0xb0
[ 4603.711352] ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711355] ? __x64_sys_write+0x19/0x20
[ 4603.711357] ? do_syscall_64+0x69/0xc0
[ 4603.711359] ? syscall_exit_to_user_mode+0x27/0x50
[ 4603.711362] ? __x64_sys_write+0x19/0x20
[ 4603.711364] ? do_syscall_64+0x69/0xc0
[ 4603.711366] ? do_syscall_64+0x69/0xc0
[ 4603.711368] ? asm_common_interrupt+0x8/0x40
[ 4603.711371] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 4603.711374] RIP: 0033:0x7fd3c498aa5b
[ 4603.711376] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[ 4603.711379] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b
[ 4603.711381] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006
[ 4603.711382] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40
[ 4603.711383] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000
[ 4603.711384] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000
[ 4603.711387] </TASK>
[ 4603.711421] INFO: task kworker/26:87:16682 blocked for more than 250 seconds.
[ 4604.058441] Not tainted 5.15.50-051550-generic #202206251445
[ 4604.359128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4604.739950] task:kworker/26:87 state:D stack: 0 pid:16682
ppid: 2 flags:0x00004000
[ 4604.739968] Workqueue: events update_writeback_rate [bcache]
[ 4604.739995] Call Trace:
[ 4604.739997] <TASK>
[ 4604.740012] __schedule+0x23d/0x5a0
[ 4604.740017] schedule+0x4e/0xb0
[ 4604.740020] rwsem_down_read_slowpath+0x32e/0x380
[ 4604.740024] down_read+0x43/0x90
[ 4604.740027] update_writeback_rate+0x134/0x190 [bcache]
[ 4604.740043] process_one_work+0x22b/0x3d0
[ 4604.740049] worker_thread+0x53/0x410
[ 4604.740052] ? process_one_work+0x3d0/0x3d0
[ 4604.740055] kthread+0x12a/0x150
[ 4604.740059] ? set_kthread_struct+0x50/0x50
[ 4604.740062] ret_from_fork+0x22/0x30
[ 4604.740067] </TASK>
[ 4725.600602] INFO: task bcache_writebac:1835 blocked for more than
372 seconds.
[ 4725.952827] Not tainted 5.15.50-051550-generic #202206251445
[ 4726.253763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4726.630937] task:bcache_writebac state:D stack: 0 pid: 1835
ppid: 2 flags:0x00004000
[ 4726.630954] Call Trace:
[ 4726.630956] <TASK>
[ 4726.630959] __schedule+0x23d/0x5a0
[ 4726.630966] schedule+0x4e/0xb0
[ 4726.630970] rwsem_down_write_slowpath+0x220/0x3d0
[ 4726.630976] down_write+0x43/0x50
[ 4726.630980] bch_writeback_thread+0x78/0x320 [bcache]
[ 4726.631005] ? read_dirty_submit+0x70/0x70 [bcache]
[ 4726.631021] kthread+0x12a/0x150
[ 4726.631025] ? set_kthread_struct+0x50/0x50
[ 4726.631028] ret_from_fork+0x22/0x30
[ 4726.631033] </TASK>
Regards,
Nikhil.
On Wed, 29 Jun 2022 at 13:26, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> 60gb slow hdd, 15gb fast nvme cache. shall i file upstream bug for this issue?
>
> On Wed, 29 Jun 2022 at 13:20, Coly Li <colyli@suse.de> wrote:
> >
> >
> >
> > > 2022年6月29日 15:02,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >
> > > Hi Coly,
> > >
> > > I see the same bug on upstream kernel when I tried with
> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/ (
> > > cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8)
> >
> >
> > Can you help to try the linux-stable tree, or the latest upstream Linus tree? Then I can have a clean code base.
> >
> > >
> > > Reads seem to trigger it, not writes. So this test triggered it -
> > >
> > > fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G
> > > --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128
> > > --rw=randread --group_reporting=1
> > >
> > > https://pastebin.com/KyVSfnik has all the details.
> >
> >
> > OK, I will try to reproduce with above operation. What is the preferred cache and backing device sizes to reproduce the soft lockup?
> >
> > Thanks.
> >
> > Coly Li
> >
> >
> > >
> > > Regards,
> > > Nikhil.
> > >
> > > On Tue, 28 Jun 2022 at 20:08, Coly Li <colyli@suse.de> wrote:
> > >>
> > >>
> > >>
> > >>> 2022年6月28日 14:31,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >>>
> > >>> Hi Coly,
> > >>>
> > >>> I just kicked off a 20gb random readwrite test using fio to a bcache device,
> > >>>
> > >>> fio --filename=/home/ubuntu/bcache_mount/cacahedfile --size=20GB
> > >>> --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128
> > >>> --name=iops-test-job --eta-newline=1
> > >>>
> > >>> # lsblk /dev/sdc /dev/nvme0n1
> > >>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> > >>> sdc 8:32 0 279.4G 0 disk
> > >>> └─sdc1 8:33 0 60G 0 part
> > >>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> > >>> nvme0n1 259:0 0 372.6G 0 disk
> > >>> └─nvme0n1p1 259:2 0 15G 0 part
> > >>> └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount
> > >>>
> > >>> and am seeing this trace in the dmesg,
> > >>>
> > >>> [ 2475.034909] XFS (bcache0): Ending clean mount
> > >>> [ 2475.036238] xfs filesystem being mounted at
> > >>> /home/ubuntu/bcache_mount supports timestamps until 2038 (0x7fffffff)
> > >>>
> > >>> [ 2782.176705] INFO: task kworker/2:1:255 blocked for more than 120 seconds.
> > >>> [ 2782.507171] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2782.774078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2783.155206] task:kworker/2:1 state:D stack: 0 pid: 255
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2783.155210] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2783.155259] Call Trace:
> > >>> [ 2783.155261] <TASK>
> > >>> [ 2783.155263] __schedule+0x23d/0x590
> > >>> [ 2783.155269] schedule+0x4e/0xb0
> > >>> [ 2783.155271] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2783.155281] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2783.155294] ? wait_woken+0x70/0x70
> > >>> [ 2783.155298] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2783.155309] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2783.155319] ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2783.155331] ? __queue_work+0x211/0x480
> > >>> [ 2783.155336] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2783.155349] ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2783.155362] process_one_work+0x22b/0x3d0
> > >>> [ 2783.155366] worker_thread+0x53/0x410
> > >>> [ 2783.155369] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2783.155372] kthread+0x12a/0x150
> > >>> [ 2783.155376] ? set_kthread_struct+0x50/0x50
> > >>> [ 2783.155379] ret_from_fork+0x22/0x30
> > >>> [ 2783.155385] </TASK>
> > >>> [ 2783.155423] INFO: task kworker/3:1:267 blocked for more than 121 seconds.
> > >>> [ 2783.485797] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2783.752485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2784.133541] task:kworker/3:1 state:D stack: 0 pid: 267
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2784.133544] Workqueue: events update_writeback_rate [bcache]
> > >>> [ 2784.133588] Call Trace:
> > >>> [ 2784.133589] <TASK>
> > >>> [ 2784.133591] __schedule+0x23d/0x590
> > >>> [ 2784.133594] schedule+0x4e/0xb0
> > >>> [ 2784.133596] rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2784.133600] down_read+0x43/0x90
> > >>> [ 2784.133602] update_writeback_rate+0x134/0x190 [bcache]
> > >>> [ 2784.133619] process_one_work+0x22b/0x3d0
> > >>> [ 2784.133623] worker_thread+0x53/0x410
> > >>> [ 2784.133626] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2784.133630] kthread+0x12a/0x150
> > >>> [ 2784.133633] ? set_kthread_struct+0x50/0x50
> > >>> [ 2784.133636] ret_from_fork+0x22/0x30
> > >>> [ 2784.133640] </TASK>
> > >>> [ 2784.133650] INFO: task kworker/25:2:326 blocked for more than 122 seconds.
> > >>> [ 2784.467880] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2784.734405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2785.114677] task:kworker/25:2 state:D stack: 0 pid: 326
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2785.114692] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2785.114720] Call Trace:
> > >>> [ 2785.114721] <TASK>
> > >>> [ 2785.114723] __schedule+0x23d/0x590
> > >>> [ 2785.114726] schedule+0x4e/0xb0
> > >>> [ 2785.114729] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2785.114740] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2785.114753] ? wait_woken+0x70/0x70
> > >>> [ 2785.114756] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2785.114766] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2785.114777] ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2785.114789] ? __queue_work+0x211/0x480
> > >>> [ 2785.114793] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2785.114806] ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2785.114818] process_one_work+0x22b/0x3d0
> > >>> [ 2785.114822] worker_thread+0x53/0x410
> > >>> [ 2785.114826] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2785.114829] kthread+0x12a/0x150
> > >>> [ 2785.114832] ? set_kthread_struct+0x50/0x50
> > >>> [ 2785.114835] ret_from_fork+0x22/0x30
> > >>> [ 2785.114839] </TASK>
> > >>> [ 2785.114864] INFO: task kworker/6:0:3038 blocked for more than 123 seconds.
> > >>> [ 2785.444697] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2785.711071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2786.091440] task:kworker/6:0 state:D stack: 0 pid: 3038
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2786.091446] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs]
> > >>> [ 2786.091582] Call Trace:
> > >>> [ 2786.091584] <TASK>
> > >>> [ 2786.091585] __schedule+0x23d/0x590
> > >>> [ 2786.091589] schedule+0x4e/0xb0
> > >>> [ 2786.091591] rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2786.091594] down_read+0x43/0x90
> > >>> [ 2786.091597] cached_dev_write+0x7e/0x480 [bcache]
> > >>> [ 2786.091613] cached_dev_submit_bio+0x4ef/0x540 [bcache]
> > >>> [ 2786.091626] __submit_bio+0x1a1/0x220
> > >>> [ 2786.091631] __submit_bio_noacct+0x85/0x1f0
> > >>> [ 2786.091634] ? mutex_lock+0x13/0x40
> > >>> [ 2786.091637] submit_bio_noacct+0x4e/0x120
> > >>> [ 2786.091640] submit_bio+0x4a/0x130
> > >>> [ 2786.091642] xlog_write_iclog+0x254/0x300 [xfs]
> > >>> [ 2786.091725] xlog_sync+0x1ab/0x2c0 [xfs]
> > >>> [ 2786.091807] xlog_state_release_iclog+0x123/0x1d0 [xfs]
> > >>> [ 2786.091925] xfs_log_force+0x186/0x250 [xfs]
> > >>> [ 2786.091994] xfs_log_worker+0x39/0x90 [xfs]
> > >>> [ 2786.092063] process_one_work+0x22b/0x3d0
> > >>> [ 2786.092067] worker_thread+0x53/0x410
> > >>> [ 2786.092069] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2786.092072] kthread+0x12a/0x150
> > >>> [ 2786.092074] ? set_kthread_struct+0x50/0x50
> > >>> [ 2786.092077] ret_from_fork+0x22/0x30
> > >>> [ 2786.092081] </TASK>
> > >>> [ 2786.092088] INFO: task kworker/25:0:3047 blocked for more than 124 seconds.
> > >>> [ 2786.430313] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2786.696626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2787.077056] task:kworker/25:0 state:D stack: 0 pid: 3047
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2787.077071] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2787.077098] Call Trace:
> > >>> [ 2787.077099] <TASK>
> > >>> [ 2787.077101] __schedule+0x23d/0x590
> > >>> [ 2787.077104] schedule+0x4e/0xb0
> > >>> [ 2787.077106] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2787.077116] ? wait_woken+0x70/0x70
> > >>> [ 2787.077119] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2787.077142] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2787.077153] ? __bch_submit_bbio+0x97/0xb0 [bcache]
> > >>> [ 2787.077166] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2787.077179] ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2787.077191] process_one_work+0x22b/0x3d0
> > >>> [ 2787.077195] worker_thread+0x53/0x410
> > >>> [ 2787.077198] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2787.077202] kthread+0x12a/0x150
> > >>> [ 2787.077205] ? set_kthread_struct+0x50/0x50
> > >>> [ 2787.077207] ret_from_fork+0x22/0x30
> > >>> [ 2787.077212] </TASK>
> > >>> [ 2787.077247] INFO: task bcache_writebac:3080 blocked for more than
> > >>> 125 seconds.
> > >>> [ 2787.423890] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2787.690476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2788.071063] task:bcache_writebac state:D stack: 0 pid: 3080
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2788.071080] Call Trace:
> > >>> [ 2788.071081] <TASK>
> > >>> [ 2788.071082] __schedule+0x23d/0x590
> > >>> [ 2788.071086] schedule+0x4e/0xb0
> > >>> [ 2788.071100] rwsem_down_write_slowpath+0x220/0x3d0
> > >>> [ 2788.071105] ? del_timer_sync+0x6c/0xb0
> > >>> [ 2788.071109] down_write+0x43/0x50
> > >>> [ 2788.071112] bch_writeback_thread+0x78/0x320 [bcache]
> > >>> [ 2788.071142] ? read_dirty+0x5a0/0x5a0 [bcache]
> > >>> [ 2788.071158] kthread+0x12a/0x150
> > >>> [ 2788.071161] ? set_kthread_struct+0x50/0x50
> > >>> [ 2788.071164] ret_from_fork+0x22/0x30
> > >>> [ 2788.071168] </TASK>
> > >>> [ 2788.071199] INFO: task fio:3123 blocked for more than 126 seconds.
> > >>> [ 2788.367175] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2788.633386] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2789.014084] task:fio state:D stack: 0 pid: 3123
> > >>> ppid: 2378 flags:0x00004002
> > >>> [ 2789.014087] Call Trace:
> > >>> [ 2789.014089] <TASK>
> > >>> [ 2789.014090] __schedule+0x23d/0x590
> > >>> [ 2789.014106] schedule+0x4e/0xb0
> > >>> [ 2789.014108] io_schedule+0x46/0x70
> > >>> [ 2789.014110] wait_on_page_bit_common+0x10c/0x3d0
> > >>> [ 2789.014115] ? filemap_invalidate_unlock_two+0x40/0x40
> > >>> [ 2789.014118] wait_on_page_bit+0x3f/0x50
> > >>> [ 2789.014120] wait_on_page_writeback+0x26/0x80
> > >>> [ 2789.014124] __filemap_fdatawait_range+0x97/0x110
> > >>> [ 2789.014126] file_write_and_wait_range+0xcc/0xf0
> > >>> [ 2789.014130] xfs_file_fsync+0x5b/0x250 [xfs]
> > >>> [ 2789.014207] vfs_fsync_range+0x49/0x80
> > >>> [ 2789.014212] ? __fget_light+0x32/0x80
> > >>> [ 2789.014217] __x64_sys_fsync+0x38/0x60
> > >>> [ 2789.014220] do_syscall_64+0x5c/0xc0
> > >>> [ 2789.014223] ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014225] ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014226] ? syscall_exit_to_user_mode+0x27/0x50
> > >>> [ 2789.014230] ? __x64_sys_write+0x19/0x20
> > >>> [ 2789.014232] ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014234] ? do_syscall_64+0x69/0xc0
> > >>> [ 2789.014235] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> > >>> [ 2789.014239] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >>> [ 2789.014243] RIP: 0033:0x7f2cfdcfea5b
> > >>> [ 2789.014245] RSP: 002b:00007ffcd87c6e60 EFLAGS: 00000293 ORIG_RAX:
> > >>> 000000000000004a
> > >>> [ 2789.014248] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2cfdcfea5b
> > >>> [ 2789.014249] RDX: 0000000000000000 RSI: 000055c8d26e6f40 RDI: 0000000000000006
> > >>> [ 2789.014250] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055c8d26e6f40
> > >>> [ 2789.014252] R10: 00000000a318c620 R11: 0000000000000293 R12: 00007f2cf38a7000
> > >>> [ 2789.014253] R13: 0000000000000000 R14: 00007f2cfb0522f0 R15: 0000000500000000
> > >>> [ 2789.014255] </TASK>
> > >>> [ 2909.151501] INFO: task kworker/2:1:255 blocked for more than 247 seconds.
> > >>> [ 2909.481856] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2909.748707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2910.130132] task:kworker/2:1 state:D stack: 0 pid: 255
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2910.130147] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2910.130176] Call Trace:
> > >>> [ 2910.130177] <TASK>
> > >>> [ 2910.130179] __schedule+0x23d/0x590
> > >>> [ 2910.130182] schedule+0x4e/0xb0
> > >>> [ 2910.130184] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2910.130194] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2910.130219] ? wait_woken+0x70/0x70
> > >>> [ 2910.130222] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2910.130233] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2910.130243] ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2910.130255] ? __queue_work+0x211/0x480
> > >>> [ 2910.130259] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2910.130272] ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2910.130285] process_one_work+0x22b/0x3d0
> > >>> [ 2910.130288] worker_thread+0x53/0x410
> > >>> [ 2910.130292] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2910.130295] kthread+0x12a/0x150
> > >>> [ 2910.130298] ? set_kthread_struct+0x50/0x50
> > >>> [ 2910.130301] ret_from_fork+0x22/0x30
> > >>> [ 2910.130305] </TASK>
> > >>> [ 2910.130307] INFO: task kworker/3:1:267 blocked for more than 248 seconds.
> > >>> [ 2910.456520] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2910.723476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2911.104601] task:kworker/3:1 state:D stack: 0 pid: 267
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2911.104617] Workqueue: events update_writeback_rate [bcache]
> > >>> [ 2911.104647] Call Trace:
> > >>> [ 2911.104648] <TASK>
> > >>> [ 2911.104649] __schedule+0x23d/0x590
> > >>> [ 2911.104652] schedule+0x4e/0xb0
> > >>> [ 2911.104654] rwsem_down_read_slowpath+0x32e/0x380
> > >>> [ 2911.104657] down_read+0x43/0x90
> > >>> [ 2911.104660] update_writeback_rate+0x134/0x190 [bcache]
> > >>> [ 2911.104676] process_one_work+0x22b/0x3d0
> > >>> [ 2911.104680] worker_thread+0x53/0x410
> > >>> [ 2911.104683] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2911.104687] kthread+0x12a/0x150
> > >>> [ 2911.104690] ? set_kthread_struct+0x50/0x50
> > >>> [ 2911.104693] ret_from_fork+0x22/0x30
> > >>> [ 2911.104697] </TASK>
> > >>> [ 2911.104733] INFO: task kworker/25:2:326 blocked for more than 249 seconds.
> > >>> [ 2911.439086] Not tainted 5.15.0-40-generic #43-Ubuntu
> > >>> [ 2911.706294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>> disables this message.
> > >>> [ 2912.087471] task:kworker/25:2 state:D stack: 0 pid: 326
> > >>> ppid: 2 flags:0x00004000
> > >>> [ 2912.087475] Workqueue: bcache bch_data_insert_start [bcache]
> > >>> [ 2912.087495] Call Trace:
> > >>> [ 2912.087497] <TASK>
> > >>> [ 2912.087498] __schedule+0x23d/0x590
> > >>> [ 2912.087503] schedule+0x4e/0xb0
> > >>> [ 2912.087505] bch_bucket_alloc+0x21a/0x5b0 [bcache]
> > >>> [ 2912.087516] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache]
> > >>> [ 2912.087528] ? wait_woken+0x70/0x70
> > >>> [ 2912.087532] __bch_bucket_alloc_set+0x52/0xf0 [bcache]
> > >>> [ 2912.087543] bch_alloc_sectors+0x1c9/0x4c0 [bcache]
> > >>> [ 2912.087553] ? bch_btree_insert+0xea/0x130 [bcache]
> > >>> [ 2912.087565] ? __queue_work+0x211/0x480
> > >>> [ 2912.087570] bch_data_insert_start+0x15e/0x3a0 [bcache]
> > >>> [ 2912.087583] ? closure_sub+0x94/0xb0 [bcache]
> > >>> [ 2912.087595] process_one_work+0x22b/0x3d0
> > >>> [ 2912.087599] worker_thread+0x53/0x410
> > >>> [ 2912.087602] ? process_one_work+0x3d0/0x3d0
> > >>> [ 2912.087606] kthread+0x12a/0x150
> > >>> [ 2912.087609] ? set_kthread_struct+0x50/0x50
> > >>> [ 2912.087612] ret_from_fork+0x22/0x30
> > >>> [ 2912.087617] </TASK>
> > >>>
> > >>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> > >>>
> > >>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> > >>>
> > >>> I had also echoed 0 into congested_read_threshold_us,
> > >>> congested_write_threshold_us.
> > >>>
> > >>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> > >>
> > >> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> > >>
> > >> Coly Li
> > >>
> >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 8:09 ` Nikhil Kshirsagar
@ 2022-06-29 8:17 ` Coly Li
2022-06-29 10:33 ` Nikhil Kshirsagar
0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29 8:17 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> Hi Coly,
>
> Note I used partitions for the bcache as well as the hdd, not sure if
> that's a factor.
>
> the kernel is upstream kernel -
>
> # uname -a
> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Hi Nikhil,
I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
Thanks.
Coly Li
>>>>>>
[snipped]
>>>>>>
>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>>
>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>>
>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>> congested_write_threshold_us.
>>>>>>
>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>>
>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>>
>>>>> Coly Li
>>>>>
>>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 8:17 ` Coly Li
@ 2022-06-29 10:33 ` Nikhil Kshirsagar
2022-06-29 12:30 ` Nikhil Kshirsagar
2022-06-29 13:00 ` Coly Li
0 siblings, 2 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 10:33 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coly,
please use this as a reference, this is the source code to refer to
for the kernel i was able to reproduce this on
(https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Sat Jun 25 15:18:40 2022 +0200
Linux 5.15.50
Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I am able to reproduce this bug on older kernels too, like 5.4.0-121.
I will also test the latest upstream kernel soon.
Regards,
Nikhil.
On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > Note I used partitions for the bcache as well as the hdd, not sure if
> > that's a factor.
> >
> > the kernel is upstream kernel -
> >
> > # uname -a
> > Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> > 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Hi Nikhil,
>
> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>
> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>
> Thanks.
>
> Coly Li
>
>
> >>>>>>
>
> [snipped]
> >>>>>>
> >>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>>>>
> >>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>>>>
> >>>>>> I had also echoed 0 into congested_read_threshold_us,
> >>>>>> congested_write_threshold_us.
> >>>>>>
> >>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>>>>
> >>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>>>>
> >>>>> Coly Li
> >>>>>
> >>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 10:33 ` Nikhil Kshirsagar
@ 2022-06-29 12:30 ` Nikhil Kshirsagar
2022-06-29 13:03 ` Coly Li
2022-06-29 13:00 ` Coly Li
1 sibling, 1 reply; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-06-29 12:30 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coly,
I am not able to see this issue on 5.18.0-051800-generic
(https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
guessing its been fixed since 5.15.50 sometime. If you are aware of
which commit may be the fix please let me know. I will keep testing on
5.18.0-051800-generic to see if I see it ever.
Regards,
Nikhil.
On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>
> Hi Coly,
>
> please use this as a reference, this is the source code to refer to
> for the kernel i was able to reproduce this on
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
>
>
> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Date: Sat Jun 25 15:18:40 2022 +0200
>
> Linux 5.15.50
>
> Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> Tested-by: Ron Economos <re@w6rz.net>
> Tested-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> I will also test the latest upstream kernel soon.
>
> Regards,
> Nikhil.
>
> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
> >
> >
> >
> > > 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> > >
> > > Hi Coly,
> > >
> > > Note I used partitions for the bcache as well as the hdd, not sure if
> > > that's a factor.
> > >
> > > the kernel is upstream kernel -
> > >
> > > # uname -a
> > > Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> > > 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> >
> >
> > Hi Nikhil,
> >
> > I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
> >
> > The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
> >
> > Thanks.
> >
> > Coly Li
> >
> >
> > >>>>>>
> >
> > [snipped]
> > >>>>>>
> > >>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> > >>>>>>
> > >>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> > >>>>>>
> > >>>>>> I had also echoed 0 into congested_read_threshold_us,
> > >>>>>> congested_write_threshold_us.
> > >>>>>>
> > >>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> > >>>>>
> > >>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> > >>>>>
> > >>>>> Coly Li
> > >>>>>
> > >>>
> >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 10:33 ` Nikhil Kshirsagar
2022-06-29 12:30 ` Nikhil Kshirsagar
@ 2022-06-29 13:00 ` Coly Li
1 sibling, 0 replies; 14+ messages in thread
From: Coly Li @ 2022-06-29 13:00 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月29日 18:33,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> Hi Coly,
>
> please use this as a reference, this is the source code to refer to
> for the kernel i was able to reproduce this on
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
>
>
Oh, I didn’t know this. Thanks for the hint.
> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Date: Sat Jun 25 15:18:40 2022 +0200
>
> Linux 5.15.50
>
> Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> Tested-by: Ron Economos <re@w6rz.net>
> Tested-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> I will also test the latest upstream kernel soon.
>
Thanks.
Coly Li
> Regards,
> Nikhil.
>
> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>>
>>
>>
>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>>
>>> Hi Coly,
>>>
>>> Note I used partitions for the bcache as well as the hdd, not sure if
>>> that's a factor.
>>>
>>> the kernel is upstream kernel -
>>>
>>> # uname -a
>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> Hi Nikhil,
>>
>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>>
>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>>
>> Thanks.
>>
>> Coly Li
>>
>>
>>>>>>>>
>>
>> [snipped]
>>>>>>>>
>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>>>>
>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>>>>
>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>>>> congested_write_threshold_us.
>>>>>>>>
>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>>>>
>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>>>>
>>>>>>> Coly Li
>>>>>>>
>>>>>
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 12:30 ` Nikhil Kshirsagar
@ 2022-06-29 13:03 ` Coly Li
2022-07-07 9:22 ` Nikhil Kshirsagar
0 siblings, 1 reply; 14+ messages in thread
From: Coly Li @ 2022-06-29 13:03 UTC (permalink / raw)
To: Nikhil Kshirsagar; +Cc: linux-bcache
> 2022年6月29日 20:30,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>
> Hi Coly,
>
> I am not able to see this issue on 5.18.0-051800-generic
> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
> guessing its been fixed since 5.15.50 sometime. If you are aware of
> which commit may be the fix please let me know. I will keep testing on
> 5.18.0-051800-generic to see if I see it ever.
Hi Nikhil,
Thanks for the update. Recently there are several fixes go into stable 5.15.51 (but still one patch I just submitted to Greg yesterday, maybe it will show up in future 5.15.52).
Maybe you can try stable tree 5.15.51 with appying one more patch,
commit 7d6b902ea0e0 (“bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()”)
BTW, the journal no-space deadlock fix is in 5.15.51 now.
Coly Li
>
> On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
>>
>> Hi Coly,
>>
>> please use this as a reference, this is the source code to refer to
>> for the kernel i was able to reproduce this on
>> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
>>
>>
>> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
>> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Date: Sat Jun 25 15:18:40 2022 +0200
>>
>> Linux 5.15.50
>>
>> Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
>> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
>> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
>> Tested-by: Jon Hunter <jonathanh@nvidia.com>
>> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
>> Tested-by: Ron Economos <re@w6rz.net>
>> Tested-by: Guenter Roeck <linux@roeck-us.net>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
>> I will also test the latest upstream kernel soon.
>>
>> Regards,
>> Nikhil.
>>
>> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
>>>
>>>
>>>
>>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
>>>>
>>>> Hi Coly,
>>>>
>>>> Note I used partitions for the bcache as well as the hdd, not sure if
>>>> that's a factor.
>>>>
>>>> the kernel is upstream kernel -
>>>>
>>>> # uname -a
>>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
>>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>>
>>> Hi Nikhil,
>>>
>>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
>>>
>>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
>>>
>>> Thanks.
>>>
>>> Coly Li
>>>
>>>
>>>>>>>>>
>>>
>>> [snipped]
>>>>>>>>>
>>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
>>>>>>>>>
>>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
>>>>>>>>>
>>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
>>>>>>>>> congested_write_threshold_us.
>>>>>>>>>
>>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
>>>>>>>>
>>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
>>>>>>>>
>>>>>>>> Coly Li
>>>>>>>>
>>>>>>
>>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: seeing this stace trace on kernel 5.15
2022-06-29 13:03 ` Coly Li
@ 2022-07-07 9:22 ` Nikhil Kshirsagar
0 siblings, 0 replies; 14+ messages in thread
From: Nikhil Kshirsagar @ 2022-07-07 9:22 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coly,
Yes, 7d6b902ea0e0 seems to be the fix for it. I cannot reproduce this
deadlock when I build this patch into 5.15.50.
Thanks for pointing me to this patch!
Regards,
Nikhil.
On Wed, 29 Jun 2022 at 18:33, Coly Li <colyli@suse.de> wrote:
>
>
>
> > 2022年6月29日 20:30,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >
> > Hi Coly,
> >
> > I am not able to see this issue on 5.18.0-051800-generic
> > (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.18/amd64/), so I am
> > guessing its been fixed since 5.15.50 sometime. If you are aware of
> > which commit may be the fix please let me know. I will keep testing on
> > 5.18.0-051800-generic to see if I see it ever.
>
>
> Hi Nikhil,
>
> Thanks for the update. Recently there are several fixes go into stable 5.15.51 (but still one patch I just submitted to Greg yesterday, maybe it will show up in future 5.15.52).
>
> Maybe you can try stable tree 5.15.51 with appying one more patch,
> commit 7d6b902ea0e0 (“bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()”)
>
> BTW, the journal no-space deadlock fix is in 5.15.51 now.
>
> Coly Li
>
> >
> > On Wed, 29 Jun 2022 at 16:03, Nikhil Kshirsagar <nkshirsagar@gmail.com> wrote:
> >>
> >> Hi Coly,
> >>
> >> please use this as a reference, this is the source code to refer to
> >> for the kernel i was able to reproduce this on
> >> (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/)
> >>
> >>
> >> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50)
> >> Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Date: Sat Jun 25 15:18:40 2022 +0200
> >>
> >> Linux 5.15.50
> >>
> >> Link: https://lore.kernel.org/r/20220623164322.288837280@linuxfoundation.org
> >> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> >> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> >> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
> >> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> >> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
> >> Tested-by: Ron Economos <re@w6rz.net>
> >> Tested-by: Guenter Roeck <linux@roeck-us.net>
> >> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >>
> >> I am able to reproduce this bug on older kernels too, like 5.4.0-121.
> >> I will also test the latest upstream kernel soon.
> >>
> >> Regards,
> >> Nikhil.
> >>
> >> On Wed, 29 Jun 2022 at 13:47, Coly Li <colyli@suse.de> wrote:
> >>>
> >>>
> >>>
> >>>> 2022年6月29日 16:09,Nikhil Kshirsagar <nkshirsagar@gmail.com> 写道:
> >>>>
> >>>> Hi Coly,
> >>>>
> >>>> Note I used partitions for the bcache as well as the hdd, not sure if
> >>>> that's a factor.
> >>>>
> >>>> the kernel is upstream kernel -
> >>>>
> >>>> # uname -a
> >>>> Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25
> >>>> 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> >>>
> >>>
> >>> Hi Nikhil,
> >>>
> >>> I don’t find the commit id 767db4b286c3e101ac220b813c873f492d9e4ee8 fro neither Linus tree nor stable tree.
> >>>
> >>> The tree you mentioned at https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.50/cod/mainline/v5.15.50 (767db4b286c3e101ac220b813c873f492d9e4ee8), I am not sure whether it is a clone of Linus tree or stable tree. Maybe you may try v5.15.50 from the stable tree (git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git), then we can focus on identical code base.
> >>>
> >>> Thanks.
> >>>
> >>> Coly Li
> >>>
> >>>
> >>>>>>>>>
> >>>
> >>> [snipped]
> >>>>>>>>>
> >>>>>>>>> Is this a bug? It's in writeback mode. I'd setup the cache and run stuff like,
> >>>>>>>>>
> >>>>>>>>> echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
> >>>>>>>>>
> >>>>>>>>> I had also echoed 0 into congested_read_threshold_us,
> >>>>>>>>> congested_write_threshold_us.
> >>>>>>>>>
> >>>>>>>>> echo writeback > /sys/block/bcache0/bcache/cache_mode
> >>>>>>>>
> >>>>>>>> Where do you get the kernel? If this is stable kernel, could you give me the HEAD commit id?
> >>>>>>>>
> >>>>>>>> Coly Li
> >>>>>>>>
> >>>>>>
> >>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-07-07 9:22 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-28 6:31 seeing this stace trace on kernel 5.15 Nikhil Kshirsagar
2022-06-28 7:23 ` Nikhil Kshirsagar
2022-06-28 14:37 ` Coly Li
2022-06-29 7:02 ` Nikhil Kshirsagar
2022-06-29 7:50 ` Coly Li
2022-06-29 7:56 ` Nikhil Kshirsagar
2022-06-29 7:59 ` Coly Li
2022-06-29 8:09 ` Nikhil Kshirsagar
2022-06-29 8:17 ` Coly Li
2022-06-29 10:33 ` Nikhil Kshirsagar
2022-06-29 12:30 ` Nikhil Kshirsagar
2022-06-29 13:03 ` Coly Li
2022-07-07 9:22 ` Nikhil Kshirsagar
2022-06-29 13:00 ` Coly Li
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.