* [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
@ 2017-08-08 14:17 ` Ming Lei
0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2017-08-08 14:17 UTC (permalink / raw)
To: dm-devel, linux-scsi, linux-block; +Cc: loberman, bart.vanassche
Hi Guys,
Laurence and I see a system lockup issue when running concurrent
big buffered writes (4 MB each) to IB SRP on v4.13-rc3.
1. How to reproduce
1) set up IB_SRP & multipath
#./start_opensm.sh
#./start_srp.sh
# cat start_opensm.sh
#!/bin/bash
rmmod ib_srpt
opensm -F opensm.1.conf &
opensm -F opensm.2.conf &
# cat start_srp.sh
run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10 -t 7000 -ance -i mlx5_0 -p 1 1>/root/srp1.log 2>&1 &
run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10 -t 7000 -ance -i mlx5_1 -p 1 1>/root/srp2.log 2>&1 &
2) run the following test script
#./test-mp.sh
#cat test-mp.sh
#!/bin/sh
DEVS="360014051186d24cc27b4d049ffff94e3 36001405b2b5c6c24c084b6fa4d55da2f 36001405b26ebe76dcb94a489f6f245f8"
for i in $DEVS; do
    for j in `seq 0 15`; do
        ./hammer_write.sh $i 1>/dev/null 2>&1 &
    done
done
#cat hammer_write.sh
#!/bin/bash
while true; do
    dd if=/dev/zero of=/dev/mapper/$1 bs=4096k count=800
done
2. Lockup log
[root@ibclient ~]# ./test-mp.sh
[root@ibclient ~]# [ 169.095494] perf: interrupt took too long (2575 > 2500), lowering kernel.perf_event_max_sample_rate to 77000
[ 178.569124] perf: interrupt took too long (3249 > 3218), lowering kernel.perf_event_max_sample_rate to 61000
[ 190.586624] perf: interrupt took too long (4066 > 4061), lowering kernel.perf_event_max_sample_rate to 49000
[ 210.381836] perf: interrupt took too long (5083 > 5082), lowering kernel.perf_event_max_sample_rate to 39000
[ 240.498659] perf: interrupt took too long (6373 > 6353), lowering kernel.perf_event_max_sample_rate to 31000
[root@ibclient ~]#
[root@ibclient ~]# [ 292.252795] perf: interrupt took too long (8006 > 7966), lowering kernel.perf_event_max_sample_rate to 24000
[ 410.354443] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.432725] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.499530] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.564952] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.644493] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.698091] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.755175] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.796784] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.837690] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 410.878743] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.476823] multipath_clone_and_map: 32 callbacks suppressed
[ 463.506773] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.586752] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.656880] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.698919] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.723572] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.751820] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.781110] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.828267] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.856997] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 463.885998] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.572060] multipath_clone_and_map: 5201 callbacks suppressed
[ 468.602191] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.655761] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.697540] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.738610] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.779597] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.820705] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.862712] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.904662] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.945132] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 468.985128] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.643013] multipath_clone_and_map: 6569 callbacks suppressed
[ 473.673466] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.713929] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.754989] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.796069] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.849768] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.891640] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.933321] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 473.973791] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 474.016739] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 474.059858] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.732265] multipath_clone_and_map: 6310 callbacks suppressed
[ 478.762164] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.802327] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.844235] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.886113] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.927164] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 478.968141] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 479.008772] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 479.060225] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 479.102873] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 479.145341] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 483.834884] multipath_clone_and_map: 5703 callbacks suppressed
[ 483.865056] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 483.906792] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 483.959492] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.001768] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.044785] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.087289] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.129854] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.171317] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.212314] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.263931] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 484.982608] perf: interrupt took too long (10021 > 10007), lowering kernel.perf_event_max_sample_rate to 19000
[ 485.362787] watchdog: BUG: soft lockup - CPU#11 stuck for 21s! [ksoftirqd/11:78]
[ 485.400762] Modules linked in: dm_round_robin xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat rpcrdma nf_conntrack ib_isert iscsi_target_mod iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ib_iser ip6table_filter ip6_tables libiscsi iptable_filter scsi_transport_iscsi target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[ 485.809222] pcbc joydev ipmi_si aesni_intel sg hpwdt hpilo crypto_simd iTCO_wdt cryptd iTCO_vendor_support ipmi_devintf gpio_ich glue_helper shpchp lpc_ich ipmi_msghandler acpi_power_meter pcspkr i7core_edac pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod amdkfd amd_iommu_v2 radeon mlx5_core i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlxfw drm ptp hpsa pps_core scsi_transport_sas crc32c_intel bnx2 i2c_core serio_raw devlink dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_srpt]
[ 486.096352] CPU: 11 PID: 78 Comm: ksoftirqd/11 Tainted: G I 4.13.0-rc3+ #23
[ 486.142775] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 486.178816] task: ffff8a3fe453df40 task.stack: ffff9a6cc65a8000
[ 486.212674] RIP: 0010:__blk_recalc_rq_segments+0xb8/0x3d0
[ 486.242444] RSP: 0018:ffff9a6cc65aba60 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
[ 486.284086] RAX: ffff8a4a8e26e440 RBX: 0000000000001000 RCX: ffff8a4a8e26e440
[ 486.323445] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8a4a8e2699c8
[ 486.363614] RBP: ffff9a6cc65abae0 R08: 0000000000007000 R09: 0000000000000001
[ 486.404621] R10: 0000000000001000 R11: 0000000000000000 R12: 0000000000000000
[ 486.445127] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000007
[ 486.486120] FS: 0000000000000000(0000) GS:ffff8a4b66940000(0000) knlGS:0000000000000000
[ 486.532771] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 486.565772] CR2: 00007f32bfcee000 CR3: 0000000d67609000 CR4: 00000000000006e0
[ 486.606325] Call Trace:
[ 486.620269] ? mempool_free+0x2b/0x80
[ 486.641248] blk_recalc_rq_segments+0x28/0x40
[ 486.666754] blk_update_request+0x249/0x310
[ 486.690838] end_clone_bio+0x46/0x70 [dm_mod]
[ 486.715949] bio_endio+0xa1/0x120
[ 486.734408] blk_update_request+0xb7/0x310
[ 486.757313] scsi_end_request+0x38/0x1d0
[ 486.779375] scsi_io_completion+0x13c/0x630
[ 486.802928] scsi_finish_command+0xd9/0x120
[ 486.825875] scsi_softirq_done+0x12a/0x150
[ 486.849029] __blk_mq_complete_request+0x90/0x140
[ 486.875484] blk_mq_complete_request+0x16/0x20
[ 486.900631] scsi_mq_done+0x21/0x70
[ 486.920462] srp_recv_done+0x48f/0x610 [ib_srp]
[ 486.946212] __ib_process_cq+0x5c/0xb0 [ib_core]
[ 486.972576] ib_poll_handler+0x22/0x70 [ib_core]
[ 486.998638] irq_poll_softirq+0xaf/0x110
[ 487.021727] __do_softirq+0xc9/0x269
[ 487.042027] run_ksoftirqd+0x29/0x50
[ 487.062675] smpboot_thread_fn+0x110/0x160
[ 487.085597] kthread+0x109/0x140
[ 487.104113] ? sort_range+0x30/0x30
[ 487.124120] ? kthread_park+0x60/0x60
[ 487.144624] ret_from_fork+0x25/0x30
[ 487.164640] Code: c3 83 fa 09 0f 84 8e 01 00 00 48 8b 44 24 20 48 8b 4c 24 28 45 31 ff 45 31 c0 45 31 c9 0f b6 80 36 06 00 00 88 44 24 17 8b 59 28 <44> 8b 71 2c 44 8b 61 34 85 db 0f 84 92 01 00 00 45 89 f5 45 89
Message from syslogd@ibclient at Aug 8 04:12:23 ...
kernel:watchdog: BUG: soft lockup - CPU#11 stuck for 21s! [ksoftirqd/11:78]
Thanks,
Ming
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
@ 2017-08-08 23:10 ` Bart Van Assche
0 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2017-08-08 23:10 UTC (permalink / raw)
To: dm-devel, linux-scsi, linux-block, ming.lei; +Cc: loberman
On Tue, 2017-08-08 at 22:17 +0800, Ming Lei wrote:
> Laurence and I see a system lockup issue when running concurrent
> big buffered write(4M bytes) to IB SRP on v4.13-rc3.
> [ ... ]
> #cat hammer_write.sh
> #!/bin/bash
> while true; do
> dd if=/dev/zero of=/dev/mapper/$1 bs=4096k count=800
> done
Hello Laurence,
Is your goal perhaps to simulate a DDN workload? In that case I think you
need oflag=direct to the dd argument list such that the page cache writeback
code does not alter the size of the write requests. Anyway, this test should
not trigger a lockup. Can you check whether the patch below makes the soft
lockup complaints disappear (without changing the hammer_write.sh test
script)?
Thanks,
Bart.
----------------------------------------------------------------------------
[PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a
quiet time
Drivers like dm-mpath requeue requests if no paths are available and
if configured to do so. If the queue depth is sufficiently high and
the queue rerunning delay sufficiently short then .requeue_work can
be queued so often that other work items queued on the same work queue
do not get executed. Avoid that this happens by only rerunning the
queue after no blk_mq_delay_kick_requeue_list() calls have occurred
during @msecs milliseconds. Since the device mapper core is the only
user of blk_mq_delay_kick_requeue_list(), modify the implementation
of this function instead of creating a new function.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
---
block/blk-mq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 041f7b7fa0d6..8bfea36e92f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -679,8 +679,8 @@ EXPORT_SYMBOL(blk_mq_kick_requeue_list);
void blk_mq_delay_kick_requeue_list(struct request_queue *q,
unsigned long msecs)
{
- kblockd_schedule_delayed_work(&q->requeue_work,
- msecs_to_jiffies(msecs));
+ kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
+ msecs_to_jiffies(msecs));
}
EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-08 14:17 ` Ming Lei
@ 2017-08-09 0:11 ` Laurence Oberman
2017-08-09 2:28 ` Laurence Oberman
From: Laurence Oberman @ 2017-08-09 0:11 UTC (permalink / raw)
To: Ming Lei, dm-devel, linux-scsi, linux-block; +Cc: bart.vanassche
On Tue, 2017-08-08 at 22:17 +0800, Ming Lei wrote:
> Hi Guys,
>
> Laurence and I see a system lockup issue when running concurrent
> big buffered write(4M bytes) to IB SRP on v4.13-rc3.
>
> 1 how to reproduce
>
> 1) setup IB_SRR & multi path
>
> #./start_opensm.sh
> #./start_srp.sh
>
> # cat start_opensm.sh
> #!/bin/bash
> rmmod ib_srpt
> opensm -F opensm.1.conf &
> opensm -F opensm.2.conf &
>
> # cat start_srp.sh
> run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10 -t 7000
> -ance -i mlx5_0 -p 1 1>/root/srp1.log 2>&1 &
> run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10 -t 7000
> -ance -i mlx5_1 -p 1 1>/root/srp2.log 2>&1 &
>
>
> 2) run the following test script
>
> #./test-mp.sh
>
> #cat test-mp.sh
> #!/bin/sh
> DEVS="360014051186d24cc27b4d049ffff94e3
> 36001405b2b5c6c24c084b6fa4d55da2f 36001405b26ebe76dcb94a489f6f245f8"
>
> for i in $DEVS; do
> for j in `seq 0 15`; do
> ./hammer_write.sh $i 1>/dev/null 2>&1 &
> done
> done
>
> #cat hammer_write.sh
> #!/bin/bash
> while true; do
> dd if=/dev/zero of=/dev/mapper/$1 bs=4096k count=800
> done
>
>
> 2 lockup log
> [root@ibclient ~]# ./test-mp.sh
> [root@ibclient ~]# [ 169.095494] perf: interrupt took too long (2575
> > 2500), lowering kernel.perf_event_max_sample_rate to 77000
> [ 178.569124] perf: interrupt took too long (3249 > 3218), lowering
> kernel.perf_event_max_sample_rate to 61000
> [ 190.586624] perf: interrupt took too long (4066 > 4061), lowering
> kernel.perf_event_max_sample_rate to 49000
> [ 210.381836] perf: interrupt took too long (5083 > 5082), lowering
> kernel.perf_event_max_sample_rate to 39000
> [ 240.498659] perf: interrupt took too long (6373 > 6353), lowering
> kernel.perf_event_max_sample_rate to 31000
>
> [root@ibclient ~]#
> [root@ibclient ~]# [ 292.252795] perf: interrupt took too long (8006
> > 7966), lowering kernel.perf_event_max_sample_rate to 24000
> [ 410.354443] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.432725] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.499530] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.564952] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.644493] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.698091] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.755175] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.796784] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.837690] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 410.878743] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.476823] multipath_clone_and_map: 32 callbacks suppressed
> [ 463.506773] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.586752] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.656880] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.698919] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.723572] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.751820] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.781110] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.828267] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.856997] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 463.885998] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.572060] multipath_clone_and_map: 5201 callbacks suppressed
> [ 468.602191] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.655761] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.697540] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.738610] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.779597] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.820705] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.862712] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.904662] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.945132] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 468.985128] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.643013] multipath_clone_and_map: 6569 callbacks suppressed
> [ 473.673466] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.713929] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.754989] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.796069] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.849768] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.891640] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.933321] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 473.973791] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 474.016739] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 474.059858] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.732265] multipath_clone_and_map: 6310 callbacks suppressed
> [ 478.762164] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.802327] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.844235] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.886113] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.927164] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 478.968141] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 479.008772] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 479.060225] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 479.102873] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 479.145341] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 483.834884] multipath_clone_and_map: 5703 callbacks suppressed
> [ 483.865056] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 483.906792] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 483.959492] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.001768] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.044785] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.087289] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.129854] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.171317] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.212314] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.263931] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 484.982608] perf: interrupt took too long (10021 > 10007), lowering kernel.perf_event_max_sample_rate to 19000
> [ 485.362787] watchdog: BUG: soft lockup - CPU#11 stuck for 21s! [ksoftirqd/11:78]
> [ 485.400762] Modules linked in: dm_round_robin xt_CHECKSUM
> ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ipt_REJECT
> nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set
> nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat
> nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
> ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4
> nf_defrag_ipv4 nf_nat_ipv4 nf_nat rpcrdma nf_conntrack ib_isert
> iscsi_target_mod iptable_mangle iptable_security iptable_raw
> ebtable_filter ebtables ib_iser ip6table_filter ip6_tables libiscsi
> iptable_filter scsi_transport_iscsi target_core_mod ib_srp
> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
> ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> [ 485.809222] pcbc joydev ipmi_si aesni_intel sg hpwdt hpilo
> crypto_simd iTCO_wdt cryptd iTCO_vendor_support ipmi_devintf gpio_ich
> glue_helper shpchp lpc_ich ipmi_msghandler acpi_power_meter pcspkr
> i7core_edac pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
> dm_multipath ip_tables xfs libcrc32c sd_mod amdkfd amd_iommu_v2
> radeon mlx5_core i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
> sysimgblt fb_sys_fops ttm mlxfw drm ptp hpsa pps_core
> scsi_transport_sas crc32c_intel bnx2 i2c_core serio_raw devlink
> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_srpt]
> [ 486.096352] CPU: 11 PID: 78 Comm: ksoftirqd/11 Tainted: G I 4.13.0-rc3+ #23
> [ 486.142775] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
> [ 486.178816] task: ffff8a3fe453df40 task.stack: ffff9a6cc65a8000
> [ 486.212674] RIP: 0010:__blk_recalc_rq_segments+0xb8/0x3d0
> [ 486.242444] RSP: 0018:ffff9a6cc65aba60 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
> [ 486.284086] RAX: ffff8a4a8e26e440 RBX: 0000000000001000 RCX: ffff8a4a8e26e440
> [ 486.323445] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8a4a8e2699c8
> [ 486.363614] RBP: ffff9a6cc65abae0 R08: 0000000000007000 R09: 0000000000000001
> [ 486.404621] R10: 0000000000001000 R11: 0000000000000000 R12: 0000000000000000
> [ 486.445127] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000007
> [ 486.486120] FS: 0000000000000000(0000) GS:ffff8a4b66940000(0000) knlGS:0000000000000000
> [ 486.532771] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 486.565772] CR2: 00007f32bfcee000 CR3: 0000000d67609000 CR4: 00000000000006e0
> [ 486.606325] Call Trace:
> [ 486.620269] ? mempool_free+0x2b/0x80
> [ 486.641248] blk_recalc_rq_segments+0x28/0x40
> [ 486.666754] blk_update_request+0x249/0x310
> [ 486.690838] end_clone_bio+0x46/0x70 [dm_mod]
> [ 486.715949] bio_endio+0xa1/0x120
> [ 486.734408] blk_update_request+0xb7/0x310
> [ 486.757313] scsi_end_request+0x38/0x1d0
> [ 486.779375] scsi_io_completion+0x13c/0x630
> [ 486.802928] scsi_finish_command+0xd9/0x120
> [ 486.825875] scsi_softirq_done+0x12a/0x150
> [ 486.849029] __blk_mq_complete_request+0x90/0x140
> [ 486.875484] blk_mq_complete_request+0x16/0x20
> [ 486.900631] scsi_mq_done+0x21/0x70
> [ 486.920462] srp_recv_done+0x48f/0x610 [ib_srp]
> [ 486.946212] __ib_process_cq+0x5c/0xb0 [ib_core]
> [ 486.972576] ib_poll_handler+0x22/0x70 [ib_core]
> [ 486.998638] irq_poll_softirq+0xaf/0x110
> [ 487.021727] __do_softirq+0xc9/0x269
> [ 487.042027] run_ksoftirqd+0x29/0x50
> [ 487.062675] smpboot_thread_fn+0x110/0x160
> [ 487.085597] kthread+0x109/0x140
> [ 487.104113] ? sort_range+0x30/0x30
> [ 487.124120] ? kthread_park+0x60/0x60
> [ 487.144624] ret_from_fork+0x25/0x30
> [ 487.164640] Code: c3 83 fa 09 0f 84 8e 01 00 00 48 8b 44 24 20 48 8b 4c 24 28 45 31 ff 45 31 c0 45 31 c9 0f b6 80 36 06 00 00 88 44 24 17 8b 59 28 <44> 8b 71 2c 44 8b 61 34 85 db 0f 84 92 01 00 00 45 89 f5 45 89
>
> Message from syslogd@ibclient at Aug 8 04:12:23 ...
> kernel:watchdog: BUG: soft lockup - CPU#11 stuck for 21s! [ksoftirqd/11:78]
>
>
>
> Thanks,
> Ming
Hello Bart
I received your message, but my work email is not seeing it.
So I am replying to Ming's original email so that I can respond
properly to your most recent message (pasted below) and keep the
thread in the right flow.
I am purposely testing buffered I/O because, as you know, that was a
source of pain for us last year.
So yes, my intention is always to run large buffered and unbuffered
(direct) sequential I/O, as it's part of my regular tests, which have
been very useful in discovering many issues.
I also have smaller random and sequential I/O tests driven by fio that
were not seeing the issue.
I will go ahead and test your patch below. I am not surprised Ming was
able to make this happen on stock 4.13-rc3, but he had to run 16
parallel jobs.
It was happening on his kernel with only 3 parallel jobs, and had not
been happening for me with 3 parallel runs on stock upstream; perhaps
it just takes longer on 4.13-rc3, so 16 parallel runs bring it out.
Just one other small point: this is a "hard" lockup, not a soft lockup.
Unfortunately, on the 7.3 base I am unable to capture a full vmcore
when running the latest upstream kernel, so if I need to get a vmcore
I will have to use Fedora and the latest upstream.
I will do that if your patch is not successful, but having read it and
looked at the stack dumps I have, it makes sense.
back in a bit
Thanks
Laurence
----
Hello Laurence,
Is your goal perhaps to simulate a DDN workload? In that case I think
you need to add oflag=direct to the dd argument list so that the page
cache writeback code does not alter the size of the write requests.
Anyway, this test should not trigger a lockup. Can you check whether
the patch below makes the soft lockup complaints disappear (without
changing the hammer_write.sh test script)?
Thanks,
Bart.
----------------------------------------------------------------------------
[PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the queue
at a quiet time
Drivers like dm-mpath requeue requests if no paths are available and
if they are configured to do so. If the queue depth is sufficiently
high and the queue rerunning delay sufficiently short, .requeue_work
can be queued so often that other work items queued on the same
workqueue do not get executed. Avoid this by rerunning the queue only
after no blk_mq_delay_kick_requeue_list() calls have occurred during
@msecs milliseconds. Since the device mapper core is the only user of
blk_mq_delay_kick_requeue_list(), modify the implementation of this
function instead of creating a new function.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
---
 block/blk-mq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 041f7b7fa0d6..8bfea36e92f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -679,8 +679,8 @@ EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
				    unsigned long msecs)
 {
-	kblockd_schedule_delayed_work(&q->requeue_work,
-				      msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
+				    msecs_to_jiffies(msecs));
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
-----
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-09 0:11 ` Laurence Oberman
@ 2017-08-09 2:28 ` Laurence Oberman
2017-08-09 16:43 ` Laurence Oberman
0 siblings, 1 reply; 16+ messages in thread
From: Laurence Oberman @ 2017-08-09 2:28 UTC (permalink / raw)
To: Ming Lei, dm-devel, linux-scsi, linux-block; +Cc: Van Assche, Bart
On Tue, 2017-08-08 at 20:11 -0400, Laurence Oberman wrote:
> > On Tue, 2017-08-08 at 22:17 +0800, Ming Lei wrote:
> > [... full quote of the original report and of Bart's patch snipped ...]
Hello Bart
So the patch is working to avoid the clone-failure messages and the
requeue -11 messages of my original issue; however, both the hard
lockup and the soft lockup persist. It is a lot harder to hit the hard
lockup now with the patch, though.
The soft lockups, due to the spinning of ksoftirqd, are almost
expected given how hard we are driving this, and they are easy to
recreate.
I booted the original Ming kernel first and reproduced for a baseline,
and then patched it with your simple patch.
I was not paying enough attention to Ming's message: he was seeing a
ksoftirqd soft lockup on the stock upstream kernel, NOT the hard
lockup. I did not check his log message carefully, so I assumed he
meant a hard lockup.
I guess that's why you said "soft lockup".
I did get into a few soft lockups before I hit another hard lockup.
So I am now again not sure about the hard lockup on the stock and 7.4
kernels. I will have to retest and establish a baseline myself, as
Ming is away on vacation now.
The soft lockups still happen, but they recover unless
panic_on_soft_lockup is set, and I almost expect them given how the
softirqd is spinning here.
We see high spin rates here, and I am only running 7 parallel writes.
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   20 root      20   0       0      0      0 R 100.0  0.0   0:27.11 kworker/u6+
   24 root      20   0       0      0      0 R 100.0  0.0   0:21.56 ksoftirqd/2
   60 root      20   0       0      0      0 R 100.0  0.0   0:34.61 ksoftirqd/8
We see high sys time because of the spinning, so the soft lockup is
honestly not unexpected, and it recovers.
# <----CPU[HYPER]-----><-----------Memory-----------><----------Disks----------->
#Time     cpu sys inter ctxsw Free Buff Cach Inac Slab  Map KBRead Reads KBWrit Writes
21:32:11   52  52 11526 45981  65G  21G 541M  22G   7G 209M     45    11 709791    211
21:32:12   49  49 13315 26037  65G  21G 541M  22G   6G 209M      0     0 606356    195
21:32:13   47  47 13375 21456  65G  21G 541M  22G   6G 209M      0     0 591468    250
21:32:14   33  33 12635   978  65G  21G 541M  22G   5G 209M      0     0 560860    365
21:32:15   40  40 14304   734  67G  19G 541M  20G   5G 209M      0     0 429472     97
21:32:16   36  36 16060  6220  70G  17G 541M  17G   4G 209M      0     0 312852     54
21:32:17   31  31 16071  4962  71G  16G 540M  16G   4G 209M     24     6 253228     44
21:32:18   31  31 15855   868  70G  17G 540M  18G   6G 209M    104    26 237696     62
21:32:19   31  31 15983   661  70G  17G 541M  17G   7G 209M      0     0 162340     37
21:32:20   32  32 16245  1413  70G  17G 541M  18G   9G 209M     68    17 155648     19
21:32:21   44  44 14919 21711  68G  19G 540M  19G  10G 209M      0     0 303104     32
21:32:22   59  59 11661 69544  68G  19G 541M  20G  10G 209M      0     0 620919     84
21:32:23   59  59 12951  1058  67G  20G 540M  20G  10G 209M      0     0 596582    126
21:32:24   68  67 12991  2077  66G  21G 540M  21G  10G 209M      0     0 689504    168
21:32:25   78  78 13295  2927  66G  21G 541M  21G  10G 209M      0     0 667673    144
21:32:26   66  66 13498  7670  65G  21G 541M  22G  10G 209M      0     0 626874     98
21:32:27   55  55 12203  2663  65G  21G 541M  22G   9G 209M      0     0 652096    112
21:32:28   47  47 12567 27804  65G  21G 541M  22G   8G 209M      0     0 534789    137
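As a side note, the softirq class doing the spinning can be confirmed directly. A minimal sketch (assuming a Linux host with /proc/softirqs; not part of the original test scripts) that prints per-class softirq increments over one second, summed across CPUs:

```shell
#!/bin/bash
# Sample /proc/softirqs twice, one second apart, and print how much each
# softirq class advanced (summed over all CPUs). In a run like the one
# above, IRQ_POLL (irq_poll_softirq in the traces) should dominate.
snap() { awk 'NR > 1 { s = 0; for (i = 2; i <= NF; i++) s += $i; print $1, s }' /proc/softirqs; }
tmp=$(mktemp -d)
snap > "$tmp/a"
sleep 1
snap > "$tmp/b"
# Pair the two samples by class name and print the delta.
awk 'NR == FNR { base[$1] = $2; next } { printf "%-10s %d\n", $1, $2 - base[$1] }' "$tmp/a" "$tmp/b"
rm -rf "$tmp"
```

The same one-liner style works against any per-CPU counter file in procfs.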
We are driving this system pretty hard here, and the target server is
seeing queue fulls; I am used to those when I drive it this hard.
[ 568.707572] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.745714] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.783735] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.821583] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.858238] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.895975] ib_srpt srpt_queue_response: IB send queue full (needed 0)
[ 568.934253] ib_srpt srpt_write_pending: IB send queue full (needed 97)
[ 568.971040] ib_srpt srpt_write_pending: IB send queue full (needed 49)
So your patch makes it much harder to hit the hard LOCKUP, but, as
expected, it does not help the soft lockup.
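The behavioral difference the patch introduces can be modeled with a small deterministic sketch. This is only an illustration of the delayed-work semantics, not the kernel API: schedule-style arming is a no-op while the work is already pending, so the requeue work keeps firing mid-storm, whereas mod-style arming pushes the expiry forward on every kick, so the work only runs once the kicks quiesce.

```shell
#!/bin/bash
# Model: a "kick" arrives every tick for 10 ticks; the requeue delay is 3.
# schedule-style arms a timer only when none is pending, so it fires
# repeatedly during the storm; mod-style resets the pending expiry on
# every kick, so it cannot fire until the kicks stop.
delay=3
sched_expiry="" ; sched_runs=0 ; mod_expiry=""
for t in 0 1 2 3 4 5 6 7 8 9; do
  # fire the schedule-style timer if it has expired by this tick
  if [ -n "$sched_expiry" ] && [ "$t" -ge "$sched_expiry" ]; then
    sched_runs=$((sched_runs + 1))
    sched_expiry=""
  fi
  # a kick arrives on every tick
  [ -z "$sched_expiry" ] && sched_expiry=$((t + delay))  # no-op while pending
  mod_expiry=$((t + delay))                              # always reset
done
echo "schedule-style fired $sched_runs times during the storm"
echo "mod-style first fires at t=$mod_expiry, after the storm"
```

With a kick rate well above 1/delay, the mod-style timer never fires until the storm ends, which is the quiet-time behavior the patch description aims for.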
[ 329.305507] watchdog: BUG: soft lockup - CPU#12 stuck for 21s! [ksoftirqd/12:84]
[ 329.305508] Modules linked in: dm_round_robin mq_deadline
xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter
ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack
ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat rpcrdma nf_conntrack ib_isert
iptable_mangle iptable_security iscsi_target_mod iptable_raw
ebtable_filter ebtables ip6table_filter ib_iser ip6_tables libiscsi
iptable_filter scsi_transport_iscsi target_core_mod ib_srp
scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul
[ 329.305543] ghash_clmulni_intel pcbc joydev aesni_intel ipmi_si sg
hpilo ipmi_devintf hpwdt crypto_simd iTCO_wdt gpio_ich
iTCO_vendor_support cryptd shpchp acpi_power_meter ipmi_msghandler
i7core_edac glue_helper pcspkr lpc_ich pcc_cpufreq nfsd auth_rpcgss
nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod
amdkfd amd_iommu_v2 radeon mlx5_core i2c_algo_bit drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlxfw ptp drm
pps_core hpsa i2c_core devlink crc32c_intel serio_raw bnx2
scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: ib_srpt]
[ 329.305564] CPU: 12 PID: 84 Comm: ksoftirqd/12 Tainted:
G I 4.13.0-rc3.ming.bart+ #1
[ 329.305566] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 329.305567] task: ffff91a94c070040 task.stack: ffffaa8ec65d8000
[ 329.305573] RIP: 0010:__blk_recalc_rq_segments+0x1a7/0x3d0
[ 329.305574] RSP: 0018:ffffaa8ec65dba60 EFLAGS: 00000202 ORIG_RAX:
ffffffffffffff10
[ 329.305575] RAX: 0000000000000001 RBX: 0000000000001000 RCX:
ffff91b1149239c0
[ 329.305576] RDX: 0000000000001000 RSI: 0000000000000001 RDI:
0000000000001000
[ 329.305577] RBP: ffffaa8ec65dbae0 R08: 0000000000001000 R09:
0000000000000001
[ 329.305578] R10: 0000000b2a506000 R11: 0000000000000000 R12:
0000000000000000
[ 329.305579] R13: 0000000000000000 R14: 0000000000000000 R15:
000000000000002e
[ 329.305580] FS: 0000000000000000(0000) GS:ffff91b3b7b80000(0000)
knlGS:0000000000000000
[ 329.305581] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 329.305582] CR2: 00007ff4591f1860 CR3: 0000000985e09000 CR4:
00000000000006e0
[ 329.305583] Call Trace:
[ 329.305588] ? mempool_free+0x2b/0x80
[ 329.305590] blk_recalc_rq_segments+0x28/0x40
[ 329.305592] blk_update_request+0x249/0x310
[ 329.305603] end_clone_bio+0x46/0x70 [dm_mod]
[ 329.305607] bio_endio+0xa1/0x120
[ 329.305608] blk_update_request+0xb7/0x310
[ 329.305611] scsi_end_request+0x38/0x1d0
[ 329.305613] scsi_io_completion+0x13c/0x630
[ 329.305615] scsi_finish_command+0xd9/0x120
[ 329.305616] scsi_softirq_done+0x12a/0x150
[ 329.305618] __blk_mq_complete_request+0x90/0x140
[ 329.305624] blk_mq_complete_request+0x16/0x20
[ 329.305625] scsi_mq_done+0x21/0x70
[ 329.305628] srp_recv_done+0x48f/0x610 [ib_srp]
[ 329.305639] __ib_process_cq+0x5c/0xb0 [ib_core]
[ 329.305646] ib_poll_handler+0x22/0x70 [ib_core]
[ 329.305649] irq_poll_softirq+0xaf/0x110
[ 329.305652] __do_softirq+0xc9/0x269
[ 329.305655] run_ksoftirqd+0x29/0x50
[ 329.305657] smpboot_thread_fn+0x110/0x160
[ 329.305659] kthread+0x109/0x140
[ 329.305660] ? sort_range+0x30/0x30
[ 329.305661] ? kthread_park+0x60/0x60
[ 329.305663] ret_from_fork+0x25/0x30
[ 329.305664] Code: 8b 74 24 30 48 8b 7c 24 38 41 83 c7 01 41 89 d0 41
b9 01 00 00 00 48 89 74 24 40 48 89 7c 24 48 8b 41 14 0f b6 f0 25 fb 00
00 00 <83> f8 03 0f 84 a0 00 00 00 83 fe 05 0f 84 97 00 00 00 83 fe 09
[ 331.465885] CPU: 4 PID: 36 Comm: ksoftirqd/4 Tainted:
G I L 4.13.0-rc3.ming.bart+ #1
[ 331.516413] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 331.551403] task: ffff91a94c76df40 task.stack: ffffaa8ec6458000
[ 331.584080] RIP: 0010:__blk_recalc_rq_segments+0xdf/0x3d0
[ 331.614515] RSP: 0018:ffffaa8ec645ba60 EFLAGS: 00000286 ORIG_RAX:
ffffffffffffff10
[ 331.656098] RAX: 0000000000001000 RBX: 0000000000001000 RCX:
ffff91b29faecdc0
[ 331.696577] RDX: ffffe839e5d75300 RSI: ffff91b29faece48 RDI:
ffff91b2831f8348
[ 331.736607] RBP: ffffaa8ec645bae0 R08: 0000000000004000 R09:
0000000000000001
[ 331.776129] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[ 331.817085] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000021
[ 331.857671] FS: 0000000000000000(0000) GS:ffff91b3b7a80000(0000)
knlGS:0000000000000000
[ 331.904232] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 331.935249] CR2: 00007f404526e000 CR3: 0000000985e09000 CR4:
00000000000006e0
[ 331.976273] Call Trace:
[ 331.990216] ? mempool_free+0x2b/0x80
[ 332.010310] blk_recalc_rq_segments+0x28/0x40
[ 332.034810] blk_update_request+0x249/0x310
[ 332.058259] end_clone_bio+0x46/0x70 [dm_mod]
[ 332.082851] bio_endio+0xa1/0x120
[ 332.101145] blk_update_request+0xb7/0x310
[ 332.123965] scsi_end_request+0x38/0x1d0
[ 332.145943] scsi_io_completion+0x13c/0x630
[ 332.169357] scsi_finish_command+0xd9/0x120
[ 332.193505] scsi_softirq_done+0x12a/0x150
[ 332.217044] __blk_mq_complete_request+0x90/0x140
[ 332.244498] blk_mq_complete_request+0x16/0x20
[ 332.269971] scsi_mq_done+0x21/0x70
[ 332.289591] srp_recv_done+0x48f/0x610 [ib_srp]
[ 332.315609] __ib_process_cq+0x5c/0xb0 [ib_core]
[ 332.342144] ib_poll_handler+0x22/0x70 [ib_core]
[ 332.368724] irq_poll_softirq+0xaf/0x110
[ 332.392595] __do_softirq+0xc9/0x269
[ 332.413220] run_ksoftirqd+0x29/0x50
[ 332.433214] smpboot_thread_fn+0x110/0x160
[ 332.456254] kthread+0x109/0x140
[ 332.475263] ? sort_range+0x30/0x30
[ 332.494624] ? kthread_park+0x60/0x60
[ 332.515413] ret_from_fork+0x25/0x30
[ 332.535180] Code: 17 8b 59 28 44 8b 71 2c 44 8b 61 34 85 db 0f 84 92
01 00 00 45 89 f5 45 89 e2 4c 89 ee 48 c1 e6 04 48 03 71 78 8b 46 08 48
8b 16 <44> 29 e0 48 89 54 24 30 39 d8 0f 47 c3 44 03 56 0c 45 84 db 89
Message from syslogd@ibclient at Aug 8 22:06:38 ...
kernel:watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [ksoftirqd/4:36]
Message from syslogd@ibclient at Aug 8 22:06:38 ...
kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 21s! [ksoftirqd/12:84]
I am going to leave this running overnight and see if I wake up to
another hard lockup.
Thanks as always. Again, the patch has helped in two ways:
1. It avoids the clone failure and requeue -11 failure messages.
2. It seems to take a lot longer now to get into a hard lockup.
Regards
Laurence
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-09 2:28 ` Laurence Oberman
@ 2017-08-09 16:43 ` Laurence Oberman
2017-08-09 17:10 ` Bart Van Assche
0 siblings, 1 reply; 16+ messages in thread
From: Laurence Oberman @ 2017-08-09 16:43 UTC (permalink / raw)
To: Ming Lei, dm-devel, linux-scsi, linux-block; +Cc: Van Assche, Bart
On 08/08/2017 10:28 PM, Laurence Oberman wrote:
> On Tue, 2017-08-08 at 20:11 -0400, Laurence Oberman wrote:
>>> On Tue, 2017-08-08 at 22:17 +0800, Ming Lei wrote:
>>>>> Hi Guys,
>>>>>
>>>>> Laurence and I see a system lockup issue when running
>>> concurrent
>>>>> big buffered write(4M bytes) to IB SRP on v4.13-rc3.
>>>>>
>>>>> 1 how to reproduce
>>>>>
>>>>> 1) setup IB_SRP & multipath
>>>>>
>>>>> #./start_opensm.sh
>>>>> #./start_srp.sh
>>>>>
>>>>> # cat start_opensm.sh
>>>>> #!/bin/bash
>>>>> rmmod ib_srpt
>>>>> opensm -F opensm.1.conf &
>>>>> opensm -F opensm.2.conf &
>>>>>
>>>>> # cat start_srp.sh
>>>>> run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10
>>> -t 7000
>>>>> -ance -i mlx5_0 -p 1 1>/root/srp1.log 2>&1 &
>>>>> run_srp_daemon -f /etc/ddn/srp_daemon.conf -R 30 -T 10
>>> -t 7000
>>>>> -ance -i mlx5_1 -p 1 1>/root/srp2.log 2>&1 &
>>>>>
>>>>>
>>>>> 2) run the following test script
>>>>>
>>>>> #./test-mp.sh
>>>>>
>>>>> #cat test-mp.sh
>>>>> #!/bin/sh
>>>>> DEVS="360014051186d24cc27b4d049ffff94e3
>>>>> 36001405b2b5c6c24c084b6fa4d55da2f
>>>>> 36001405b26ebe76dcb94a489f6f245f8"
>>>>>
>>>>> for i in $DEVS; do
>>>>> for j in `seq 0 15`; do
>>>>> ./hammer_write.sh $i 1>/dev/null 2>&1 &
>>>>> done
>>>>> done
>>>>>
>>>>> #cat hammer_write.sh
>>>>> #!/bin/bash
>>>>> while true; do
>>>>> dd if=/dev/zero of=/dev/mapper/$1 bs=4096k
>>> count=800
>>>>> done
>>>>>
>>>>>
>>>>> 2 lockup log
>>>>> [root@ibclient ~]# ./test-mp.sh
>>>>> [root@ibclient ~]# [ 169.095494] perf: interrupt took too long
>>>>> (2575
>>>>>>> 2500), lowering kernel.perf_event_max_sample_rate to 77000
>>>
>>>>>
>>>>> [ 178.569124] perf: interrupt took too long (3249 > 3218),
>>>>> lowering
>>>>> kernel.perf_event_max_sample_rate to 61000
>>>>> [ 190.586624] perf: interrupt took too long (4066 > 4061),
>>>>> lowering
>>>>> kernel.perf_event_max_sample_rate to 49000
>>>>> [ 210.381836] perf: interrupt took too long (5083 > 5082),
>>>>> lowering
>>>>> kernel.perf_event_max_sample_rate to 39000
>>>>> [ 240.498659] perf: interrupt took too long (6373 > 6353),
>>>>> lowering
>>>>> kernel.perf_event_max_sample_rate to 31000
>>>>>
>>>>> [root@ibclient ~]#
>>>>> [root@ibclient ~]# [ 292.252795] perf: interrupt took too long
>>>>> (8006
>>>>>>> 7966), lowering kernel.perf_event_max_sample_rate to 24000
>>>
>>>>>
>>>>> [ 410.354443] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.432725] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.499530] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.564952] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.644493] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.698091] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.755175] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.796784] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.837690] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 410.878743] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.476823] multipath_clone_and_map: 32 callbacks suppressed
>>>>> [ 463.506773] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.586752] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.656880] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.698919] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.723572] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.751820] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.781110] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.828267] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.856997] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 463.885998] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.572060] multipath_clone_and_map: 5201 callbacks
>>> suppressed
>>>>> [ 468.602191] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.655761] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.697540] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.738610] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.779597] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.820705] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.862712] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.904662] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.945132] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 468.985128] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.643013] multipath_clone_and_map: 6569 callbacks
>>> suppressed
>>>>> [ 473.673466] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.713929] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.754989] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.796069] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.849768] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.891640] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.933321] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 473.973791] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 474.016739] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 474.059858] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.732265] multipath_clone_and_map: 6310 callbacks
>>> suppressed
>>>>> [ 478.762164] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.802327] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.844235] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.886113] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.927164] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 478.968141] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 479.008772] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 479.060225] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 479.102873] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 479.145341] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 483.834884] multipath_clone_and_map: 5703 callbacks
>>> suppressed
>>>>> [ 483.865056] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 483.906792] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 483.959492] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.001768] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.044785] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.087289] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.129854] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.171317] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.212314] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.263931] device-mapper: multipath: blk_get_request()
>>> returned
>>>>> -11 - requeuing
>>>>> [ 484.982608] perf: interrupt took too long (10021 > 10007),
>>>>> lowering kernel.perf_event_max_sample_rate to 19000
>>>>> [ 485.362787] watchdog: BUG: soft lockup - CPU#11 stuck for
>>> 21s!
>>>>> [ksoftirqd/11:78]
>>>>> [ 485.400762] Modules linked in: dm_round_robin xt_CHECKSUM
>>>>> ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter
>>> ipt_REJECT
>>>>> nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set
>>>>> nfnetlink ebtable_nat ebtable_broute bridge stp llc
>>> ip6table_nat
>>>>> nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
>>>>> ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4
>>>>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat rpcrdma nf_conntrack ib_isert
>>>>> iscsi_target_mod iptable_mangle iptable_security iptable_raw
>>>>> ebtable_filter ebtables ib_iser ip6table_filter ip6_tables
>>> libiscsi
>>>>> iptable_filter scsi_transport_iscsi target_core_mod ib_srp
>>>>> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
>>>>> rdma_cm
>>>>> ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel
>>> kvm
>>>>> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
>>>>> [ 485.809222] pcbc joydev ipmi_si aesni_intel sg hpwdt hpilo
>>>>> crypto_simd iTCO_wdt cryptd iTCO_vendor_support ipmi_devintf
>>>>> gpio_ich
>>>>> glue_helper shpchp lpc_ich ipmi_msghandler acpi_power_meter
>>> pcspkr
>>>>> i7core_edac pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace
>>> sunrpc
>>>>> dm_multipath ip_tables xfs libcrc32c sd_mod amdkfd amd_iommu_v2
>>>>> radeon mlx5_core i2c_algo_bit drm_kms_helper syscopyarea
>>>>> sysfillrect
>>>>> sysimgblt fb_sys_fops ttm mlxfw drm ptp hpsa pps_core
>>>>> scsi_transport_sas crc32c_intel bnx2 i2c_core serio_raw devlink
>>>>> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_srpt]
>>>>> [ 486.096352] CPU: 11 PID: 78 Comm: ksoftirqd/11 Tainted:
>>>>> G I 4.13.0-rc3+ #23
>>>>> [ 486.142775] Hardware name: HP ProLiant DL380 G7, BIOS P67
>>>>> 08/16/2015
>>>>> [ 486.178816] task: ffff8a3fe453df40 task.stack:
>>> ffff9a6cc65a8000
>>>>> [ 486.212674] RIP: 0010:__blk_recalc_rq_segments+0xb8/0x3d0
>>>>> [ 486.242444] RSP: 0018:ffff9a6cc65aba60 EFLAGS: 00000282
>>>>> ORIG_RAX:
>>>>> ffffffffffffff10
>>>>> [ 486.284086] RAX: ffff8a4a8e26e440 RBX: 0000000000001000 RCX:
>>>>> ffff8a4a8e26e440
>>>>> [ 486.323445] RDX: 0000000000000000 RSI: 0000000000001000 RDI:
>>>>> ffff8a4a8e2699c8
>>>>> [ 486.363614] RBP: ffff9a6cc65abae0 R08: 0000000000007000 R09:
>>>>> 0000000000000001
>>>>> [ 486.404621] R10: 0000000000001000 R11: 0000000000000000 R12:
>>>>> 0000000000000000
>>>>> [ 486.445127] R13: 0000000000000000 R14: 0000000000000001 R15:
>>>>> 0000000000000007
>>>>> [ 486.486120] FS: 0000000000000000(0000)
>>>>> GS:ffff8a4b66940000(0000)
>>>>> knlGS:0000000000000000
>>>>> [ 486.532771] CS: 0010 DS: 0000 ES: 0000 CR0:
>>> 0000000080050033
>>>>> [ 486.565772] CR2: 00007f32bfcee000 CR3: 0000000d67609000 CR4:
>>>>> 00000000000006e0
>>>>> [ 486.606325] Call Trace:
>>>>> [ 486.620269] ? mempool_free+0x2b/0x80
>>>>> [ 486.641248] blk_recalc_rq_segments+0x28/0x40
>>>>> [ 486.666754] blk_update_request+0x249/0x310
>>>>> [ 486.690838] end_clone_bio+0x46/0x70 [dm_mod]
>>>>> [ 486.715949] bio_endio+0xa1/0x120
>>>>> [ 486.734408] blk_update_request+0xb7/0x310
>>>>> [ 486.757313] scsi_end_request+0x38/0x1d0
>>>>> [ 486.779375] scsi_io_completion+0x13c/0x630
>>>>> [ 486.802928] scsi_finish_command+0xd9/0x120
>>>>> [ 486.825875] scsi_softirq_done+0x12a/0x150
>>>>> [ 486.849029] __blk_mq_complete_request+0x90/0x140
>>>>> [ 486.875484] blk_mq_complete_request+0x16/0x20
>>>>> [ 486.900631] scsi_mq_done+0x21/0x70
>>>>> [ 486.920462] srp_recv_done+0x48f/0x610 [ib_srp]
>>>>> [ 486.946212] __ib_process_cq+0x5c/0xb0 [ib_core]
>>>>> [ 486.972576] ib_poll_handler+0x22/0x70 [ib_core]
>>>>> [ 486.998638] irq_poll_softirq+0xaf/0x110
>>>>> [ 487.021727] __do_softirq+0xc9/0x269
>>>>> [ 487.042027] run_ksoftirqd+0x29/0x50
>>>>> [ 487.062675] smpboot_thread_fn+0x110/0x160
>>>>> [ 487.085597] kthread+0x109/0x140
>>>>> [ 487.104113] ? sort_range+0x30/0x30
>>>>> [ 487.124120] ? kthread_park+0x60/0x60
>>>>> [ 487.144624] ret_from_fork+0x25/0x30
>>>>> [ 487.164640] Code: c3 83 fa 09 0f 84 8e 01 00 00 48 8b 44 24
>>> 20
>>>>> 48
>>>>> 8b 4c 24 28 45 31 ff 45 31 c0 45 31 c9 0f b6 80 36 06 00 00 88
>>> 44
>>>>> 24
>>>>> 17 8b 59 28 <44> 8b 71 2c 44 8b 61 34 85 db 0f 84 92 01 00 00
>>> 45 89
>>>>> f5 45 89
>>>>>
>>>>> Message from syslogd@ibclient at Aug 8 04:12:23 ...
>>>>> kernel:watchdog: BUG: soft lockup - CPU#11 stuck for 21s!
>>>>> [ksoftirqd/11:78]
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ming
>>
>>>
>>> Hello Bart
>>>
>>> I received your message but my work email is not seeing it.
>>> So I am replying to the original email by Ming so I can properly
>>> respond to your most recent message pasted below and keep the
>> thread
>>> in
>>> the right flow.
>>>
>>> I am purposefully testing buffered I/O because as you know that was
>> a
>>> source of pain for us last year.
>>> So yes my intention always is to run large buffered and un-buffered
>>> (direct) sequential I/O as its part of my regular tests that have
>>> been
>>> very useful in discovering many issues.
>>> I also have smaller random and sequential I/O tests driven by fio
>>> that
>>> were not seeing the issue.
>>>
>>> I will go ahead and test your patch below. I am not surprised Ming
>>> was
>>> able to make this happen on the stock 4.13-RC3 but he had to run 16
>>> parallel jobs.
>>>
>>> It was happening on his kernel with only 3 parallel jobs and had
>> not
>>> been happening for me with 3 parallel runs on stock upstream but
>>> perhaps just takes longer on 4.13-RC3 so 16 parallel runs brings it
>>> out.
>>>
>>> Just one other small point, this is a "hard", not soft lockup.
>>>
>>> Unfortunately on the 7.3 base I am unable to capture a full vmcore
>>> when
>>> running the latest upstream kernel so if I need to get a vmcore I
>>> will
>>> have to use Fedora and the latest upstream.
>>>
>>> I will do that if your patch is not successful but reading it and
>>> looking at the stack dumps I have, it makes sense.
>>>
>>> back in a bit
>>>
>>> Thanks
>>> Laurence
>>>
>>> ----
>>> Hello Laurence,
>>>
>>> Is your goal perhaps to simulate a DDN workload? In that case I
>> think
>>> you
>>> need oflag=direct to the dd argument list such that the page cache
>>> writeback
>>> code does not alter the size of the write requests. Anyway, this
>> test
>>> should
>>> not trigger a lockup. Can you check whether the patch below makes
>> the
>>> soft
>>> lockup complaints disappear (without changing the hammer_write.sh
>>> test
>>> script)?
>>>
>>> Thanks,
>>>
>>> Bart.
>>> -----------------------------------------------------------------
>> ----
>>> -------
>>> [PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the
>> queue
>>> at
>>> a
>>> quiet time
>>>
>>> Drivers like dm-mpath requeue requests if no paths are available
>> and
>>> if configured to do so. If the queue depth is sufficiently high and
>>> the queue rerunning delay sufficiently short then .requeue_work can
>>> be queued so often that other work items queued on the same work
>>> queue
>>> do not get executed. Avoid that this happens by only rerunning the
>>> queue after no blk_mq_delay_kick_requeue_list() calls have occurred
>>> during @msecs milliseconds. Since the device mapper core is the
>> only
>>> user of blk_mq_delay_kick_requeue_list(), modify the implementation
>>> of this function instead of creating a new function.
>>>
>>> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
>>> ---
>>> block/blk-mq.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 041f7b7fa0d6..8bfea36e92f9 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -679,8 +679,8 @@ EXPORT_SYMBOL(blk_mq_kick_requeue_list);
>>> void blk_mq_delay_kick_requeue_list(struct request_queue *q,
>>> unsigned long msecs)
>>> {
>>> - kblockd_schedule_delayed_work(&q->requeue_work,
>>> - msecs_to_jiffies(msecs));
>>> + kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
>>> + msecs_to_jiffies(msecs));
>>> }
>>> EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>>> -----
>
> Hello Bart
>
> So the patch is working to avoid the clone failures messages and
> requeue -11 messages of my original issue however both the hard lockup
> and soft lockup persists. It's is a lot harder to get the hard lockup
> now though with the patch.
>
> The soft LOCKUPS due to the spins on the ksoftirqd are almost expected
> because of how hard we are driving this and are easy to recreate.
>
> I did boot the original Ming kernel and reproduced first for baseline.
> I then patched it with your simple patch.
>
> I was not paying enough attention to Ming's message but he was seeing a
> ksoftirq soft LOCKUP on the stock upstream kernel NOT the hard LOCKUP.
> I did not check his log message carefully so assumed he meant hard
> LOCKUP.
>
> I guess that's why you said "soft LOCKUP"
>
> I did get into a few soft lockups before I hit another hard LOCKUP.
>
> So I am now again not sure about the hard LOCKUP on the stock and 7.4
> kernel. I will have to retest and baseline myself as Ming is away now
> on vacation.
>
> The soft lockups still happen but they recover unless
> panic_on_soft_lockup is set, and I almost expect them with how hard
> softirqd is spinning here.
>
> We see high spin rates here and I am only running 7 parallel writes.
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
> COMMAND
> 20 root 20 0 0 0 0 R 100.0 0.0 0:27.11
> kworker/u6+
> 24 root 20 0 0 0 0 R 100.0 0.0 0:21.56
> ksoftirqd/2
> 60 root 20 0 0 0 0 R 100.0 0.0 0:34.61
> ksoftirqd/8
>
> We see high SYS time because of the spinning so the soft LOCKUP is not
> unexpected to be honest and it recovers.
>
> # <----CPU[HYPER]-----><-----------Memory-----------><---------
> -Disks----------->
> #Time cpu sys inter ctxsw Free Buff Cach Inac Slab Map
> KBRead Reads KBWrit Writes
> 21:32:11 52 52 11526 45981 65G 21G 541M 22G 7G
> 209M 45 11 709791 211
> 21:32:12 49 49 13315 26037 65G 21G 541M 22G 6G
> 209M 0 0 606356 195
> 21:32:13 47 47 13375 21456 65G 21G 541M 22G 6G
> 209M 0 0 591468 250
> 21:32:14 33 33 12635 978 65G 21G 541M 22G 5G
> 209M 0 0 560860 365
> 21:32:15 40 40 14304 734 67G 19G 541M 20G 5G
> 209M 0 0 429472 97
> 21:32:16 36 36 16060 6220 70G 17G 541M 17G 4G
> 209M 0 0 312852 54
> 21:32:17 31 31 16071 4962 71G 16G 540M 16G 4G
> 209M 24 6 253228 44
> 21:32:18 31 31 15855 868 70G 17G 540M 18G 6G
> 209M 104 26 237696 62
> 21:32:19 31 31 15983 661 70G 17G 541M 17G 7G
> 209M 0 0 162340 37
> 21:32:20 32 32 16245 1413 70G 17G 541M 18G 9G
> 209M 68 17 155648 19
> 21:32:21 44 44 14919 21711 68G 19G 540M 19G 10G
> 209M 0 0 303104 32
> 21:32:22 59 59 11661 69544 68G 19G 541M 20G 10G
> 209M 0 0 620919 84
> 21:32:23 59 59 12951 1058 67G 20G 540M 20G 10G
> 209M 0 0 596582 126
> 21:32:24 68 67 12991 2077 66G 21G 540M 21G 10G
> 209M 0 0 689504 168
> 21:32:25 78 78 13295 2927 66G 21G 541M 21G 10G
> 209M 0 0 667673 144
> 21:32:26 66 66 13498 7670 65G 21G 541M 22G 10G
> 209M 0 0 626874 98
> 21:32:27 55 55 12203 2663 65G 21G 541M 22G 9G
> 209M 0 0 652096 112
> 21:32:28 47 47 12567 27804 65G 21G 541M 22G 8G
> 209M 0 0 534789 137
>
> We are driving this system pretty hard here and the target server is
> seeing queue fulls and I am used to these if I drive it too hard.
>
> [ 568.707572] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.745714] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.783735] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.821583] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.858238] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.895975] ib_srpt srpt_queue_response: IB send queue full (needed
> 0)
> [ 568.934253] ib_srpt srpt_write_pending: IB send queue full (needed
> 97)
> [ 568.971040] ib_srpt srpt_write_pending: IB send queue full (needed
> 49)
>
> So your patch helps to avoid easily getting the hard LOCKUP, does not
> help the Soft lockup (as expected).
>
> [ 329.305507] watchdog: BUG: soft lockup - CPU#12 stuck for 21s!
> [ksoftirqd/12:84]
> [ 329.305508] Modules linked in: dm_round_robin mq_deadline
> xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter
> ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack
> ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat
> nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
> ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4
> nf_defrag_ipv4 nf_nat_ipv4 nf_nat rpcrdma nf_conntrack ib_isert
> iptable_mangle iptable_security iscsi_target_mod iptable_raw
> ebtable_filter ebtables ip6table_filter ib_iser ip6_tables libiscsi
> iptable_filter scsi_transport_iscsi target_core_mod ib_srp
> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
> ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm
> irqbypass crct10dif_pclmul crc32_pclmul
> [ 329.305543] ghash_clmulni_intel pcbc joydev aesni_intel ipmi_si sg
> hpilo ipmi_devintf hpwdt crypto_simd iTCO_wdt gpio_ich
> iTCO_vendor_support cryptd shpchp acpi_power_meter ipmi_msghandler
> i7core_edac glue_helper pcspkr lpc_ich pcc_cpufreq nfsd auth_rpcgss
> nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod
> amdkfd amd_iommu_v2 radeon mlx5_core i2c_algo_bit drm_kms_helper
> syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlxfw ptp drm
> pps_core hpsa i2c_core devlink crc32c_intel serio_raw bnx2
> scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [last
> unloaded: ib_srpt]
> [ 329.305564] CPU: 12 PID: 84 Comm: ksoftirqd/12 Tainted:
> G I 4.13.0-rc3.ming.bart+ #1
> [ 329.305566] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
> [ 329.305567] task: ffff91a94c070040 task.stack: ffffaa8ec65d8000
> [ 329.305573] RIP: 0010:__blk_recalc_rq_segments+0x1a7/0x3d0
> [ 329.305574] RSP: 0018:ffffaa8ec65dba60 EFLAGS: 00000202 ORIG_RAX:
> ffffffffffffff10
> [ 329.305575] RAX: 0000000000000001 RBX: 0000000000001000 RCX:
> ffff91b1149239c0
> [ 329.305576] RDX: 0000000000001000 RSI: 0000000000000001 RDI:
> 0000000000001000
> [ 329.305577] RBP: ffffaa8ec65dbae0 R08: 0000000000001000 R09:
> 0000000000000001
> [ 329.305578] R10: 0000000b2a506000 R11: 0000000000000000 R12:
> 0000000000000000
> [ 329.305579] R13: 0000000000000000 R14: 0000000000000000 R15:
> 000000000000002e
> [ 329.305580] FS: 0000000000000000(0000) GS:ffff91b3b7b80000(0000)
> knlGS:0000000000000000
> [ 329.305581] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 329.305582] CR2: 00007ff4591f1860 CR3: 0000000985e09000 CR4:
> 00000000000006e0
> [ 329.305583] Call Trace:
> [ 329.305588] ? mempool_free+0x2b/0x80
> [ 329.305590] blk_recalc_rq_segments+0x28/0x40
> [ 329.305592] blk_update_request+0x249/0x310
> [ 329.305603] end_clone_bio+0x46/0x70 [dm_mod]
> [ 329.305607] bio_endio+0xa1/0x120
> [ 329.305608] blk_update_request+0xb7/0x310
> [ 329.305611] scsi_end_request+0x38/0x1d0
> [ 329.305613] scsi_io_completion+0x13c/0x630
> [ 329.305615] scsi_finish_command+0xd9/0x120
> [ 329.305616] scsi_softirq_done+0x12a/0x150
> [ 329.305618] __blk_mq_complete_request+0x90/0x140
> [ 329.305624] blk_mq_complete_request+0x16/0x20
> [ 329.305625] scsi_mq_done+0x21/0x70
> [ 329.305628] srp_recv_done+0x48f/0x610 [ib_srp]
> [ 329.305639] __ib_process_cq+0x5c/0xb0 [ib_core]
> [ 329.305646] ib_poll_handler+0x22/0x70 [ib_core]
> [ 329.305649] irq_poll_softirq+0xaf/0x110
> [ 329.305652] __do_softirq+0xc9/0x269
> [ 329.305655] run_ksoftirqd+0x29/0x50
> [ 329.305657] smpboot_thread_fn+0x110/0x160
> [ 329.305659] kthread+0x109/0x140
> [ 329.305660] ? sort_range+0x30/0x30
> [ 329.305661] ? kthread_park+0x60/0x60
> [ 329.305663] ret_from_fork+0x25/0x30
> [ 329.305664] Code: 8b 74 24 30 48 8b 7c 24 38 41 83 c7 01 41 89 d0 41
> b9 01 00 00 00 48 89 74 24 40 48 89 7c 24 48 8b 41 14 0f b6 f0 25 fb 00
> 00 00 <83> f8 03 0f 84 a0 00 00 00 83 fe 05 0f 84 97 00 00 00 83 fe 09
> [ 331.465885] CPU: 4 PID: 36 Comm: ksoftirqd/4 Tainted:
> G I L 4.13.0-rc3.ming.bart+ #1
> [ 331.516413] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
> [ 331.551403] task: ffff91a94c76df40 task.stack: ffffaa8ec6458000
> [ 331.584080] RIP: 0010:__blk_recalc_rq_segments+0xdf/0x3d0
> [ 331.614515] RSP: 0018:ffffaa8ec645ba60 EFLAGS: 00000286 ORIG_RAX:
> ffffffffffffff10
> [ 331.656098] RAX: 0000000000001000 RBX: 0000000000001000 RCX:
> ffff91b29faecdc0
> [ 331.696577] RDX: ffffe839e5d75300 RSI: ffff91b29faece48 RDI:
> ffff91b2831f8348
> [ 331.736607] RBP: ffffaa8ec645bae0 R08: 0000000000004000 R09:
> 0000000000000001
> [ 331.776129] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [ 331.817085] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000021
> [ 331.857671] FS: 0000000000000000(0000) GS:ffff91b3b7a80000(0000)
> knlGS:0000000000000000
> [ 331.904232] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 331.935249] CR2: 00007f404526e000 CR3: 0000000985e09000 CR4:
> 00000000000006e0
> [ 331.976273] Call Trace:
> [ 331.990216] ? mempool_free+0x2b/0x80
> [ 332.010310] blk_recalc_rq_segments+0x28/0x40
> [ 332.034810] blk_update_request+0x249/0x310
> [ 332.058259] end_clone_bio+0x46/0x70 [dm_mod]
> [ 332.082851] bio_endio+0xa1/0x120
> [ 332.101145] blk_update_request+0xb7/0x310
> [ 332.123965] scsi_end_request+0x38/0x1d0
> [ 332.145943] scsi_io_completion+0x13c/0x630
> [ 332.169357] scsi_finish_command+0xd9/0x120
> [ 332.193505] scsi_softirq_done+0x12a/0x150
> [ 332.217044] __blk_mq_complete_request+0x90/0x140
> [ 332.244498] blk_mq_complete_request+0x16/0x20
> [ 332.269971] scsi_mq_done+0x21/0x70
> [ 332.289591] srp_recv_done+0x48f/0x610 [ib_srp]
> [ 332.315609] __ib_process_cq+0x5c/0xb0 [ib_core]
> [ 332.342144] ib_poll_handler+0x22/0x70 [ib_core]
> [ 332.368724] irq_poll_softirq+0xaf/0x110
> [ 332.392595] __do_softirq+0xc9/0x269
> [ 332.413220] run_ksoftirqd+0x29/0x50
> [ 332.433214] smpboot_thread_fn+0x110/0x160
> [ 332.456254] kthread+0x109/0x140
> [ 332.475263] ? sort_range+0x30/0x30
> [ 332.494624] ? kthread_park+0x60/0x60
> [ 332.515413] ret_from_fork+0x25/0x30
> [ 332.535180] Code: 17 8b 59 28 44 8b 71 2c 44 8b 61 34 85 db 0f 84 92
> 01 00 00 45 89 f5 45 89 e2 4c 89 ee 48 c1 e6 04 48 03 71 78 8b 46 08 48
> 8b 16 <44> 29 e0 48 89 54 24 30 39 d8 0f 47 c3 44 03 56 0c 45 84 db 89
>
> Message from syslogd@ibclient at Aug 8 22:06:38 ...
> kernel:watchdog: BUG: soft lockup - CPU#4 stuck for 23s!
> [ksoftirqd/4:36]
>
> Message from syslogd@ibclient at Aug 8 22:06:38 ...
> kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 21s!
> [ksoftirqd/12:84]
>
> I am going to leave this running overnight and see if I wake up to
> another hard lockup.
>
> Thanks as always. Again, the patch has helped in two ways:
> 1. It avoids the clone failure and requeue -11 failure messages.
> 2. It seems to take a lot longer now to get into a hard lockup.
>
> Regards
> Laurence
>
Hello Bart
Your latest patch on stock upstream without Ming's latest patches is
behaving for me.
As already mentioned, the requeue -11 and clone failure messages are
gone and I am not actually seeing any soft lockups or hard lockups.
When Ming gets back I will work with him on his patch set and the lockups.
Running 10 parallel writes, which easily trips into soft lockups on
Ming's kernel (even with your patch), has been stable here on 4.13-rc3
with your patch.
I will leave it running for a while now but the patch is good.
If it survives 4 hours I will add a Tested-by to your latest patch.
Thanks
Laurence
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
@ 2017-08-09 17:10 ` Bart Van Assche
0 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2017-08-09 17:10 UTC (permalink / raw)
To: dm-devel, linux-scsi, linux-block, loberman, ming.lei; +Cc: Bart Van Assche
On Wed, 2017-08-09 at 12:43 -0400, Laurence Oberman wrote:
> Your latest patch on stock upstream without Ming's latest patches is
> behaving for me.
>
> As already mentioned, the requeue -11 and clone failure messages are
> gone and I am not actually seeing any soft lockups or hard lockups.
>
> When Ming gets back I will work with him on his patch set and the lockups.
>
> Running 10 parallel writes which easily trips into soft lockups on
> Ming's kernel (even with your patch) has been stable here on 4.13-RC3
> with your patch.
>
> I will leave it running for a while now but the patch is good.
>
> If it survives 4 hours I will add a Tested-by to your latest patch.
Hello Laurence,
I'm working on an additional patch that should reduce unnecessary requeuing
even further. I will let you know when it's ready.
Additionally, please trim e-mails when replying such that e-mails do not get
too long.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-09 17:10 ` Bart Van Assche
@ 2017-08-23 11:35 ` Ming Lei
2017-08-23 12:46 ` Ming Lei
2017-08-23 15:12 ` Bart Van Assche
-1 siblings, 2 replies; 16+ messages in thread
From: Ming Lei @ 2017-08-23 11:35 UTC (permalink / raw)
To: Bart Van Assche; +Cc: dm-devel, linux-scsi, linux-block, loberman
On Wed, Aug 09, 2017 at 05:10:01PM +0000, Bart Van Assche wrote:
> On Wed, 2017-08-09 at 12:43 -0400, Laurence Oberman wrote:
> > Your latest patch on stock upstream without Ming's latest patches is
> > behaving for me.
> >
> > As already mentioned, the requeue -11 and clone failure messages are
> > gone and I am not actually seeing any soft lockups or hard lockups.
> >
> > When Ming gets back I will work with him on his patch set and the lockups.
> >
> > Running 10 parallel writes which easily trips into soft lockups on
> > Ming's kernel (even with your patch) has been stable here on 4.13-RC3
> > with your patch.
> >
> > I will leave it running for a while now but the patch is good.
> >
> > If it survives 4 hours I will add a Tested-by to your latest patch.
>
> Hello Laurence,
>
> I'm working on an additional patch that should reduce unnecessary requeuing
> even further. I will let you know when it's ready.
>
> Additionally, please trim e-mails when replying such that e-mails do not get
> too long.
A soft lockup can still be observed easily with patch d4acf3650c7c
("block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet
time") applied, but no hard lockup.
With the patchset 'blk-mq-sched: improve SCSI-MQ performance' applied, a
hard lockup can be observed after some failure log:
[ 269.277653] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 269.321244] device-mapper: multipath: blk_get_request() returned -11 - requeuing
...
[ 273.421688] scsi host2: SRP abort called
[ 273.444577] scsi host2: Sending SRP abort for tag 0x6007e
[ 273.673871] scsi host2: Null scmnd for RSP w/tag 0x0000000006007e received on ch 6 / QP 0x30
...
[ 274.372110] device-mapper: multipath: blk_get_request() returned -11 - requeuing
[ 278.658671] scsi host2: SRP abort called
[ 278.690630] scsi host2: SRP abort called
[ 278.717634] scsi host2: SRP abort called
[ 278.745629] scsi host2: SRP abort called
[ 279.083227] multipath_clone_and_map: 1092 callbacks suppressed
....
[ 296.210503] scsi host2: SRP reset_device called
....
[ 303.784287] NMI watchdog: Watchdog detected hard LOCKUP on cpu 10
The tricky thing is that both the hard lockup and the soft lockup share
the same stack trace.
Another question: I don't understand why the request is allocated with
GFP_ATOMIC in multipath_clone_and_map(); it doesn't look necessary.
--
Ming
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-23 11:35 ` Ming Lei
@ 2017-08-23 12:46 ` Ming Lei
2017-08-23 15:12 ` Bart Van Assche
1 sibling, 0 replies; 16+ messages in thread
From: Ming Lei @ 2017-08-23 12:46 UTC (permalink / raw)
To: Bart Van Assche; +Cc: dm-devel, linux-scsi, linux-block, loberman
On Wed, Aug 23, 2017 at 07:35:26PM +0800, Ming Lei wrote:
> On Wed, Aug 09, 2017 at 05:10:01PM +0000, Bart Van Assche wrote:
> > On Wed, 2017-08-09 at 12:43 -0400, Laurence Oberman wrote:
> > > Your latest patch on stock upstream without Ming's latest patches is
> > > behaving for me.
> > >
> > > As already mentioned, the requeue -11 and clone failure messages are
> > > gone and I am not actually seeing any soft lockups or hard lockups.
> > >
> > > When Ming gets back I will work with him on his patch set and the lockups.
> > >
> > > Running 10 parallel writes which easily trips into soft lockups on
> > > Ming's kernel (even with your patch) has been stable here on 4.13-RC3
> > > with your patch.
> > >
> > > I will leave it running for a while now but the patch is good.
> > >
> > > If it survives 4 hours I will add a Tested-by to your latest patch.
> >
> > Hello Laurence,
> >
> > I'm working on an additional patch that should reduce unnecessary requeuing
> > even further. I will let you know when it's ready.
> >
> > Additionally, please trim e-mails when replying such that e-mails do not get
> > too long.
>
> soft lockup still can be observed easily with patch d4acf3650c7c(
> block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time),
> but no hard lockup.
>
> With the patchset of 'blk-mq-sched: improve SCSI-MQ performance', hard
> lockup can be observed following some failure log:
>
> [ 269.277653] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 269.321244] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> ...
> [ 273.421688] scsi host2: SRP abort called
> [ 273.444577] scsi host2: Sending SRP abort for tag 0x6007e
> [ 273.673871] scsi host2: Null scmnd for RSP w/tag 0x0000000006007e received on ch 6 / QP 0x30
> ...
> [ 274.372110] device-mapper: multipath: blk_get_request() returned -11 - requeuing
> [ 278.658671] scsi host2: SRP abort called
> [ 278.690630] scsi host2: SRP abort called
> [ 278.717634] scsi host2: SRP abort called
> [ 278.745629] scsi host2: SRP abort called
> [ 279.083227] multipath_clone_and_map: 1092 callbacks suppressed
> ....
> [ 296.210503] scsi host2: SRP reset_device called
> ....
> [ 303.784287] NMI watchdog: Watchdog detected hard LOCKUP on cpu 10
>
> The trick thing is that both hard lockup and soft lockup share
> one same stack trace.
Actually I can reproduce the hard lockup on the latest Linus tree
(v4.13-rc6+) too, just by making the test more aggressive:
1) run hammer_write.sh with 32 concurrent instances
2) write 8M each time
So it isn't the 'blk-mq-sched: improve SCSI-MQ performance' patchset
that causes the lockup.
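For reference, the aggressive reproducer can be sketched as a variant of
the hammer_write.sh script posted earlier in this thread. The device name,
job count, and block size below are illustrative assumptions, and the
function only prints the dd commands it would launch; actually running
them against a real multipath device is destructive.

```shell
#!/bin/bash
# Sketch of the aggressive reproducer: 32 concurrent writers, 8M writes.
# plan_hammer only prints the dd loops it would start (a dry run); to
# actually reproduce, run each printed command inside a backgrounded
# 'while true' loop, as hammer_write.sh does.
plan_hammer() {
    local dev=$1 jobs=${2:-32} bs=${3:-8192k}
    local j
    for j in $(seq 1 "$jobs"); do
        echo "dd if=/dev/zero of=/dev/mapper/$dev bs=$bs count=800"
    done
}

plan_hammer 360014051186d24cc27b4d049ffff94e3 32 8192k
```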
There must be an issue somewhere else, and one thing I saw is that the
completion handling of dm's request isn't very efficient:
- when the low-level driver's request is completed, blk_update_request()
  is called from scsi_end_request()
- inside blk_update_request(), end_clone_bio() is called for each bio
- in end_clone_bio(), blk_update_request() is called to do a partial
  update of the dm request. That means if one request has N bios, the
  same dm request needs to do the expensive partial update N times. In
  theory, the partial update might be avoided; I guess the reason it
  isn't is that the dm rq can't be retrieved via the low-level driver's rq.
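A back-of-the-envelope sketch of why the per-bio partial update hurts (my
own model, not kernel code): if each of the N per-bio completions triggers
a recalculation that walks the bios still attached to the dm request, the
total number of bio scans grows quadratically in N.

```shell
#!/bin/bash
# bios_scanned N: total bio scans if the i-th completion rescans the
# remaining bios, i.e. N + (N-1) + ... + 1 = N*(N+1)/2.
# Illustrative cost model only, not kernel code.
bios_scanned() {
    local n=$1
    echo $(( n * (n + 1) / 2 ))
}

# A 4M request built from 4k-page bios carries up to 1024 bios:
bios_scanned 1024    # -> 524800 scans for a single dm request
```

Under this model an 8M write roughly quadruples the scan count, which
would match the observation that bigger buffered writes make the lockup
easier to hit.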
[1] hard lockup log
[ 434.068240] NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
[ 434.068243] Modules linked in: dm_round_robin xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack rpcrdma iptable_mangle ib_isert iscsi_target_mod iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ib_iser libiscsi ip6_tables scsi_transport_iscsi iptable_filter target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[ 434.068279] pcbc aesni_intel iTCO_wdt crypto_simd gpio_ich ipmi_si iTCO_vendor_support cryptd joydev glue_helper ipmi_devintf hpwdt acpi_power_meter hpilo sg ipmi_msghandler lpc_ich i7core_edac pcspkr shpchp pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod amdkfd amd_iommu_v2 radeon mlx5_core i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlxfw drm ptp hpsa pps_core crc32c_intel serio_raw bnx2 i2c_core devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_srpt]
[ 434.068314] CPU: 2 PID: 1039 Comm: dd Tainted: G I 4.13.0-rc6.linus+ #28
[ 434.068314] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 434.068316] task: ffff9458f4498040 task.stack: ffffbdb48fd10000
[ 434.068323] RIP: 0010:__blk_recalc_rq_segments+0xd9/0x3d0
[ 434.068324] RSP: 0018:ffff945b37a43d08 EFLAGS: 00000086
[ 434.068325] RAX: ffff9458978a16c0 RBX: 0000000000001000 RCX: ffff9458978a16c0
[ 434.068326] RDX: 0000000000000000 RSI: ffff9458978a1748 RDI: ffff9458978a69c8
[ 434.068327] RBP: ffff945b37a43d88 R08: 0000000000014000 R09: 0000000000000001
[ 434.068328] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 434.068329] R13: 0000000000000000 R14: 0000000000000000 R15: 000000000000001d
[ 434.068330] FS: 00007f27b13ea740(0000) GS:ffff945b37a40000(0000) knlGS:0000000000000000
[ 434.068331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 434.068332] CR2: 00007f27aa600000 CR3: 0000000b1e2af000 CR4: 00000000000006e0
[ 434.068333] Call Trace:
[ 434.068335] <IRQ>
[ 434.068340] ? mempool_free+0x2b/0x80
[ 434.068343] blk_recalc_rq_segments+0x28/0x40
[ 434.068345] blk_update_request+0x249/0x310
[ 434.068363] end_clone_bio+0x46/0x70 [dm_mod]
[ 434.068367] bio_endio+0xa1/0x120
[ 434.068370] blk_update_request+0xb7/0x310
[ 434.068374] scsi_end_request+0x38/0x1d0
[ 434.068375] scsi_io_completion+0x13c/0x630
[ 434.068378] scsi_finish_command+0xd9/0x120
[ 434.068381] scsi_softirq_done+0x12a/0x150
[ 434.068383] __blk_mq_complete_request_remote+0x13/0x20
[ 434.068387] flush_smp_call_function_queue+0x52/0x110
[ 434.068388] generic_smp_call_function_single_interrupt+0x13/0x30
[ 434.068391] smp_call_function_single_interrupt+0x27/0x40
[ 434.068393] call_function_single_interrupt+0x93/0xa0
[ 434.068396] RIP: 0010:copy_user_generic_string+0x2c/0x40
[ 434.068397] RSP: 0018:ffffbdb48fd13c78 EFLAGS: 00010246 ORIG_RAX: ffffffffffffff04
[ 434.068398] RAX: 00007f27aa1f3000 RBX: 0000000000001000 RCX: 0000000000000040
[ 434.068399] RDX: 0000000000000000 RSI: 00007f27aa1f2e00 RDI: ffff945a2f7b9e00
[ 434.068400] RBP: ffffbdb48fd13c80 R08: ffff945a2fcf20b0 R09: 0000000000001000
[ 434.068400] R10: 00000000fffff000 R11: 0000000000000000 R12: 0000000000001000
[ 434.068401] R13: ffff945a2f7ba000 R14: ffffbdb48fd13e20 R15: ffff945be4817950
[ 434.068402] </IRQ>
[ 434.068405] ? I_BDEV+0x20/0x20
[ 434.068408] ? copyin+0x26/0x30
[ 434.068410] iov_iter_copy_from_user_atomic+0xbd/0x2d0
[ 434.068412] generic_perform_write+0xdd/0x1b0
[ 434.068415] __generic_file_write_iter+0x19b/0x1e0
[ 434.068416] blkdev_write_iter+0x8a/0x100
[ 434.068420] __vfs_write+0xf3/0x170
[ 434.068423] vfs_write+0xb2/0x1b0
[ 434.068427] ? syscall_trace_enter+0x1d0/0x2b0
[ 434.068429] SyS_write+0x55/0xc0
[ 434.068431] do_syscall_64+0x67/0x150
[ 434.068434] entry_SYSCALL64_slow_path+0x25/0x25
[ 434.068435] RIP: 0033:0x7f27b0f0dc90
[ 434.068436] RSP: 002b:00007ffe24d1d178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 434.068438] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f27b0f0dc90
[ 434.068438] RDX: 0000000000800000 RSI: 00007f27aa0fa000 RDI: 0000000000000001
[ 434.068439] RBP: 0000000000800000 R08: ffffffffffffffff R09: 0000000000802003
[ 434.068440] R10: 00007ffe24d1cf00 R11: 0000000000000246 R12: 0000000000800000
[ 434.068441] R13: 00007f27aa0fa000 R14: 00007f27aa8fa000 R15: 0000000000000000
[ 434.068442] Code: 06 00 00 88 44 24 17 8b 59 28 44 8b 71 2c 44 8b 61 34 85 db 0f 84 92 01 00 00 45 89 f5 45 89 e2 4c 89 ee 48 c1 e6 04 48 03 71 78 <8b> 46 08 48 8b 16 44 29 e0 48 89 54 24 30 39 d8 0f 47 c3 44 03
[ 434.068461] Kernel panic - not syncing: Hard LOCKUP
[ 434.068463] CPU: 2 PID: 1039 Comm: dd Tainted: G I 4.13.0-rc6.linus+ #28
[ 434.068463] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 434.068463] Call Trace:
[ 434.068464] <NMI>
[ 434.068467] dump_stack+0x63/0x87
[ 434.068470] panic+0xeb/0x245
[ 434.068472] nmi_panic+0x3f/0x40
[ 434.068477] watchdog_overflow_callback+0x106/0x130
[ 434.068480] __perf_event_overflow+0x54/0xe0
[ 434.068482] perf_event_overflow+0x14/0x20
[ 434.068486] intel_pmu_handle_irq+0x203/0x4b0
[ 434.068490] perf_event_nmi_handler+0x2d/0x50
[ 434.068492] nmi_handle+0x61/0x110
[ 434.068494] default_do_nmi+0xf6/0x120
[ 434.068495] do_nmi+0x113/0x190
[ 434.068497] end_repeat_nmi+0x1a/0x1e
[ 434.068499] RIP: 0010:__blk_recalc_rq_segments+0xd9/0x3d0
[ 434.068499] RSP: 0018:ffff945b37a43d08 EFLAGS: 00000086
[ 434.068500] RAX: ffff9458978a16c0 RBX: 0000000000001000 RCX: ffff9458978a16c0
[ 434.068501] RDX: 0000000000000000 RSI: ffff9458978a1748 RDI: ffff9458978a69c8
[ 434.068502] RBP: ffff945b37a43d88 R08: 0000000000014000 R09: 0000000000000001
[ 434.068502] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 434.068503] R13: 0000000000000000 R14: 0000000000000000 R15: 000000000000001d
[ 434.068505] ? __blk_recalc_rq_segments+0xd9/0x3d0
[ 434.068508] ? __blk_recalc_rq_segments+0xd9/0x3d0
[ 434.068510] </NMI>
[ 434.068510] <IRQ>
[ 434.068512] ? mempool_free+0x2b/0x80
[ 434.068514] blk_recalc_rq_segments+0x28/0x40
[ 434.068516] blk_update_request+0x249/0x310
[ 434.068521] end_clone_bio+0x46/0x70 [dm_mod]
[ 434.068522] bio_endio+0xa1/0x120
[ 434.068523] blk_update_request+0xb7/0x310
[ 434.068525] scsi_end_request+0x38/0x1d0
[ 434.068527] scsi_io_completion+0x13c/0x630
[ 434.068528] scsi_finish_command+0xd9/0x120
[ 434.068531] scsi_softirq_done+0x12a/0x150
[ 434.068534] __blk_mq_complete_request_remote+0x13/0x20
[ 434.068536] flush_smp_call_function_queue+0x52/0x110
[ 434.068537] generic_smp_call_function_single_interrupt+0x13/0x30
[ 434.068539] smp_call_function_single_interrupt+0x27/0x40
[ 434.068542] call_function_single_interrupt+0x93/0xa0
[ 434.068543] RIP: 0010:copy_user_generic_string+0x2c/0x40
[ 434.068544] RSP: 0018:ffffbdb48fd13c78 EFLAGS: 00010246 ORIG_RAX: ffffffffffffff04
[ 434.068545] RAX: 00007f27aa1f3000 RBX: 0000000000001000 RCX: 0000000000000040
[ 434.068546] RDX: 0000000000000000 RSI: 00007f27aa1f2e00 RDI: ffff945a2f7b9e00
[ 434.068546] RBP: ffffbdb48fd13c80 R08: ffff945a2fcf20b0 R09: 0000000000001000
[ 434.068547] R10: 00000000fffff000 R11: 0000000000000000 R12: 0000000000001000
[ 434.068548] R13: ffff945a2f7ba000 R14: ffffbdb48fd13e20 R15: ffff945be4817950
[ 434.068548] </IRQ>
[ 434.068550] ? I_BDEV+0x20/0x20
[ 434.068551] ? copyin+0x26/0x30
[ 434.068553] iov_iter_copy_from_user_atomic+0xbd/0x2d0
[ 434.068554] generic_perform_write+0xdd/0x1b0
[ 434.068557] __generic_file_write_iter+0x19b/0x1e0
[ 434.068560] blkdev_write_iter+0x8a/0x100
[ 434.068562] __vfs_write+0xf3/0x170
[ 434.068565] vfs_write+0xb2/0x1b0
[ 434.068566] ? syscall_trace_enter+0x1d0/0x2b0
[ 434.068568] SyS_write+0x55/0xc0
[ 434.068569] do_syscall_64+0x67/0x150
[ 434.068572] entry_SYSCALL64_slow_path+0x25/0x25
[ 434.068573] RIP: 0033:0x7f27b0f0dc90
[ 434.068574] RSP: 002b:00007ffe24d1d178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 434.068575] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f27b0f0dc90
[ 434.068576] RDX: 0000000000800000 RSI: 00007f27aa0fa000 RDI: 0000000000000001
[ 434.068577] RBP: 0000000000800000 R08: ffffffffffffffff R09: 0000000000802003
[ 434.068578] R10: 00007ffe24d1cf00 R11: 0000000000000246 R12: 0000000000800000
[ 434.068579] R13: 00007f27aa0fa000 R14: 00007f27aa8fa000 R15: 0000000000000000
[ 434.100416] Kernel Offset: 0x13800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 438.636598] ---[ end Kernel panic - not syncing: Hard LOCKUP
--
Ming
^ permalink raw reply [flat|nested] 16+ messages in thread
[ 434.068548] R13: ffff945a2f7ba000 R14: ffffbdb48fd13e20 R15: ffff945be4817950
[ 434.068548] </IRQ>
[ 434.068550] ? I_BDEV+0x20/0x20
[ 434.068551] ? copyin+0x26/0x30
[ 434.068553] iov_iter_copy_from_user_atomic+0xbd/0x2d0
[ 434.068554] generic_perform_write+0xdd/0x1b0
[ 434.068557] __generic_file_write_iter+0x19b/0x1e0
[ 434.068560] blkdev_write_iter+0x8a/0x100
[ 434.068562] __vfs_write+0xf3/0x170
[ 434.068565] vfs_write+0xb2/0x1b0
[ 434.068566] ? syscall_trace_enter+0x1d0/0x2b0
[ 434.068568] SyS_write+0x55/0xc0
[ 434.068569] do_syscall_64+0x67/0x150
[ 434.068572] entry_SYSCALL64_slow_path+0x25/0x25
[ 434.068573] RIP: 0033:0x7f27b0f0dc90
[ 434.068574] RSP: 002b:00007ffe24d1d178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 434.068575] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f27b0f0dc90
[ 434.068576] RDX: 0000000000800000 RSI: 00007f27aa0fa000 RDI: 0000000000000001
[ 434.068577] RBP: 0000000000800000 R08: ffffffffffffffff R09: 0000000000802003
[ 434.068578] R10: 00007ffe24d1cf00 R11: 0000000000000246 R12: 0000000000800000
[ 434.068579] R13: 00007f27aa0fa000 R14: 00007f27aa8fa000 R15: 0000000000000000
[ 434.100416] Kernel Offset: 0x13800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 438.636598] ---[ end Kernel panic - not syncing: Hard LOCKUP
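[Editorial note: the RIP in both the NMI and IRQ frames above is __blk_recalc_rq_segments(), which walks every bio_vec of a request to recount its physical segments. For the 4 MB writes in this test that is on the order of a thousand 4 KiB pages per request, executed here from interrupt context on every clone-bio completion, which is plausibly what keeps CPU 2 busy long enough to trip the hardlockup watchdog. A rough model of that walk (illustrative Python, not the kernel implementation; names are made up):]

```python
REQ_BYTES = 4 << 20   # one 4 MB request, as issued by the dd test
PAGE = 4096

def recalc_segments(page_addrs):
    """Count physical segments by merging physically contiguous pages.

    This only models the O(pages) walk that __blk_recalc_rq_segments()
    performs per request; the real kernel code also honors queue limits.
    """
    segs = 0
    prev_end = None
    for addr in page_addrs:
        if prev_end is not None and addr == prev_end:
            prev_end = addr + PAGE   # merge into the current segment
        else:
            segs += 1                # physically discontiguous: new segment
            prev_end = addr + PAGE
    return segs

npages = REQ_BYTES // PAGE           # 1024 pages per request
contiguous = [i * PAGE for i in range(npages)]
scattered = [i * 2 * PAGE for i in range(npages)]

# Fully contiguous pages collapse to 1 segment, fully scattered pages
# stay at 1024, but either way the walk touches all 1024 entries.
print(recalc_segments(contiguous), recalc_segments(scattered))  # → 1 1024
```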
--
Ming
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-23 11:35 ` Ming Lei
@ 2017-08-23 15:12 ` Bart Van Assche
2017-08-23 15:12 ` Bart Van Assche
1 sibling, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2017-08-23 15:12 UTC (permalink / raw)
To: ming.lei; +Cc: dm-devel, linux-scsi, linux-block, loberman
On Wed, 2017-08-23 at 19:35 +0800, Ming Lei wrote:
> soft lockup still can be observed easily with patch d4acf3650c7c(
> block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time),
> but no hard lockup.
Hello Ming,
I never claimed that commit d4acf3650c7c was a full solution to avoid these
soft lockups. I assume these results mean that patch series "[PATCH 0/7]
Device mapper and dm-mpath fixes" was not applied to the kernel used in your
tests? A colleague of yours has reported that that patch series is sufficient
to avoid these soft lockups. See also
https://www.redhat.com/archives/dm-devel/2017-August/msg00122.html.
Bart.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
@ 2017-08-23 15:12 ` Bart Van Assche
0 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2017-08-23 15:12 UTC (permalink / raw)
To: ming.lei; +Cc: dm-devel, linux-scsi, linux-block, loberman
On Wed, 2017-08-23 at 19:35 +0800, Ming Lei wrote:
> soft lockup still can be observed easily with patch d4acf3650c7c(
> block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time),
> but no hard lockup.
Hello Ming,
I never claimed that commit d4acf3650c7c was a full solution to avoid these
soft lockups. I assume these results mean that patch series "[PATCH 0/7]
Device mapper and dm-mpath fixes" was not applied to the kernel used in your
tests? A colleague of yours has reported that that patch series is sufficient
to avoid these soft lockups. See also
https://www.redhat.com/archives/dm-devel/2017-August/msg00122.html.
Bart.
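[Editorial note: the commit under discussion matters because dm-mpath re-kicks the requeue list as soon as blk_get_request() fails with -EAGAIN (-11, as seen in the "requeuing" messages above), so under sustained load a queue with no free requests gets re-run in a tight loop. Commit d4acf3650c7c makes the kick happen after a delay instead. A minimal sketch of the difference (illustrative Python, not kernel code; the functions and timings are hypothetical):]

```python
import time

def dispatch(alloc_ok):
    """Pretend to dispatch one request; fails like blk_get_request() -> -EAGAIN."""
    return 0 if alloc_ok() else -11  # -11 == -EAGAIN

def run_queue(alloc_ok, delay_s, budget_s=0.05):
    """Re-kick the queue until dispatch succeeds or the time budget runs out.

    delay_s=0 models an immediate blk_mq_kick_requeue_list()-style re-kick;
    delay_s>0 models the blk_mq_delay_kick_requeue_list() back-off.
    Returns how many times the queue was kicked.
    """
    deadline = time.monotonic() + budget_s
    kicks = 0
    while time.monotonic() < deadline:
        kicks += 1
        if dispatch(alloc_ok) == 0:
            return kicks
        if delay_s:
            time.sleep(delay_s)      # quiet time before the next kick
    return kicks

never = lambda: False                # tags never free up within the budget
busy = run_queue(never, delay_s=0)       # spins as fast as the CPU allows
calm = run_queue(never, delay_s=0.005)   # backs off between kicks
print("immediate kicks:", busy, "delayed kicks:", calm)
```

With no free requests, the immediate variant burns the whole budget re-kicking the queue, while the delayed variant runs only a handful of times, which is consistent with the thread's observation that the delay removes the hard lockup even though soft lockups remain.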
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-23 15:12 ` Bart Van Assche
(?)
@ 2017-08-23 16:03 ` Ming Lei
-1 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2017-08-23 16:03 UTC (permalink / raw)
To: Bart Van Assche; +Cc: ming.lei, dm-devel, linux-scsi, linux-block, loberman
On Wed, Aug 23, 2017 at 11:12 PM, Bart Van Assche
<Bart.VanAssche@wdc.com> wrote:
> On Wed, 2017-08-23 at 19:35 +0800, Ming Lei wrote:
>> soft lockup still can be observed easily with patch d4acf3650c7c(
>> block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time),
>> but no hard lockup.
>
> Hello Ming,
>
> I never claimed that commit d4acf3650c7c was a full solution to avoid these
> soft lockups. I assume these results mean that patch series "[PATCH 0/7]
> Device mapper and dm-mpath fixes" was not applied to the kernel used in your
> tests? A colleague of yours has reported that that patch series is sufficient
OK, thanks for your clarification.
I ran v4.13-rc6 with block for-next, so these patches were not applied in my
kernel.
--
Ming Lei
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath
2017-08-23 15:12 ` Bart Van Assche
(?)
(?)
@ 2017-08-23 23:51 ` Mike Snitzer
-1 siblings, 0 replies; 16+ messages in thread
From: Mike Snitzer @ 2017-08-23 23:51 UTC (permalink / raw)
To: Bart Van Assche; +Cc: ming.lei, linux-block, dm-devel, loberman, linux-scsi
On Wed, Aug 23 2017 at 11:12am -0400,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> On Wed, 2017-08-23 at 19:35 +0800, Ming Lei wrote:
> > soft lockup still can be observed easily with patch d4acf3650c7c(
> > block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time),
> > but no hard lockup.
>
> Hello Ming,
>
> I never claimed that commit d4acf3650c7c was a full solution to avoid these
> soft lockups. I assume these results mean that patch series "[PATCH 0/7]
> Device mapper and dm-mpath fixes" was not applied to the kernel used in your
> tests? A colleague of yours has reported that that patch series is sufficient
> to avoid these soft lockups. See also
> https://www.redhat.com/archives/dm-devel/2017-August/msg00122.html.
FYI, I'll try to review/stage that patchset for 4.14 over the next few
days.
(I'm not yet back to working full-time but will try to handle my
upstream responsibilities)
Mike
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-08-23 23:51 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-08 14:17 [v4.13-rc BUG] system lockup when running big buffered write(4M) to IB SRP via mpath Ming Lei
2017-08-08 14:17 ` Ming Lei
2017-08-08 23:10 ` Bart Van Assche
2017-08-08 23:10 ` Bart Van Assche
2017-08-09 0:11 ` Laurence Oberman
2017-08-09 2:28 ` Laurence Oberman
2017-08-09 16:43 ` Laurence Oberman
2017-08-09 17:10 ` Bart Van Assche
2017-08-09 17:10 ` Bart Van Assche
2017-08-23 11:35 ` Ming Lei
2017-08-23 12:46 ` Ming Lei
2017-08-23 12:46 ` Ming Lei
2017-08-23 15:12 ` Bart Van Assche
2017-08-23 15:12 ` Bart Van Assche
2017-08-23 16:03 ` Ming Lei
2017-08-23 23:51 ` Mike Snitzer