linux-rdma.vger.kernel.org archive mirror
* [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
@ 2021-12-01  8:55 Yi Zhang
  2021-12-02  9:08 ` Bernard Metzler
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Yi Zhang @ 2021-12-01  8:55 UTC (permalink / raw)
  To: RDMA mailing list

Hello,
I found that blktests srp/011 hangs on the latest 5.16.0-rc3+, and in dmesg I
can see the kernel repeatedly printing "ib_srpt srpt_disconnect_ch_sync:still
waiting ...".
Please help check it, and let me know if you need any more info or testing. Thanks.

[root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011 -------------> hang
srp/011 (Block I/O on top of multipath concurrently with logout and
login) [passed]
    runtime  52.731s  ...  61.351s

dmesg:
[  101.614632] run blktests srp/011 at 2021-12-01 03:43:24
[  102.493106] alua: device handler registered
[  102.519148] emc: device handler registered
[  102.540806] rdac: device handler registered
[  102.608792] null_blk: module loaded
[  103.031132] SoftiWARP attached
[  103.067829] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.073399] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.079038] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.093348] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.111956] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.130870] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.141017] enP2p1s0v4 speed is unknown, defaulting to 1000
[  103.146585] enP2p1s0v4 speed is unknown, defaulting to 1000
[  103.152374] enP2p1s0v4 speed is unknown, defaulting to 1000
[  103.166691] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.172623] enP2p1s0v4 speed is unknown, defaulting to 1000
[  103.191728] enP2p1s0v1 speed is unknown, defaulting to 1000
[  103.197641] enP2p1s0v4 speed is unknown, defaulting to 1000
[  103.380984] scsi_debug:sdebug_add_store: dif_storep 524288 bytes @
0000000068763489
[  103.389445] scsi_debug:sdebug_driver_probe: scsi_debug: trim
poll_queues to 0. poll_q/nr_hw = (0/1)
[  103.398577] scsi_debug:sdebug_driver_probe: host protection DIF3 DIX3
[  103.405018] scsi host4: scsi_debug: version 0190 [20200710]
[  103.405018]   dev_size_mb=32, opts=0x0, submit_queues=1, statistics=0
[  103.417664] scsi 4:0:0:0: Direct-Access     Linux    scsi_debug
  0190 PQ: 0 ANSI: 7
[  103.426302] sd 4:0:0:0: Power-on or device reset occurred
[  103.426368] sd 4:0:0:0: Attached scsi generic sg1 type 0
[  103.431800] sd 4:0:0:0: [sdb] Enabling DIF Type 3 protection
[  103.442794] sd 4:0:0:0: [sdb] 65536 512-byte logical blocks: (33.6
MB/32.0 MiB)
[  103.450168] sd 4:0:0:0: [sdb] Write Protect is off
[  103.454958] sd 4:0:0:0: [sdb] Mode Sense: 73 00 10 08
[  103.455020] sd 4:0:0:0: [sdb] Write cache: enabled, read cache:
enabled, supports DPO and FUA
[  103.463665] sd 4:0:0:0: [sdb] Optimal transfer size 524288 bytes
[  103.567989] sd 4:0:0:0: [sdb] Enabling DIX T10-DIF-TYPE3-CRC protection
[  103.574602] sd 4:0:0:0: [sdb] DIF application tag size 6
[  103.757781] sd 4:0:0:0: [sdb] Attached SCSI disk
[  104.620435] enP2p1s0v1 speed is unknown, defaulting to 1000
[  104.805722] enP2p1s0v4 speed is unknown, defaulting to 1000
[  105.168234] Rounding down aligned max_sectors from 4294967295 to 4294967288
[  105.313416] ib_srpt:srpt_add_one: ib_srpt device = 0000000043289393
[  105.313438] ib_srpt:srpt_use_srq: ib_srpt
srpt_use_srq(enP2p1s0v0_siw): use_srq = 0; ret
--snip--
[  172.857740] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-63
[  172.857885] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-66
[  172.858032] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-68
[  172.858185] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-70
[  172.858344] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-72
[  172.858501] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-74
[  172.858666] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-76
[  172.858822] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-78
[  172.858976] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-80
[  172.859120] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-82
[  172.859278] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-84
[  172.859426] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-86
[  172.859564] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-88
[  172.859706] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-90
[  172.859851] ib_srpt:srpt_release_channel_work: ib_srpt
2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-92
[  173.439406] ib_srpt:srpt_disconnect_ch_sync: ib_srpt ch
2620:0052:0000:13f0:a236:9fff:fe79:eb22-62 state 4
[  178.456609] ib_srpt
srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
state 4): still waiting ...
[  183.496506] ib_srpt
srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
state 4): still waiting ...
[  188.536450] ib_srpt
srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
state 4): still waiting ...
[  193.576351] ib_srpt
srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
state 4): still waiting ...
[  198.616280] ib_srpt
srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
state 4): still waiting ...


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
  2021-12-01  8:55 [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..." Yi Zhang
@ 2021-12-02  9:08 ` Bernard Metzler
  2021-12-02 18:42 ` Bart Van Assche
  2021-12-02 19:00 ` Bernard Metzler
  2 siblings, 0 replies; 6+ messages in thread
From: Bernard Metzler @ 2021-12-02  9:08 UTC (permalink / raw)
  To: Yi Zhang; +Cc: RDMA mailing list

-----"Yi Zhang" <yi.zhang@redhat.com> wrote: -----

>To: "RDMA mailing list" <linux-rdma@vger.kernel.org>
>From: "Yi Zhang" <yi.zhang@redhat.com>
>Date: 12/01/2021 09:55AM
>Subject: [EXTERNAL] [bug report] blktests srp/011 hang at "ib_srpt
>srpt_disconnect_ch_sync:still waiting ..."
>
>Hello
>I found blktest srp/011 hang on latest 5.16.0-rc3+, and from dmesg I
>can see kernel repeat printing "ib_srpt srpt_disconnect_ch_sync:still
>waiting ...".
>Pls help check it and let me know if you need any info/testing for
>it, thanks.
>

Does this bug happen only when using siw, or also with rxe?

I'll try to reproduce it.


Thanks,
Bernard.

>[root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011
>-------------> hang
>srp/011 (Block I/O on top of multipath concurrently with logout and
>login) [passed]
>    runtime  52.731s  ...  61.351s
>
>dmesg:
>[  101.614632] run blktests srp/011 at 2021-12-01 03:43:24
>[  102.493106] alua: device handler registered
>[  102.519148] emc: device handler registered
>[  102.540806] rdac: device handler registered
>[  102.608792] null_blk: module loaded
>[  103.031132] SoftiWARP attached
>[  103.067829] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.073399] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.079038] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.093348] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.111956] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.130870] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.141017] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  103.146585] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  103.152374] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  103.166691] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.172623] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  103.191728] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  103.197641] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  103.380984] scsi_debug:sdebug_add_store: dif_storep 524288 bytes @
>0000000068763489
>[  103.389445] scsi_debug:sdebug_driver_probe: scsi_debug: trim
>poll_queues to 0. poll_q/nr_hw = (0/1)
>[  103.398577] scsi_debug:sdebug_driver_probe: host protection DIF3
>DIX3
>[  103.405018] scsi host4: scsi_debug: version 0190 [20200710]
>[  103.405018]   dev_size_mb=32, opts=0x0, submit_queues=1,
>statistics=0
>[  103.417664] scsi 4:0:0:0: Direct-Access     Linux    scsi_debug
>  0190 PQ: 0 ANSI: 7
>[  103.426302] sd 4:0:0:0: Power-on or device reset occurred
>[  103.426368] sd 4:0:0:0: Attached scsi generic sg1 type 0
>[  103.431800] sd 4:0:0:0: [sdb] Enabling DIF Type 3 protection
>[  103.442794] sd 4:0:0:0: [sdb] 65536 512-byte logical blocks: (33.6
>MB/32.0 MiB)
>[  103.450168] sd 4:0:0:0: [sdb] Write Protect is off
>[  103.454958] sd 4:0:0:0: [sdb] Mode Sense: 73 00 10 08
>[  103.455020] sd 4:0:0:0: [sdb] Write cache: enabled, read cache:
>enabled, supports DPO and FUA
>[  103.463665] sd 4:0:0:0: [sdb] Optimal transfer size 524288 bytes
>[  103.567989] sd 4:0:0:0: [sdb] Enabling DIX T10-DIF-TYPE3-CRC
>protection
>[  103.574602] sd 4:0:0:0: [sdb] DIF application tag size 6
>[  103.757781] sd 4:0:0:0: [sdb] Attached SCSI disk
>[  104.620435] enP2p1s0v1 speed is unknown, defaulting to 1000
>[  104.805722] enP2p1s0v4 speed is unknown, defaulting to 1000
>[  105.168234] Rounding down aligned max_sectors from 4294967295 to
>4294967288
>[  105.313416] ib_srpt:srpt_add_one: ib_srpt device =
>0000000043289393
>[  105.313438] ib_srpt:srpt_use_srq: ib_srpt
>srpt_use_srq(enP2p1s0v0_siw): use_srq = 0; ret
>--snip--
>[  172.857740] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-63
>[  172.857885] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-66
>[  172.858032] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-68
>[  172.858185] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-70
>[  172.858344] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-72
>[  172.858501] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-74
>[  172.858666] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-76
>[  172.858822] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-78
>[  172.858976] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-80
>[  172.859120] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-82
>[  172.859278] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-84
>[  172.859426] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-86
>[  172.859564] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-88
>[  172.859706] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-90
>[  172.859851] ib_srpt:srpt_release_channel_work: ib_srpt
>2620:0052:0000:13f0:1e1b:0dff:fe9d:b031-92
>[  173.439406] ib_srpt:srpt_disconnect_ch_sync: ib_srpt ch
>2620:0052:0000:13f0:a236:9fff:fe79:eb22-62 state 4
>[  178.456609] ib_srpt
>srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
>state 4): still waiting ...
>[  183.496506] ib_srpt
>srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
>state 4): still waiting ...
>[  188.536450] ib_srpt
>srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
>state 4): still waiting ...
>[  193.576351] ib_srpt
>srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
>state 4): still waiting ...
>[  198.616280] ib_srpt
>srpt_disconnect_ch_sync(2620:0052:0000:13f0:a236:9fff:fe79:eb22-62
>state 4): still waiting ...
>
>
>-- 
>Best Regards,
>  Yi Zhang
>
>


* Re: [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
  2021-12-01  8:55 [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..." Yi Zhang
  2021-12-02  9:08 ` Bernard Metzler
@ 2021-12-02 18:42 ` Bart Van Assche
  2021-12-05 12:53   ` Yi Zhang
  2021-12-02 19:00 ` Bernard Metzler
  2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2021-12-02 18:42 UTC (permalink / raw)
  To: Yi Zhang, RDMA mailing list

On 12/1/21 12:55 AM, Yi Zhang wrote:
> [root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011 -------------> hang

Hi Yi,

Does this only occur with the siw driver or also with the rdma_rxe driver?

If this hang occurs with both drivers, how about bisecting this issue? I
have not yet run into this issue with the rdma_rxe driver and Linus' master
branch.
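
A bisection along these lines could start roughly as follows. This is only a
sketch: the thread does not name a kernel on which srp/011 passes, so the
v5.15 tag is an assumed good version, and v5.16-rc3 stands in for the reported
5.16.0-rc3+ build. The helper name bisect_plan is hypothetical and only prints
the commands rather than running them.

```shell
#!/bin/sh
# Sketch of a bisection between a known-good and a known-bad kernel.
# GOOD is an assumption: the thread does not name a passing kernel.
# BAD approximates the reported 5.16.0-rc3+ build.
GOOD=${GOOD:-v5.15}
BAD=${BAD:-v5.16-rc3}

# Print the git commands that would start the bisection (nothing is executed).
bisect_plan() {
    printf 'git bisect start\n'
    printf 'git bisect bad %s\n' "$1"
    printf 'git bisect good %s\n' "$2"
}

bisect_plan "$BAD" "$GOOD"
# After building and booting each kernel that git checks out, run
#   use_siw=1 ./check srp/011
# and report the outcome with "git bisect good" or "git bisect bad".
```

Printing the plan first keeps the sketch harmless outside a kernel tree; in a
real checkout the printed commands would be run directly.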

Thanks,

Bart.


* Re: Re: [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
  2021-12-01  8:55 [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..." Yi Zhang
  2021-12-02  9:08 ` Bernard Metzler
  2021-12-02 18:42 ` Bart Van Assche
@ 2021-12-02 19:00 ` Bernard Metzler
  2021-12-03  5:44   ` Yi Zhang
  2 siblings, 1 reply; 6+ messages in thread
From: Bernard Metzler @ 2021-12-02 19:00 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Yi Zhang, RDMA mailing list

-----"Bart Van Assche" <bvanassche@acm.org> wrote: -----

>To: "Yi Zhang" <yi.zhang@redhat.com>, "RDMA mailing list"
><linux-rdma@vger.kernel.org>
>From: "Bart Van Assche" <bvanassche@acm.org>
>Date: 12/02/2021 07:43PM
>Subject: [EXTERNAL] Re: [bug report] blktests srp/011 hang at
>"ib_srpt srpt_disconnect_ch_sync:still waiting ..."
>
>On 12/1/21 12:55 AM, Yi Zhang wrote:
>> [root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011
>-------------> hang
>
>Hi Yi,
>
>Does this only occur with the siw driver or also with the rdma_rxe
>driver?
>
>If this hang occurs with both drivers, how about bisecting this
>issue? I
>have not yet run into this issue with the rdma_rxe driver and Linus'
>master
>branch.
>

I can't reproduce the hang with either siw or rxe. With rxe, though, I see
quite a few messages like:

'ib_srpt receiving failed for ioctx 00000000nnnnnnnn with status 5'

Yi, what is the architecture you are running on?
Maybe you can try switching on dynamic debugging for the siw module
and send me the dmesg trace for the hang? Of course it
will not hang with all the prints ;)
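
Switching on dynamic debugging for siw could look roughly like the sketch
below. It assumes a kernel built with CONFIG_DYNAMIC_DEBUG and debugfs mounted
at /sys/kernel/debug; the helper names dyndbg_query and dyndbg_apply and the
log file name are hypothetical.

```shell
#!/bin/sh
# Sketch: enable dynamic debug output for the siw module.
# Assumes CONFIG_DYNAMIC_DEBUG=y and debugfs mounted at /sys/kernel/debug.
CTL=${CTL:-/sys/kernel/debug/dynamic_debug/control}

# Build the query string understood by the dynamic debug control file.
dyndbg_query() {
    printf 'module %s %s\n' "$1" "$2"
}

# Write the query only when the control file is writable (needs root).
dyndbg_apply() {
    if [ -w "$CTL" ]; then
        dyndbg_query "$1" "$2" > "$CTL"
    else
        echo "cannot write $CTL (not root, or dynamic debug unavailable)" >&2
        return 1
    fi
}

dyndbg_apply siw +p || true
# Reproduce the hang, then capture the log, for example:
#   dmesg > siw-srp011.log
# Switch the messages off again with: dyndbg_apply siw -p
```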

Bernard.



* Re: Re: [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
  2021-12-02 19:00 ` Bernard Metzler
@ 2021-12-03  5:44   ` Yi Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Yi Zhang @ 2021-12-03  5:44 UTC (permalink / raw)
  To: Bernard Metzler; +Cc: Bart Van Assche, RDMA mailing list

On Fri, Dec 3, 2021 at 3:01 AM Bernard Metzler <BMT@zurich.ibm.com> wrote:
>
> -----"Bart Van Assche" <bvanassche@acm.org> wrote: -----
>
> >To: "Yi Zhang" <yi.zhang@redhat.com>, "RDMA mailing list"
> ><linux-rdma@vger.kernel.org>
> >From: "Bart Van Assche" <bvanassche@acm.org>
> >Date: 12/02/2021 07:43PM
> >Subject: [EXTERNAL] Re: [bug report] blktests srp/011 hang at
> >"ib_srpt srpt_disconnect_ch_sync:still waiting ..."
> >
> >On 12/1/21 12:55 AM, Yi Zhang wrote:
> >> [root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011
> >-------------> hang
> >
> >Hi Yi,
> >
> >Does this only occur with the siw driver or also with the rdma_rxe
> >driver?
> >
> >If this hang occurs with both drivers, how about bisecting this
> >issue? I
> >have not yet run into this issue with the rdma_rxe driver and Linus'
> >master
> >branch.
Hi Bart,
This reproduces only with siw.
It may be related to the number of network interfaces: I found that the
issue reproduces only when the system has three or more network interfaces
up, and does not reproduce with two.
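
One way to check the interface count on a test machine is sketched below. It
assumes "up" means an operstate of "up" in sysfs, which is an interpretation
and not something stated in this report; the function name count_up_ifaces is
hypothetical.

```shell
#!/bin/sh
# Sketch: count network interfaces whose operstate is "up".
# Assumption: "up" is taken to mean operstate == up. The loopback device
# usually reports "unknown" and is therefore not counted.
count_up_ifaces() {
    dir=${1:-/sys/class/net}
    n=0
    for f in "$dir"/*/operstate; do
        # Skip the unexpanded glob pattern and unreadable entries.
        [ -r "$f" ] || continue
        state=$(cat "$f" 2>/dev/null)
        if [ "$state" = "up" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

count_up_ifaces
```

The optional directory argument only exists so the function can be tried
against a scratch directory instead of the live sysfs tree.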
From ps, it seems to hang while removing the srpt target:
[root@gigabyte-r120-11 ~]# ps aux | grep rm
root         232  0.0  0.0      0     0 ?        I<   00:20   0:00
[acpi_thermal_pm]
root       10198  0.2  0.0   5240   776 pts/0    D+   00:37   0:00
rmdir /sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db02f00000000000000000000/0x1c1b0d9db02f00000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03000000000000000000000/0x1c1b0d9db03000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03100000000000000000000/0x1c1b0d9db03100000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03200000000000000000000/0x1c1b0d9db03200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0x1c1b0d9db03300000000000000000000/0x1c1b0d9db03300000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0xa0369f79eb2000000000000000000000/0xa0369f79eb2000000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0x1c1b0d9db02f00000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0x1c1b0d9db03000000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0x1c1b0d9db03100000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0x1c1b0d9db03200000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0x1c1b0d9db03300000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0xa0369f79eb2000000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/0xa0369f79eb2200000000000000000000
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/10.19.240.47
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/10.19.243.129
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/10.19.243.242
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b02f:fe80:1e1b:0dff:fe9d:b02f
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b031:fe80:1e1b:0dff:fe9d:b031
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032
/sys/kernel/config/target/srpt/0xa0369f79eb2200000000000000000000/0xa0369f79eb2200000000000000000000/acls/2620:0052:0000:13f0:1e1b:0dff:fe9d:b032:fe80:1e1b:0dff:fe9d:b032

# cat /proc/10198/stack
[<0>] __switch_to+0x150/0x1b0
[<0>] srpt_close_session+0x80/0xe4 [ib_srpt]
[<0>] target_shutdown_sessions+0x128/0x150 [target_core_mod]
[<0>] core_tpg_del_initiator_node_acl+0x7c/0x130 [target_core_mod]
[<0>] target_fabric_nacl_base_release+0x30/0x40 [target_core_mod]
[<0>] config_item_cleanup+0x60/0x160
[<0>] config_item_put+0x6c/0x120
[<0>] configfs_rmdir+0x24c/0x3f0
[<0>] vfs_rmdir+0x8c/0x210
[<0>] do_rmdir+0x174/0x1a0
[<0>] __arm64_sys_unlinkat+0x74/0x90
[<0>] invoke_syscall+0x50/0x120
[<0>] el0_svc_common.constprop.0+0x4c/0x100
[<0>] do_el0_svc+0x34/0xa0
[<0>] el0_svc+0x30/0xd0
[<0>] el0t_64_sync_handler+0xa4/0x130
[<0>] el0t_64_sync+0x1a4/0x1a8


> >
>
> I can't get it to break with either siw or rxe, though with rxe I see
> quite a few
>
> 'ib_srpt receiving failed for ioctx 00000000nnnnnnnn with status 5'
>
> Yi, what is the architecture you are running on?
> Maybe you can try switching on dynamic debugging for the siw module
> and send me the dmesg trace for the hang? Of course it
> will not hang with all the prints ;)

I reproduced it on an aarch64 server; here is the full log with siw
dynamic debug enabled. Can you try on a server that has 3 network
interfaces, as I mentioned above?
https://pastebin.com/8XMLdeT0
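
For reference, a minimal sketch of how siw dynamic debug output of this
kind can be switched on, assuming a kernel built with
CONFIG_DYNAMIC_DEBUG and debugfs mounted at /sys/kernel/debug. The
siw_debug helper and the CONTROL variable are hypothetical names used
only for illustration; they do not come from this thread.

```shell
#!/bin/sh
# Sketch only: toggle dynamic debug prints for the siw module.
# Assumption: debugfs is mounted at /sys/kernel/debug and the kernel
# was built with CONFIG_DYNAMIC_DEBUG=y. Must be run as root.
CONTROL="${CONTROL:-/sys/kernel/debug/dynamic_debug/control}"

# siw_debug is a hypothetical helper name.
# $1 is "+p" to enable the debug prints or "-p" to disable them.
siw_debug() {
    printf '%s\n' "module siw $1" > "$CONTROL"
}

# Typical use around a test run (left commented out on purpose):
#   siw_debug +p
#   use_siw=1 ./check srp/011
#   dmesg > siw-debug.log
#   siw_debug -p
```

As noted earlier in the thread, the extra prints change timing, so the
hang may be harder to hit with debugging enabled.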


>
> Bernard.
>


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..."
  2021-12-02 18:42 ` Bart Van Assche
@ 2021-12-05 12:53   ` Yi Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Yi Zhang @ 2021-12-05 12:53 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: RDMA mailing list

On Fri, Dec 3, 2021 at 2:42 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 12/1/21 12:55 AM, Yi Zhang wrote:
> > [root@gigabyte-r120-11 blktests]# use_siw=1 ./check srp/011 -------------> hang
>
> Hi Yi,
>
> Does this only occur with the siw driver or also with the rdma_rxe driver?
>
> If this hang occurs with both drivers, how about bisecting this issue? I

Hi Bart
Bisecting shows it was introduced by the commit below:
commit 5a836bf6b09f99ead1b69457ff39ab3011ece57b
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Fri Feb 26 17:11:55 2021 +0100

    mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context

    flush_all() flushes a specific SLAB cache on each CPU (where the cache
    is present). The deactivate_slab()/__free_slab() invocation happens
    within IPI handler and is problematic for PREEMPT_RT.

    The flush operation is not a frequent operation or a hot path. The
    per-CPU flush operation can be moved to within a workqueue.

    Because a workqueue handler, unlike IPI handler, does not disable irqs,
    flush_slab() now has to disable them for working with the kmem_cache_cpu
    fields. deactivate_slab() is safe to call with irqs enabled.

    [vbabka@suse.cz: adapt to new SLUB changes]
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

BTW, I just found that this issue cannot be reproduced if I set
ch_count to 10:
# cat /etc/modprobe.d/ib_srp.conf
options ib_srp ch_count=10


> have not yet run into this issue with the rdma_rxe driver and Linus' master
> branch.
>
> Thanks,
>
> Bart.
>


-- 
Best Regards,
  Yi Zhang



end of thread, other threads:[~2021-12-05 12:53 UTC | newest]

Thread overview: 6+ messages
2021-12-01  8:55 [bug report] blktests srp/011 hang at "ib_srpt srpt_disconnect_ch_sync:still waiting ..." Yi Zhang
2021-12-02  9:08 ` Bernard Metzler
2021-12-02 18:42 ` Bart Van Assche
2021-12-05 12:53   ` Yi Zhang
2021-12-02 19:00 ` Bernard Metzler
2021-12-03  5:44   ` Yi Zhang
