Linux-NVME Archive on lore.kernel.org
* Bug report: NVMeoF RDMA bnxt_roce: I/O 1 QID 0 timeout
       [not found] <1269174553.11144711.1573527781970.JavaMail.zimbra@redhat.com>
@ 2019-11-12  3:18 ` Yi Zhang
  2019-11-12  4:51   ` Selvin Xavier
  0 siblings, 1 reply; 2+ messages in thread
From: Yi Zhang @ 2019-11-12  3:18 UTC (permalink / raw)
  To: linux-nvme; +Cc: selvin.xavier, sagi

Hello

I would like to report an NVMeoF RDMA I/O timeout issue. The reproducer and kernel logs are below; let me know if you need more info.

Reproducer:
1. Set up an NVMeoF RDMA RoCE environment
   target: qedr roce
   client: bnxt roce

2. client: connect to the target
3. client: run fio stress
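
A minimal sketch of steps 2 and 3 on the client, for anyone trying to reproduce this. The address, port, NQN and namespace name are taken from the kernel log below; the fio job parameters and the DRY_RUN wrapper are assumptions, since the report does not give the exact commands. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical reproducer sketch, not the exact commands used in the report.
TARGET_ADDR=172.31.45.186   # from the client kernel log
TARGET_PORT=4420            # from the client kernel log
SUBSYS_NQN=testnqn          # from the client kernel log
DEVICE=/dev/nvme2n1         # namespace name from the client kernel log
DRY_RUN=${DRY_RUN:-1}       # assumption: default to printing, not running

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: connect the client to the target over RDMA.
connect_target() {
    run nvme connect -t rdma -a "$TARGET_ADDR" -s "$TARGET_PORT" -n "$SUBSYS_NQN"
}

# Step 3: fio stress. All job parameters here are assumed values.
# Note: randrw against the raw device overwrites its contents.
stress() {
    run fio --name=stress --filename="$DEVICE" --ioengine=libaio --direct=1 \
        --rw=randrw --bs=4k --iodepth=64 --numjobs=1 --runtime=60 --time_based
}

connect_target
stress
```

Run with DRY_RUN=0 to actually execute the commands on the client.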

HW info:
target:
# lspci | grep -i ql
19:00.0 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.1 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.2 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.3 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
client:
# lspci  | grep -i bro
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)


kernel log:
client:
[   65.974085] nvme nvme2: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 172.31.45.186:4420
[   65.974592] nvme nvme2: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[   66.027510] nvme nvme2: creating 1 I/O queues.
[   66.053178] nvme nvme2: mapped 1/0/0 default/read/poll queues.
[   66.053802] nvme nvme2: new ctrl: NQN "testnqn", addr 172.31.45.186:4420
[   66.054465] nvme2n1: detected capacity change from 0 to 1600321314816
[   81.926738] nvme nvme2: I/O 1 QID 0 timeout
[   81.930943] nvme nvme2: starting error recovery
[   88.070471] nvme nvme2: I/O 1 QID 0 timeout
[   93.702284] nvme nvme2: I/O 1 QID 0 timeout
[   98.310013] nvme nvme2: I/O 49 QID 1 timeout
[   98.314297] nvme nvme2: I/O 50 QID 1 timeout
[   98.318575] nvme nvme2: I/O 51 QID 1 timeout
[   98.322855] nvme nvme2: I/O 52 QID 1 timeout
[   98.327135] nvme nvme2: I/O 54 QID 1 timeout
[   98.331419] nvme nvme2: I/O 55 QID 1 timeout
[   98.335699] nvme nvme2: I/O 56 QID 1 timeout
[   98.339984] nvme nvme2: I/O 57 QID 1 timeout
[   98.344259] nvme nvme2: I/O 65 QID 1 timeout
[   98.348531] nvme nvme2: I/O 66 QID 1 timeout
[   98.352805] nvme nvme2: I/O 67 QID 1 timeout
[   98.357086] nvme nvme2: I/O 68 QID 1 timeout
[   98.361369] nvme nvme2: I/O 69 QID 1 timeout
[   98.365648] nvme nvme2: I/O 70 QID 1 timeout
[   98.369928] nvme nvme2: I/O 71 QID 1 timeout
[   98.374203] nvme nvme2: I/O 72 QID 1 timeout
[   98.378484] nvme nvme2: I/O 73 QID 1 timeout
[   98.382764] nvme nvme2: I/O 74 QID 1 timeout
[   98.387035] nvme nvme2: I/O 75 QID 1 timeout
[   98.391310] nvme nvme2: I/O 76 QID 1 timeout
[   98.395590] nvme nvme2: I/O 77 QID 1 timeout
[   98.399864] nvme nvme2: I/O 78 QID 1 timeout
[   98.404141] nvme nvme2: I/O 79 QID 1 timeout
[   98.408415] nvme nvme2: I/O 80 QID 1 timeout
[   98.412686] nvme nvme2: I/O 81 QID 1 timeout
[   98.416961] nvme nvme2: I/O 82 QID 1 timeout
[   98.421232] nvme nvme2: I/O 86 QID 1 timeout
[   98.425503] nvme nvme2: I/O 87 QID 1 timeout
[   98.429776] nvme nvme2: I/O 94 QID 1 timeout
[   98.434050] nvme nvme2: I/O 95 QID 1 timeout
[   98.438323] nvme nvme2: I/O 96 QID 1 timeout
[   98.442596] nvme nvme2: I/O 97 QID 1 timeout
[   98.446867] nvme nvme2: I/O 98 QID 1 timeout
[   98.451141] nvme nvme2: I/O 99 QID 1 timeout
[   98.455412] nvme nvme2: I/O 100 QID 1 timeout
[   98.459772] nvme nvme2: I/O 101 QID 1 timeout
[   98.464131] nvme nvme2: I/O 102 QID 1 timeout
[   98.468490] nvme nvme2: I/O 103 QID 1 timeout
[   98.472849] nvme nvme2: I/O 104 QID 1 timeout
[   98.477208] nvme nvme2: I/O 105 QID 1 timeout
[   98.481568] nvme nvme2: I/O 106 QID 1 timeout
[   98.485926] nvme nvme2: I/O 109 QID 1 timeout
[   98.490286] nvme nvme2: I/O 111 QID 1 timeout
[   98.494644] nvme nvme2: I/O 112 QID 1 timeout
[   98.499004] nvme nvme2: I/O 113 QID 1 timeout
[   98.503363] nvme nvme2: I/O 115 QID 1 timeout
[   98.507721] nvme nvme2: I/O 116 QID 1 timeout
[   98.512080] nvme nvme2: I/O 117 QID 1 timeout
[   98.516441] nvme nvme2: I/O 118 QID 1 timeout
[   98.520807] nvme nvme2: I/O 119 QID 1 timeout
[   98.525167] nvme nvme2: I/O 120 QID 1 timeout
[   98.529526] nvme nvme2: I/O 121 QID 1 timeout
[   98.533885] nvme nvme2: I/O 122 QID 1 timeout
[   98.538245] nvme nvme2: I/O 123 QID 1 timeout
[   98.542603] nvme nvme2: I/O 124 QID 1 timeout
[   98.546963] nvme nvme2: I/O 126 QID 1 timeout
[   99.846073] nvme nvme2: I/O 1 QID 0 timeout
[  102.405967] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x41d]=0x3 timedout (20000)msec
[  102.413114] infiniband bnxt_re0: Failed to modify HW QP
[  102.418367] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x0]=0x3 send failed
[  102.424551] infiniband bnxt_re0: Failed to modify HW QP
[  102.429779] ------------[ cut here ]------------
[  102.434399] failed to drain send queue: -110
[  102.438730] WARNING: CPU: 43 PID: 780 at drivers/infiniband/core/verbs.c:2656 __ib_drain_sq+0x143/0x190 [ib_core]
[  102.448985] Modules linked in: nvme_rdma nvme_fabrics nvmet_rdma nvmet 8021q garp mrp stp llc ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp vfat fat mlx5_ib opa_vnic ib_umad ib_ipoib rpcrdma intel_rapl_msr sunrpc intel_rapl_common rdma_ucm ib_iser libiscsi isst_if_common scsi_transport_iscsi iTCO_wdt iTCO_vendor_support skx_edac dcdbas nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif iw_cxgb4 kvm irqbypass rdma_cm iw_cm ib_cm libcxgb crct10dif_pclmul crc32_pclmul hfi1 bnxt_re rdmavt ghash_clmulni_intel intel_cstate ib_uverbs ib_core intel_uncore dell_smbios intel_rapl_perf wmi_bmof dell_wmi_descriptor sg pcspkr mei_me ipmi_si i2c_i801 lpc_ich mei ipmi_devintf ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit drm_vram_helper ttm mlx5_core drm ahci csiostor cxgb4 libahci crc32c_intel nvme mlxfw bnxt_en nvme_core libata megaraid_sas
[  102.449018]  scsi_transport_fc tg3 pci_hyperv_intf wmi dm_mirror dm_region_hash dm_log dm_mod
[  102.543833] CPU: 43 PID: 780 Comm: kworker/u98:3 Kdump: loaded Not tainted 5.4.0-0.rc6.2.elrdy.x86_64 #1
[  102.553303] Hardware name: Dell Inc. PowerEdge R740/00WGD1, BIOS 2.2.11 06/13/2019
[  102.560872] Workqueue: nvme-wq nvme_rdma_error_recovery_work [nvme_rdma]
[  102.567572] RIP: 0010:__ib_drain_sq+0x143/0x190 [ib_core]
[  102.572968] Code: 00 00 00 48 89 df e8 8c 28 7f f4 48 85 c0 74 e1 e9 69 ff ff ff 89 c6 48 c7 c7 48 ce 51 c1 c6 05 cd 0c 04 00 01 e8 56 2f fe f3 <0f> 0b e9 4d ff ff ff 80 3d b9 0c 04 00 00 0f 85 40 ff ff ff 89 c6
[  102.591711] RSP: 0018:ffffbb79c7883cf8 EFLAGS: 00010282
[  102.596940] RAX: 0000000000000000 RBX: ffff9ed9bdac3818 RCX: 0000000000000000
[  102.604072] RDX: ffff9ee60fb66900 RSI: ffff9ee60fb56b88 RDI: ffff9ee60fb56b88
[  102.611204] RBP: ffff9ee60b1d8400 R08: 000000000000079a R09: 000000000000002b
[  102.618334] R10: 0000000000000000 R11: ffffbb79c7883ba0 R12: ffffbb79c7883d28
[  102.625458] R13: ffffbb79c7883d00 R14: ffff9eda44b0fbc0 R15: ffff9ee5ff4000b0
[  102.632584] FS:  0000000000000000(0000) GS:ffff9ee60fb40000(0000) knlGS:0000000000000000
[  102.640669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  102.646415] CR2: 00007faba43bcec8 CR3: 0000000f4fa0a004 CR4: 00000000007606e0
[  102.653547] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  102.660680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  102.667812] PKRU: 55555554
[  102.670525] Call Trace:
[  102.672987]  ? wb_over_bg_thresh+0x20c/0x220
[  102.677265]  ib_drain_qp+0xe/0x20 [ib_core]
[  102.681448]  nvme_rdma_teardown_io_queues.part.35+0x4e/0xa0 [nvme_rdma]
[  102.688057]  nvme_rdma_error_recovery_work+0x35/0x90 [nvme_rdma]
[  102.694064]  process_one_work+0x1a1/0x360
[  102.698074]  worker_thread+0x30/0x380
[  102.701742]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  102.706622]  kthread+0x112/0x130
[  102.709851]  ? __kthread_parkme+0x70/0x70
[  102.713867]  ret_from_fork+0x35/0x40
[  102.717444] ---[ end trace b9af855765aeac25 ]---
[  102.722074] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x0]=0x3 send failed
[  102.728271] infiniband bnxt_re0: Failed to modify HW QP

target:
[   44.362119] nvmet: adding nsid 1 to subsystem testnqn
[   44.362662] nvmet_rdma: enabling port 2 (172.31.45.186:4420)
[   61.722470] SubnSet(OPA_PortInfo) smlid 0x4
[   62.127400] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:07212bc2-57e0-4729-9377-4608892fed05.
[   62.180752] nvmet: creating controller 1 for subsystem testnqn for NQN nqn.2014-08.org.nvmexpress:uuid:07212bc2-57e0-4729-9377-4608892fed05.
[   80.914135] [qedr_poll_cq_req:3816(qedr0)]Error: POLL CQ with ROCE_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR. CQ icid=0x1, QP icid=0x2
[   92.839414] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[   92.845445] nvmet: ctrl 1 fatal error occurred!
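
To make the timeout pattern in the client log easier to read, the per-queue counts can be summarized with a small filter like the one below. This is only a sketch: count_timeouts is a hypothetical helper name, and it assumes the "I/O <tag> QID <n> timeout" message format shown above.

```shell
# Read dmesg-style text on stdin and print the number of
# "I/O <tag> QID <n> timeout" messages per queue ID, sorted by QID.
count_timeouts() {
    awk '/I\/O [0-9]+ QID [0-9]+ timeout/ {
             # the queue ID is the field right after the literal "QID"
             for (i = 1; i <= NF; i++) if ($i == "QID") n[$(i + 1)]++
         }
         END { for (q in n) print "QID " q ": " n[q] }' | sort -k2,2n
}

# Usage (on the client):
#   dmesg | count_timeouts
```

QID 0 is the admin queue, so timeouts there mean admin commands (such as keep-alive) are also stalling, not only the fio I/O on QID 1.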

Best Regards,
  Yi Zhang



_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* Re: Bug report: NVMeoF RDMA bnxt_roce: I/O 1 QID 0 timeout
  2019-11-12  3:18 ` Bug report: NVMeoF RDMA bnxt_roce: I/O 1 QID 0 timeout Yi Zhang
@ 2019-11-12  4:51   ` Selvin Xavier
  0 siblings, 0 replies; 2+ messages in thread
From: Selvin Xavier @ 2019-11-12  4:51 UTC (permalink / raw)
  To: Yi Zhang; +Cc: sagi, linux-nvme

Hi Yi Zhang,
Thanks for reporting the issue. It seems the FW on the bnxt_re side
stops responding once the issue happens, and all commands fail after
that.
I will get in touch with you to retrieve the FW information and
investigate further. We might need a FW upgrade too.

Thanks,
Selvin



