Hi Sasha, Valeriy,

With the help of Valeriy's logs, I was able to get to the bottom of this. The root cause is that for NVMe-oF requests that don't transfer any data, such as keep_alive, we were not properly resetting the value of rdma_req->num_outstanding_data_wr between uses of that structure. All data-carrying operations properly reset this value in spdk_nvmf_rdma_req_parse_sgl.

My local repro steps look like this, for anyone interested:

1. Start the SPDK target.
2. Submit a full queue depth's worth of SMART log requests (sequentially is fine). A smaller number also works, but takes much longer.
3. Wait for a while (this assumes you have keep alive enabled). Keep alive requests will reuse the rdma_req objects, slowly incrementing the curr_send_depth on the admin qpair.
4. Eventually the admin qpair will be unable to submit I/O.

I was able to fix the issue locally with the following patch:
https://review.gerrithub.io/#/c/spdk/spdk/+/443811/

Valeriy, please let me know if applying this patch also fixes it for you (I am pretty sure that it will).

Thank you for the bug report and for all of your help,

Seth

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Sasha Kotchubievsky
Sent: Thursday, February 7, 2019 11:06 AM
To: spdk(a)lists.01.org
Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target

Hi,

The RNR value shouldn't affect NVMF. I just want to check whether NVMF pre-posts enough receive requests. 19.01 introduced a new flow-control mechanism that counts the number of send and receive work requests. Probably, NVMF doesn't pre-post enough requests.

Which network do you use: IB or RoCE? What is your HW and SW stack on the host and target sides (OS, OFED/MOFED version, NIC type)?

I'd suggest configuring NVMF with a big max queue depth and, in your test, actually using half of that value.

On 2/7/2019 5:37 PM, Valeriy Glushkov wrote:
> Hi Sasha,
>
> There is no IBV on the host side, it's Windows.
> So we have no control over the RNR field.
>
> From an RDMA session's dump I can see that the initiator sets
> infiniband.cm.req.rnrretrcount to 0x6.
>
> Could the RNR value be related to the problem we have with SPDK 19.01
> NVMeoF target?
>
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
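
The root cause described in Seth's reply (a stale rdma_req->num_outstanding_data_wr being counted against the admin qpair's send depth) can be illustrated with a small toy model in C. This is only a sketch under stated assumptions, not SPDK code: the structs, the model_* functions, the queue limit of 8, and the "one send WR plus the data WRs" accounting are all hypothetical simplifications. Only the field names num_outstanding_data_wr and curr_send_depth are taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real SPDK structures. */
struct model_qpair {
    uint32_t curr_send_depth; /* send WRs currently accounted as in flight */
    uint32_t max_send_depth;  /* assumed limit, chosen for illustration */
};

struct model_req {
    uint32_t num_outstanding_data_wr; /* reused across commands */
};

/* Submit a command. Data-carrying commands always set the counter (as SGL
 * parsing does in the description above). Commands without data only reset
 * it when reset_when_no_data is true, which models the fixed behavior. */
bool model_submit(struct model_qpair *qp, struct model_req *req,
                  uint32_t data_wrs, bool reset_when_no_data)
{
    if (data_wrs > 0) {
        req->num_outstanding_data_wr = data_wrs;
    } else if (reset_when_no_data) {
        req->num_outstanding_data_wr = 0;
    }

    /* Assumption: one send WR for the response plus the data WRs. */
    uint32_t needed = 1 + req->num_outstanding_data_wr;
    if (qp->curr_send_depth + needed > qp->max_send_depth) {
        return false; /* qpair can no longer submit */
    }
    qp->curr_send_depth += needed;
    return true;
}

/* Completion releases only the WRs that were actually posted. */
void model_complete(struct model_qpair *qp, uint32_t actual_data_wrs)
{
    assert(qp->curr_send_depth >= 1 + actual_data_wrs);
    qp->curr_send_depth -= 1 + actual_data_wrs;
}

/* Run one data-carrying command, then up to `rounds` keep-alives on the
 * same request object. Returns how many keep-alives were accepted. */
uint32_t model_run_keep_alives(bool reset_when_no_data, uint32_t rounds)
{
    struct model_qpair qp = { .curr_send_depth = 0, .max_send_depth = 8 };
    struct model_req req = { .num_outstanding_data_wr = 0 };
    uint32_t accepted = 0;

    /* A data-carrying admin command (for example a log page read). */
    if (!model_submit(&qp, &req, 1, reset_when_no_data)) {
        return 0;
    }
    model_complete(&qp, 1);

    for (uint32_t i = 0; i < rounds; i++) {
        if (!model_submit(&qp, &req, 0, reset_when_no_data)) {
            break;
        }
        model_complete(&qp, 0); /* keep-alive posted no data WRs */
        accepted++;
    }
    return accepted;
}
```

In this model, each keep-alive that reuses the request without a reset accounts for one more send WR than it later releases, so the depth creeps up until submission fails. With the reset in place the depth returns to zero after every command.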