From: Valeriy Glushkov
Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
Date: Wed, 06 Feb 2019 18:15:03 +0000
In-Reply-To: EA913ED399BBA34AA4EAC2EDC24CDD009C25DFB8@FMSMSX105.amr.corp.intel.com
To: spdk@lists.01.org

Hi Seth,

Thank you for the great explanation of the changes you have made!

I added some debug prints to the target code as you requested. Here is the part of the target's debug log covering the last Admin queue requests:

=======
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 1
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 5
request.c: 121:nvmf_trace_command: *DEBUG*: Admin cmd: opc 0x0a fuse 0 cid 2815 nsid 0 cdw10 0x00000006
request.c: 136:nvmf_trace_command: *DEBUG*: SGL: Keyed: addr 0x0 key 0x0 len 0x0
request.c: 92:spdk_nvmf_request_complete: *DEBUG*: cpl: cid=2815 cdw0=0x00000001 rsvd1=0 status=0x0000
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 7
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 9
rdma.c: 978:request_transfer_out: *DEBUG*: RDMA RECV POSTED. Recv: 0xfdd2e8 Connection: 0xf04970
rdma.c:1001:request_transfer_out: *DEBUG*: RDMA SEND POSTED. Request: 0xff14e0 Connection: 0xf04970
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 11
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 11
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 12
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 0
...
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 1
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 2
rdma.c:1414:spdk_nvmf_rdma_request_parse_sgl: *DEBUG*: Request 0xff1a20 took 1 buffer/s from central pool
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 5
request.c: 121:nvmf_trace_command: *DEBUG*: Admin cmd: opc 0x02 fuse 0 cid 2847 nsid 4294967295 cdw10 0x001e0002
request.c: 136:nvmf_trace_command: *DEBUG*: SGL: Keyed (Inv): addr 0xffffe483a781fdd0 key 0x82929 len 0x200
ctrlr.c:1285:spdk_nvmf_ctrlr_get_log_page: *DEBUG*: Get log page: LID=0x02 offset=0x0 len=0x7c
request.c: 92:spdk_nvmf_request_complete: *DEBUG*: cpl: cid=2847 cdw0=0x00000000 rsvd1=0 status=0x0000
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 7
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 8
rdma.c:1643:spdk_nvmf_rdma_request_process: *DEBUG*: 62+1 WRs outstanding (max_send_depth 63). Waiting until some finish.
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 8
rdma.c:1643:spdk_nvmf_rdma_request_process: *DEBUG*: 62+1 WRs outstanding (max_send_depth 63). Waiting until some finish.
=======

The Admin queue depth is 32.

It seems that there is an issue with the rqpair->current_send_depth counter, as this request was the only one in the queue at the moment.
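To make the suspicion concrete, here is a tiny self-contained model of the send-depth accounting as I understand it. The field and function names are illustrative only, not the actual rdma.c code, except for the counters already mentioned above:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model of the qpair send-depth accounting (names are illustrative). */
struct toy_rqpair {
	uint32_t current_send_depth;  /* WRs currently posted to the send queue */
	uint32_t max_send_depth;      /* send queue size, 63 on my Admin qpair */
};

/* Gate that decides whether a request may post its data WRs plus the SEND. */
static bool
can_post(const struct toy_rqpair *q, uint32_t num_outstanding_data_wr)
{
	return q->current_send_depth + num_outstanding_data_wr + 1 <= q->max_send_depth;
}

int
main(void)
{
	struct toy_rqpair q = { .current_send_depth = 62, .max_send_depth = 63 };

	/* The situation from the log: 62+1 WRs outstanding, max_send_depth 63,
	 * so the Get Log Page request (1 data WR + 1 SEND) has to wait, even
	 * though the queue is otherwise idle. That is why I suspect the counter
	 * is not being decremented when the send completions come back. */
	printf("can post: %s\n", can_post(&q, 1) ? "yes" : "no");

	/* If completions had drained the counter back to 0,
	 * the same request would be posted immediately. */
	q.current_send_depth = 0;
	printf("can post after completions drain: %s\n", can_post(&q, 1) ? "yes" : "no");

	return 0;
}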
I could send you the complete debug log if you need it.

Thank you for your help!

--
Best regards,
Valeriy Glushkov
www.starwind.com
valeriy.glushkov(a)starwind.com

Howell, Seth wrote in his message of Wed, 06 Feb 2019 18:03:49 +0200:

> Hi Valeriy,
>
> I made the changes that you mentioned to the RDMA request state machine. Here is a little background on those changes.
> Those changes were part of an effort to better approach the limits of work requests that can be posted at once to the send queue. Previously we were limiting both RDMA reads and RDMA writes to the same limit, the one indicated by device_attr.qp_rd_atom_max. This value only controls RDMA READ operations; write operations should be governed by the size of the send queue. So we separated the state RDMA_REQUEST_STATE_DATA_TRANSFER_PENDING into two different states, one for controller-to-host transfers and the other for host-to-controller transfers.
> This was useful in two ways. Firstly, it allowed us to loosen restrictions on RDMA sends, and secondly it allowed us to make the states of the RDMA request a linear chain that can be followed through from start to finish.
> It looks like your I/O is getting stuck in state 8, or RDMA_REQUEST_STATE_DATA_TRANSFERRING_HOST_TO_CONTROLLER. I see from your trace that the I/O enters that state at least twice, which is normal. If there is more than one I/O queued in that state, we operate on them in a FIFO manner. Also, if there are currently too many outstanding send WRs, we wait until we receive some completions on the send queue to continue. Every time we get a completion, we poll through this list to try to submit more SENDs.
>
> I have a few follow-up questions about your configuration and information we will need to be able to help:
>
> 1. What is the queue depth of I/O you are submitting to the target?
> 2. Are you doing a 100% NVMe-oF read workload, or is it a mixed read/write workload?
> 3. Your trace shows the I/O entering state 8 twice; does this continue until you get a timeout on the I/O?
> 4. Can you add two extra debug prints to the target, one to each if statement inside of state RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING? I want to see why we are staying in that state. In those debug prints, it would be useful if you could include the following information: rqpair->current_send_depth, rdma_req->num_outstanding_data_wr, rqpair->max_send_depth.
>
> Some other information that may be useful: how many cores are you running the SPDK target on? How many devices and connections do you have when this hits?
>
> This information will help us understand better why your I/Os aren't making forward progress. I am really interested to see what is going on here since we don't have this issue on either the kernel or SPDK initiators, and the changes we made should have loosened up the requirements for sending I/O.
>
> Thanks,
>
> Seth Howell
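Regarding item 4: the prints I added in the RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING case are essentially of this form. This is paraphrased, not a verbatim diff, and I am assuming the SPDK_DEBUGLOG(SPDK_LOG_RDMA, ...) macro style used elsewhere in rdma.c:

/* Fragment only, inside spdk_nvmf_rdma_request_process(),
 * case RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING.
 * First branch: the request is waiting behind earlier requests (FIFO order). */
SPDK_DEBUGLOG(SPDK_LOG_RDMA,
	      "Request %p waiting behind other requests (send_depth %u, data_wr %u, max_send_depth %u)\n",
	      (void *)rdma_req, rqpair->current_send_depth,
	      rdma_req->num_outstanding_data_wr, rqpair->max_send_depth);

/* Second branch: not enough room left on the send queue. This produces the
 * "62+1 WRs outstanding (max_send_depth 63)" lines in the log above. */
SPDK_DEBUGLOG(SPDK_LOG_RDMA,
	      "%u+%u WRs outstanding (max_send_depth %u). Waiting until some finish.\n",
	      rqpair->current_send_depth, rdma_req->num_outstanding_data_wr,
	      rqpair->max_send_depth);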
>
> -----Original Message-----
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Harris, James R
> Sent: Wednesday, February 6, 2019 7:55 AM
> To: Storage Performance Development Kit
> Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
>
> On 2/6/19, 12:59 AM, "SPDK on behalf of Valeriy Glushkov" wrote:
>
>     Hi Jim,
>
>     Our module is an implementation of the NVMeoF/RDMA host.
>     It works well with SPDK 18.10.1, so the problem seems to be related to the SPDK 19.01 code.
>     I can see that the RDMA request state engine has been changed in the recent SPDK release.
>     So it would be great if the author of the modifications could take a look at the issue...
>     Thank you for your help!
>
> Hi Valeriy,
>
> Can you provide detailed information about what your host module is doing to induce this behavior? Our tests with the Linux kernel host driver and the SPDK host driver do not seem to be hitting this problem.
>
> Thanks,
>
> -Jim