From: Valeriy Glushkov
Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
Date: Wed, 06 Feb 2019 18:15:03 +0000
In-Reply-To: EA913ED399BBA34AA4EAC2EDC24CDD009C25DFB8@FMSMSX105.amr.corp.intel.com
To: spdk@lists.01.org

Hi Seth,

Thank you for the great explanation of the changes you have made!

I added some debug prints to the target code as you requested. Here is the part of the target's debug log covering the last Admin queue requests:

=======
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 1
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 5
request.c: 121:nvmf_trace_command: *DEBUG*: Admin cmd: opc 0x0a fuse 0 cid 2815 nsid 0 cdw10 0x00000006
request.c: 136:nvmf_trace_command: *DEBUG*: SGL: Keyed: addr 0x0 key 0x0 len 0x0
request.c: 92:spdk_nvmf_request_complete: *DEBUG*: cpl: cid=2815 cdw0=0x00000001 rsvd1=0 status=0x0000
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 7
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 9
rdma.c: 978:request_transfer_out: *DEBUG*: RDMA RECV POSTED. Recv: 0xfdd2e8 Connection: 0xf04970
rdma.c:1001:request_transfer_out: *DEBUG*: RDMA SEND POSTED. Request: 0xff14e0 Connection: 0xf04970
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 11
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 11
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 12
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff14e0 entering state 0
...
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 1
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 2
rdma.c:1414:spdk_nvmf_rdma_request_parse_sgl: *DEBUG*: Request 0xff1a20 took 1 buffer/s from central pool
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 5
request.c: 121:nvmf_trace_command: *DEBUG*: Admin cmd: opc 0x02 fuse 0 cid 2847 nsid 4294967295 cdw10 0x001e0002
request.c: 136:nvmf_trace_command: *DEBUG*: SGL: Keyed (Inv): addr 0xffffe483a781fdd0 key 0x82929 len 0x200
ctrlr.c:1285:spdk_nvmf_ctrlr_get_log_page: *DEBUG*: Get log page: LID=0x02 offset=0x0 len=0x7c
request.c: 92:spdk_nvmf_request_complete: *DEBUG*: cpl: cid=2847 cdw0=0x00000000 rsvd1=0 status=0x0000
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 7
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 8
rdma.c:1643:spdk_nvmf_rdma_request_process: *DEBUG*: 62+1 WRs outstanding (max_send_depth 63). Waiting until some finish.
rdma.c:1508:spdk_nvmf_rdma_request_process: *DEBUG*: Request 0xff1a20 entering state 8
rdma.c:1643:spdk_nvmf_rdma_request_process: *DEBUG*: 62+1 WRs outstanding (max_send_depth 63). Waiting until some finish.
=======

The Admin queue depth is 32.

It seems that there is an issue with the rqpair->current_send_depth counter, as this request was the only one in the queue at the moment.
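To make the suspicion concrete, here is a tiny self-contained model of the send-depth accounting as I understand it. The field and function names are illustrative only, not the actual rdma.c code, except for the counters already mentioned above:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model of the qpair send-depth accounting (names are illustrative). */
struct toy_rqpair {
	uint32_t current_send_depth;  /* WRs currently posted to the send queue */
	uint32_t max_send_depth;      /* send queue size, 63 on my Admin qpair */
};

/* Gate that decides whether a request may post its data WRs plus the SEND. */
static bool
can_post(const struct toy_rqpair *q, uint32_t num_outstanding_data_wr)
{
	return q->current_send_depth + num_outstanding_data_wr + 1 <= q->max_send_depth;
}

int
main(void)
{
	struct toy_rqpair q = { .current_send_depth = 62, .max_send_depth = 63 };

	/* The situation from the log: 62+1 WRs outstanding, max_send_depth 63,
	 * so the Get Log Page request (1 data WR + 1 SEND) has to wait, even
	 * though the queue is otherwise idle. That is why I suspect the counter
	 * is not being decremented when the send completions come back. */
	printf("can post: %s\n", can_post(&q, 1) ? "yes" : "no");

	/* If completions had drained the counter back to 0,
	 * the same request would be posted immediately. */
	q.current_send_depth = 0;
	printf("can post after completions drain: %s\n", can_post(&q, 1) ? "yes" : "no");

	return 0;
}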
I could send you the complete debug log if you need it.

Thank you for your help!

--
Best regards,
Valeriy Glushkov
www.starwind.com
valeriy.glushkov(a)starwind.com

Howell, Seth wrote in his message of Wed, 06 Feb 2019 18:03:49 +0200:

> Hi Valeriy,
>
> I made the changes that you mentioned to the RDMA request state machine. Here is a little background on those changes.
> Those changes were part of an effort to better approach the limits of work requests that can be posted at once to the send queue. Previously we were limiting both RDMA reads and RDMA writes to the same limit, the one indicated by device_attr.qp_rd_atom_max. This value only controls RDMA READ operations; write operations should be governed by the size of the send queue. So we separated the state RDMA_REQUEST_STATE_DATA_TRANSFER_PENDING into two different states, one for controller-to-host transfers and the other for host-to-controller transfers.
> This was useful in two ways. Firstly, it allowed us to loosen restrictions on RDMA sends, and secondly it allowed us to make the states of the RDMA request a linear chain that can be followed through from start to finish.
> It looks like your I/O is getting stuck in state 8, or RDMA_REQUEST_STATE_DATA_TRANSFERRING_HOST_TO_CONTROLLER. I see from your trace that the I/O enters that state at least twice, which is normal. If there is more than one I/O queued in that state, we operate on them in a FIFO manner. Also, if there are currently too many outstanding send WRs, we wait until we receive some completions on the send queue to continue. Every time we get a completion, we poll through this list to try to submit more SENDs.
>
> I have a few follow-up questions about your configuration and information we will need to be able to help:
>
> 1. What is the queue depth of I/O you are submitting to the target?
> 2. Are you doing a 100% NVMe-oF read workload, or is it a mixed read/write workload?
> 3. Your trace shows the I/O entering state 8 twice; does this continue until you get a timeout on the I/O?
> 4. Can you add two extra debug prints to the target, one to each if statement inside of state RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING? I want to see why we are staying in that state. In those debug prints, it would be useful if you could include the following information: rqpair->current_send_depth, rdma_req->num_outstanding_data_wr, rqpair->max_send_depth.
>
> Some other information that may be useful: how many cores are you running the SPDK target on? How many devices and connections do you have when this hits?
>
> This information will help us understand better why your I/Os aren't making forward progress. I am really interested to see what is going on here since we don't have this issue on either the kernel or SPDK initiators, and the changes we made should have loosened up the requirements for sending I/O.
>
> Thanks,
>
> Seth Howell
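Regarding item 4: the prints I added in the RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING case are essentially of this form. This is paraphrased, not a verbatim diff, and I am assuming the SPDK_DEBUGLOG(SPDK_LOG_RDMA, ...) macro style used elsewhere in rdma.c:

/* Fragment only, inside spdk_nvmf_rdma_request_process(),
 * case RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING.
 * First branch: the request is waiting behind earlier requests (FIFO order). */
SPDK_DEBUGLOG(SPDK_LOG_RDMA,
	      "Request %p waiting behind other requests (send_depth %u, data_wr %u, max_send_depth %u)\n",
	      (void *)rdma_req, rqpair->current_send_depth,
	      rdma_req->num_outstanding_data_wr, rqpair->max_send_depth);

/* Second branch: not enough room left on the send queue. This produces the
 * "62+1 WRs outstanding (max_send_depth 63)" lines in the log above. */
SPDK_DEBUGLOG(SPDK_LOG_RDMA,
	      "%u+%u WRs outstanding (max_send_depth %u). Waiting until some finish.\n",
	      rqpair->current_send_depth, rdma_req->num_outstanding_data_wr,
	      rqpair->max_send_depth);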
>
> -----Original Message-----
> From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Harris, James R
> Sent: Wednesday, February 6, 2019 7:55 AM
> To: Storage Performance Development Kit
> Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
>
> On 2/6/19, 12:59 AM, "SPDK on behalf of Valeriy Glushkov" wrote:
>
>     Hi Jim,
>
>     Our module is an implementation of the NVMeoF/RDMA host.
>     It works well with SPDK 18.10.1, so the problem seems to be related to the SPDK 19.01 code.
>     I can see that the RDMA request state engine has been changed in the recent SPDK release.
>     So it would be great if the author of the modifications could take a look at the issue...
>     Thank you for your help!
>
> Hi Valeriy,
>
> Can you provide detailed information about what your host module is doing to induce this behavior? Our tests with the Linux kernel host driver and the SPDK host driver do not seem to be hitting this problem.
>
> Thanks,
>
> -Jim