From: Harris, James R <james.r.harris at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
Date: Wed, 06 Feb 2019 16:54:06 +0000	[thread overview]
Message-ID: <97830C48-9A30-467D-A82E-4A5E35057B6E@intel.com> (raw)
In-Reply-To: EA913ED399BBA34AA4EAC2EDC24CDD009C25DFB8@FMSMSX105.amr.corp.intel.com


Hey Seth,

Looking at Valeriy's original e-mail, this looks like a GET_LOG_PAGE.

request.c: 121:nvmf_trace_command: *DEBUG*: Admin cmd: opc 0x02 fuse 0 cid 2751 nsid 4294967295 cdw10 0x001e0002

Admin cmd opc 0x02 = GET LOG PAGE
cdw10 0x001e0002
Log Page Identifier (07:00) == 0x02 (SMART/Health)
Number of Dwords Lower (31:16) == 0x1e == 30 (but since this is 0's based, it actually means 31 dwords)
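
For reference, a minimal stand-alone decode of that CDW10 value (just an illustration, not SPDK code; the bit layout follows the NVMe Get Log Page definition of CDW10):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t cdw10 = 0x001e0002;              /* from the trace above */
        uint32_t lid   = cdw10 & 0xff;            /* Log Page Identifier, bits 07:00 */
        uint32_t numdl = (cdw10 >> 16) & 0xffff;  /* Number of Dwords Lower, bits 31:16 (0's based) */

        printf("LID   = 0x%02x (0x02 = SMART / Health Information)\n", lid);
        printf("NUMDL = %u -> %u dwords = %u bytes\n", numdl, numdl + 1, (numdl + 1) * 4);
        return 0;
    }

which works out to LID 0x02 and 31 dwords (124 bytes) of the SMART/Health log.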

-Jim




On 2/6/19, 9:29 AM, "SPDK on behalf of Howell, Seth" <spdk-bounces(a)lists.01.org on behalf of seth.howell(a)intel.com> wrote:

    Hi Valeriy,
    
    I made the changes that you mentioned to the RDMA request state machine. Here is a little background on those changes.
    Those changes were part of an effort to better approach the limits on how many work requests can be posted to the send queue at once. Previously we were limiting both RDMA reads and RDMA writes to the same limit, the one indicated by device_attr.max_qp_rd_atom. That value only governs RDMA READ operations; WRITE operations should instead be governed by the size of the send queue. So we split the state RDMA_REQUEST_STATE_DATA_TRANSFER_PENDING into two states, one for controller-to-host transfers and the other for host-to-controller transfers.
    This was useful in two ways: it let us loosen the restrictions on RDMA sends, and it made the RDMA request states a linear chain that can be followed from start to finish.
    It looks like your I/O is getting stuck in state 8, RDMA_REQUEST_STATE_DATA_TRANSFERRING_HOST_TO_CONTROLLER. I see from your stacktrace that the I/O enters that state at least twice, which is normal. If more than one I/O is queued in that state, we operate on them in FIFO order. Also, if there are currently too many outstanding send WRs, we wait until we receive some completions on the send queue before continuing. Every time we get a completion, we poll through this list and try to submit more SENDs.
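    To make the read/write distinction concrete, here is a rough, self-contained sketch of the gating idea (a simplification, not the actual logic in lib/nvmf/rdma.c; the *_send_depth names are the ones used in this thread, while the *_read_depth names and the helper itself are hypothetical):
    
        #include <stdbool.h>
        #include <stdint.h>
    
        struct sketch_rqpair {
                uint32_t current_send_depth;  /* WRITE/SEND WRs currently posted to the SQ */
                uint32_t max_send_depth;      /* bounded by the send queue size */
                uint32_t current_read_depth;  /* RDMA READ WRs currently outstanding */
                uint32_t max_read_depth;      /* bounded by device_attr.max_qp_rd_atom */
        };
    
        /* May this request's data-transfer WRs be posted right now? */
        static bool
        sketch_can_post(const struct sketch_rqpair *qp, uint32_t num_wrs,
                        bool host_to_controller)
        {
                if (host_to_controller) {
                        /* Data is fetched with RDMA READ, which is limited by the
                         * device's read/atomic depth, not by the SQ size. */
                        return qp->current_read_depth + num_wrs <= qp->max_read_depth;
                }
                /* Controller-to-host data goes out as RDMA WRITE plus the final
                 * SEND, so only the send queue depth matters here. */
                return qp->current_send_depth + num_wrs <= qp->max_send_depth;
        }
    
    A request that fails this kind of check is what ends up parked in one of the two pending states until completions free up capacity.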
    
    I have a few follow-up questions about your configuration, and some information we will need in order to help:
    
    1. What is the queue depth of I/O you are submitting to the target?
    2. Are you doing a 100% NVMe-oF read workload, or is it a mixed read/write workload?
    3. Your stacktrace shows the I/O entering state 8 twice; does this continue until the I/O times out?
    4. Can you add two extra debug prints to the target, one in each if statement inside the RDMA_REQUEST_STATE_DATA_TRANSFER_TO_HOST_PENDING handling? I want to see why we are staying in that state. It would be useful if those prints included rqpair->current_send_depth, rdma_req->num_outstanding_data_wr, and rqpair->max_send_depth; a rough sketch of what they could look like follows below.
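    
    Something along these lines in each branch would do (a sketch only, assuming the SPDK_DEBUGLOG()/SPDK_LOG_RDMA logging already used elsewhere in lib/nvmf/rdma.c; adjust the format specifiers to the actual field types):
    
        SPDK_DEBUGLOG(SPDK_LOG_RDMA,
                      "Request %p blocked in DATA_TRANSFER_TO_HOST_PENDING: "
                      "current_send_depth=%u num_outstanding_data_wr=%u max_send_depth=%u\n",
                      rdma_req, rqpair->current_send_depth,
                      rdma_req->num_outstanding_data_wr, rqpair->max_send_depth);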
    
    Some other information that may be useful:
    How many cores are you running the SPDK target on? How many devices and connections do you have when this hits? 
    
    This information will help us better understand why your I/Os aren't making forward progress. I am really interested to see what is going on here, since we don't hit this issue with either the kernel or the SPDK initiator, and the changes we made should have loosened the requirements for sending I/O.
    
    Thanks,
    
    Seth Howell
    
    
    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Harris, James R
    Sent: Wednesday, February 6, 2019 7:55 AM
    To: Storage Performance Development Kit <spdk(a)lists.01.org>
    Subject: Re: [SPDK] A problem with SPDK 19.01 NVMeoF/RDMA target
    
    
    
    On 2/6/19, 12:59 AM, "SPDK on behalf of Valeriy Glushkov" <spdk-bounces(a)lists.01.org on behalf of valeriy.glushkov(a)starwind.com> wrote:
    
        Hi Jim,
        
        Our module is an implementation of the NVMeoF/RDMA host.
        
        It works well with SPDK 18.10.1, so the problem seems to be related to the
        SPDK 19.01 code.
        
        I can see that the RDMA request state machine has been changed in the
        recent SPDK release.
        So it would be great if the author of the modifications could take a look
        at the issue...
        
        Thank you for your help!
    
    
    Hi Valeriy,
    
    Can you provide detailed information about what your host module is doing to induce this behavior?  Our tests with the Linux kernel host driver and the SPDK host driver do not seem to be hitting this problem.
    
    Thanks,
    
    -Jim
    
    
    _______________________________________________
    SPDK mailing list
    SPDK(a)lists.01.org
    https://lists.01.org/mailman/listinfo/spdk
    

