[SPDK] Re: nvmeof to localhost will hung forever

* [SPDK] Re: nvmeof to localhost will hung forever
@ 2019-12-07  0:28 Yang, Ziye
  0 siblings, 0 replies; 6+ messages in thread
From: Yang, Ziye @ 2019-12-07  0:28 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 6706 bytes --]

For the potential hang issue, we can add a timeout which is already discussed in our previous issue thread, it can be feasible.

For the async method, current I did not have a good method yet. Since the nvme tcp or RDMA initiator lib is designed for any application(it can be used without SPDK poller related framework). So it is not a very easy work to decouple the functions into two parts, i.e., send and async completion.

发自我的 iPad

> 在 2019年12月7日，上午1:19，peng yu <yupeng0921(a)gmail.com> 写道：
> 
> Hi Ziye
> 
> I don't explain my question clearly. I mean the
> bdev_nvme_attach_controller command might impact the IOs on other
> devices. Assume the spdk application has a native nvme device, exports
> it has a vhost, and it is used by a virtual machine. Now I try to call
> the bdev_nvme_attach_controller command to connect to a nvmeof target.
> The primary core will stuck until it receives the response from the
> nvemof target. During this time, the primary core can not handle any
> IO for the native nvme device. So I want to know whether we could
> avoid to let the primary core handle any IO, let the primary core
> handle rpc request only, then the IO won't be impacted. If I
> misunderstand anything or you have any other idea, please let me know.
> 
> And I wonder should we add a timeout in the nvme_tcp_qpair_icreq_send
> function (and the corresponding function for rdma)? Depend on my test,
> if the nvmeof target accepts the tcp connection but doesn't send any
> response, the spdk application will stuck forever, it won't reply to
> any rpc in the further. We shouldn't let the spdk application stuck
> due to the issue of a remove nvmeof target.
> 
> Best regards.
> 
>> On Fri, Dec 6, 2019 at 12:02 AM Yang, Ziye <ziye.yang(a)intel.com> wrote:
>> 
>> Hi Peng Yu,
>> 
>> It will not. After the connection is constructed, all the I/os will become async.
>> 
>> 
>> 
>> Best Regards
>> Ziye Yang
>> 
>> -----Original Message-----
>> From: peng yu <yupeng0921(a)gmail.com>
>> Sent: Friday, December 6, 2019 3:43 PM
>> To: Storage Performance Development Kit <spdk(a)lists.01.org>
>> Subject: [SPDK] Re: nvmeof to localhost will hung forever
>> 
>> Hi Ziye
>> 
>> Thanks for your explanation. I found a workaround:
>> I could run multiple spdk application on the same server, and specify different socket path for the two applications. Then use one as nvmeof target, another as nvmeof initiator. Depend on my simple test, it could work.
>> 
>> I have anyother concern: The synchronized operation would stuck the primary core. Is it possible to let all IO operations are handled by other cores. Assume I run the bdev_nvme_attach_controller, and the target has problem, it doesn't response or the response latency is pretty high, all IOs on this core would stuck togher with the bdev_nvme_attach_controller command. I hope such kind of issue won't impact the IO performance.
>> 
>>> On Thu, Dec 5, 2019 at 11:06 PM Yang, Ziye <ziye.yang(a)intel.com> wrote:
>>> 
>>> By the way, this issue affects not only TCP transport but also RDMA transport. And current conclusion in the previous issue, is that we conclude as "Don't recommend this as a use case".
>>> 
>>> 
>>> 
>>> 
>>> Best Regards
>>> Ziye Yang
>>> 
>>> -----Original Message-----
>>> From: Yang, Ziye
>>> Sent: Friday, December 6, 2019 2:58 PM
>>> To: Storage Performance Development Kit <spdk(a)lists.01.org>
>>> Subject: RE: [SPDK] nvmeof to localhost will hung forever
>>> 
>>> Hi Peng Yu,
>>> 
>>> We currently do not support to test the target and initiator in the same process instance.
>>> See this previous spdk reported issue.  Adding the async poller will not be OK in the low level nvme transport library. You can see the following reported spdk issue, it is same with yours.
>>> 
>>> https://github.com/spdk/spdk/issues/587
>>> 
>>> 
>>> 
>>> 
>>> Best Regards
>>> Ziye Yang
>>> 
>>> -----Original Message-----
>>> From: peng yu <yupeng0921(a)gmail.com>
>>> Sent: Friday, December 6, 2019 2:50 PM
>>> To: Storage Performance Development Kit <spdk(a)lists.01.org>
>>> Subject: [SPDK] nvmeof to localhost will hung forever
>>> 
>>> Below is the steps to reproduce the issue:
>>> 
>>> (1) run a spdk applicatoin, e.g.:
>>> sudo ./app/spdk_tgt/spdk_tgt
>>> 
>>> (2) run the nvmeof target part commands:
>>> sudo ./scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -p 8 -c
>>> 8192 sudo ./scripts/rpc.py bdev_malloc_create -b Malloc0 512 512 sudo
>>> ./scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a
>>> -s SPDK00000000000001 -d SPDK_Controller1 sudo ./scripts/rpc.py
>>> nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 Malloc0 sudo
>>> ./scripts/rpc.py nvmf_subsystem_add_listener
>>> nqn.2016-06.io.spdk:cnode1 -t tcp -a 127.0.0.1 -s 4420
>>> 
>>> (3) run the nvmeof initiator part command:
>>> sudo ./scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t tcp -a
>>> 127.0.0.1 -f IPv4 -s 4420 -n nqn.2016-06.io.spdk:cnode1
>>> 
>>> The bdev_nvme_attach_controller command will hung forever. I found the problem is in the nvme_tcp_qpair_icreq_send function, the spdk will stuck on below code:
>>> 
>>>    while (tqpair->state == NVME_TCP_QPAIR_STATE_INVALID) {
>>>        nvme_tcp_qpair_process_completions(&tqpair->qpair, 0);
>>>    }
>>> 
>>> The while loop won't finish. The nvme_tcp_qpair_process_completions
>>> function will try to receive response from the target. The target is the same spdk application, and as the application is spinning in the above while loop, the nvmeof target part code doesn't have a chance to send a response.
>>> 
>>> Is it possible to use a poller to replace the while loop? We could add a callback function, and let the poller call it when the tqpair->state is not NVME_TCP_QPAIR_STATE_INVALID. Does it make sense?
>>> _______________________________________________
>>> SPDK mailing list -- spdk(a)lists.01.org To unsubscribe send an email to
>>> spdk-leave(a)lists.01.org
>>> _______________________________________________
>>> SPDK mailing list -- spdk(a)lists.01.org To unsubscribe send an email to
>>> spdk-leave(a)lists.01.org
>> _______________________________________________
>> SPDK mailing list -- spdk(a)lists.01.org
>> To unsubscribe send an email to spdk-leave(a)lists.01.org
>> _______________________________________________
>> SPDK mailing list -- spdk(a)lists.01.org
>> To unsubscribe send an email to spdk-leave(a)lists.01.org
> _______________________________________________
> SPDK mailing list -- spdk(a)lists.01.org
> To unsubscribe send an email to spdk-leave(a)lists.01.org

^ permalink raw reply	[flat|nested] 6+ messages in thread