Thanks, everyone, for all your help! I now understand the SPDK internals better
and plan to add status-checking code before submitting further I/O requests.

I misunderstood the claim on http://www.spdk.io/doc/blob.html, which
says "The blobstore is ... typically in lieu of a traditional
filesystem". When it comes to writing code, using the blobstore API means
taking on far more responsibility than we would with a traditional filesystem.
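
In case it helps anyone who finds this thread later, here is roughly the
pattern I have in mind. It is only a toy model in Python, not SPDK code:
ToyChannel, device_tick and submit_all are made-up names, and the "device" is
simulated. The only behavior taken from the discussion below is that submit
returns -ENOMEM when no request slot is free, and that callbacks run from the
poll call rather than from submit.

```python
import errno
from collections import deque


class ToyChannel:
    """Toy stand-in for an I/O channel with a fixed pool of request slots."""

    def __init__(self, queue_depth):
        self.queue_depth = queue_depth
        self.inflight = deque()  # submitted, not yet finished by the "device"
        self.done = deque()      # finished by the "device", not yet reaped

    def submit(self, io_id, callback):
        """Return 0 on success, or -ENOMEM when no request slot is free."""
        if len(self.inflight) + len(self.done) >= self.queue_depth:
            return -errno.ENOMEM
        self.inflight.append((io_id, callback))
        return 0

    def device_tick(self, n=1):
        """Pretend the device finished up to n of the oldest requests."""
        for _ in range(min(n, len(self.inflight))):
            self.done.append(self.inflight.popleft())

    def poll(self, max_completions=0):
        """Reap finished requests and run callbacks; 0 means no limit."""
        reaped = 0
        while self.done and (max_completions == 0 or reaped < max_completions):
            io_id, callback = self.done.popleft()
            callback(io_id)
            reaped += 1
        return reaped


def submit_all(channel, io_ids, callback):
    """Submit every I/O, polling for completions whenever the channel is full."""
    pending = deque(io_ids)
    while pending:
        rc = channel.submit(pending[0], callback)
        if rc == 0:
            pending.popleft()
        elif rc == -errno.ENOMEM:
            channel.device_tick(n=4)  # stand-in for real time passing
            channel.poll()            # callbacks run here, not inside submit()
        else:
            raise OSError(-rc, "submit failed")
    # Drain whatever is still outstanding.
    while channel.inflight or channel.done:
        channel.device_tick(n=4)
        channel.poll()
```

With a queue depth of 8, submit_all(ToyChannel(8), range(100), print) gets all
100 requests through even though only 8 can be outstanding at once. A poll can
still reap nothing if the device has not finished anything yet, so a real
application has to keep polling rather than assume one poll frees a slot.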


All the best!
Zhengyu

On 1/11/18 4:53 AM, Walker, Benjamin wrote:
> On Wed, 2018-01-10 at 19:28 +0000, Andrey Kuzmin wrote:
>> On Wed, Jan 10, 2018, 20:17 Walker, Benjamin <benjamin.walker(a)intel.com>
>> wrote:
>>> On Wed, 2018-01-10 at 17:00 +0000, Andrey Kuzmin wrote:
>>>> It appears quite logical to start submission with a check for pending
>>>> completions, doesn't it? Or check for completions if downstream bdev
>>>> returns busy status. That would definitely meet app expectations whatever
>>>> the request pool size is.
>>>
>>> We've considered checking for completions inside the submission path if we
>>> would otherwise return ENOMEM. So far, we've decided not to go that
>>> direction for two reasons.
>>>
>>> 1) Even if we do this, there are still cases where we'll return ENOMEM. For
>>> instance, if there are no completions to reap yet.
>>
>> While theoretically possible, such a case is problematic to imagine in
>> practice.
> 
> The user has 512 queue depth available and is submitting I/O in a tight loop.
> The submission path through the blobstore and into the NVMe driver probably
> takes on the order of 500ns to run. That means you can submit your full queue
> depth worth in 256us. On many NAND SSDs that's well within P99 latency
> expectations for 4KiB I/O, and it gets increasingly likely with larger I/O to
> the point where it is almost guaranteed to happen with 128KiB requests. The user
> is free to reduce the available queue depth to save memory as well.
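
A quick check of the arithmetic above, as a small Python sketch. The 512 queue
depth and 500 ns per-submission cost are the figures quoted; the function names
are made up, and the comparison is simplified (it treats the first request as
submitted at time zero and ignores everything except a single device latency).

```python
def time_to_fill_queue_us(queue_depth, submit_cost_ns):
    """Time to submit queue_depth requests back to back, in microseconds."""
    return queue_depth * submit_cost_ns / 1000


def fills_before_first_completion(queue_depth, submit_cost_ns, device_latency_us):
    """True if the queue is exhausted before the first request can complete."""
    return time_to_fill_queue_us(queue_depth, submit_cost_ns) < device_latency_us


fill_us = time_to_fill_queue_us(512, 500)  # 512 * 500 ns = 256 us
```

So any request that takes longer than 256 us to complete leaves a tight
submission loop with a full queue and nothing to reap yet.
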
> 
>>> 2) This would result in completion callbacks in response to a submit call.
>>> Today, the expectations are set that completions are called in response to a
>>> poll call only.
>>
>> Feel free to correct me if I'm wrong, but my recollection is that completion
>> callback may be called on submission path in case of error.
> 
> I just checked and for the nvme and bdev libraries an error code will be given
> to the user as the return code for the function. The callback will not be called
> because the failure is known immediately. For the blobstore library it works the
> opposite way - the functions have no return code and instead always call the
> user callback. I think this is probably a design mistake on my part. For these
> ENOMEM cases, we need to return that to the user as a return code. That makes it
> much easier to handle the situation and makes it consistent with the other
> libraries.
> 
>> The case in question is, apparently, a corner one as application must check
>> for completions if bdev returns busy status. One cannot run an unlimited rate
>> client atop a rate-limited server w/o a poll enforced at some point.
>>
>> It might also be helpful to add a parameter to the poll call specifying the
>> minimum number of completions to reap before returning control to the app, to
>> deal with deadlocks like this one.
> 
> There already is a parameter that limits the number of completions reaped in a
> single poll call. Even if you don't specify a limit, the drivers enforce
> sensible limits by default.
> 
>>
>> Regards,
>> Andrey
>>
>>> _______________________________________________
>>> SPDK mailing list
>>> SPDK(a)lists.01.org
>>> https://lists.01.org/mailman/listinfo/spdk