From: Walker, Benjamin <benjamin.walker at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] Problem with Blobstore when write 65MB continously
Date: Wed, 10 Jan 2018 20:53:39 +0000	[thread overview]
Message-ID: <1515617617.6063.79.camel@intel.com> (raw)
In-Reply-To: CANvN+ek=pdYkE0JSf5eycgyeFzpPy=sOEjT=1xD0+1yTkTbBqg@mail.gmail.com

On Wed, 2018-01-10 at 19:28 +0000, Andrey Kuzmin wrote:
> On Wed, Jan 10, 2018, 20:17 Walker, Benjamin <benjamin.walker(a)intel.com>
> wrote:
> > On Wed, 2018-01-10 at 17:00 +0000, Andrey Kuzmin wrote:
> > > It appears quite logical to start submission with a check for pending
> > > completions, doesn't it? Or check for completions if downstream bdev
> > > returns busy status. That would definitely meet app expectations whatever
> > > the request pool size is.
> > 
> > We've considered checking for completions inside the submission path if we
> > would otherwise return ENOMEM. So far, we've decided not to go that
> > direction for two reasons.
> > 
> > 1) Even if we do this, there are still cases where we'll return ENOMEM. For
> > instance, if there are no completions to reap yet.
> 
> While theoretically possible, such a case is problematic to imagine in
> practice.

The user has a queue depth of 512 available and is submitting I/O in a tight
loop. The submission path through the blobstore and into the NVMe driver
probably takes on the order of 500ns to run. That means you can submit your full
queue depth's worth in 256us (512 x 500ns). On many NAND SSDs that's well within
P99 latency expectations for 4KiB I/O, so the queue can fill before anything has
completed. That gets increasingly likely with larger I/O, to the point where it
is almost guaranteed to happen with 128KiB requests. The user is also free to
reduce the available queue depth to save memory, which makes the queue fill
sooner still.
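
As a rough illustration of that arithmetic, here is a minimal sketch in C. The
function names are made up for this example and are not part of SPDK, and any
device latency you pass in is an assumption of yours, not a measurement.

```c
#include <stdint.h>

/* Time in nanoseconds to submit `queue_depth` requests back to back when
 * each submit call costs roughly `submit_ns` (hypothetical helper). */
uint64_t queue_fill_time_ns(uint32_t queue_depth, uint64_t submit_ns)
{
    return (uint64_t)queue_depth * submit_ns;
}

/* Returns 1 if the whole queue can be filled before the first request,
 * submitted at time zero, has had time to complete. In that case a poll
 * issued from the submission path would find nothing to reap.
 * `device_latency_ns` is an assumed per-I/O latency. */
int fills_before_first_completion(uint32_t queue_depth, uint64_t submit_ns,
                                  uint64_t device_latency_ns)
{
    return queue_fill_time_ns(queue_depth, submit_ns) < device_latency_ns;
}
```

With the figures from this mail, queue_fill_time_ns(512, 500) gives 256000 ns,
i.e. 256us. Any I/O whose latency exceeds that leaves the submitter with a full
queue and no completions to reap.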

> > 2) This would result in completion callbacks in response to a submit call.
> > Today, the expectations are set that completions are called in response to a
> > poll call only.
> 
> Feel free to correct me if I'm wrong, but my recollection is that completion
> callback may be called on submission path in case of error.

I just checked and for the nvme and bdev libraries an error code will be given
to the user as the return code for the function. The callback will not be called
because the failure is known immediately. For the blobstore library it works the
opposite way - the functions have no return code and instead always call the
user callback. I think this is probably a design mistake on my part. For these
ENOMEM cases, we need to return that to the user as a return code. That makes it
much easier to handle the situation and makes it consistent with the other
libraries.
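
To make the convention concrete, here is a toy model in C of "return the error
from submit, call the callback only from poll". Everything in it (toy_channel,
toy_submit, toy_poll, toy_submit_or_poll, the pool size of 4) is hypothetical
and only illustrates the pattern; it is not the blobstore, bdev, or nvme API.

```c
#include <errno.h>

#define TOY_POOL_SIZE 4u /* hypothetical request pool size */

typedef void (*toy_cb)(void *ctx, int status);

struct toy_request {
    toy_cb cb;
    void *ctx;
};

struct toy_channel {
    struct toy_request pending[TOY_POOL_SIZE];
    unsigned count; /* requests currently outstanding */
};

/* Returns 0 if the request was queued, or -ENOMEM right away when the pool
 * is exhausted. The callback is never invoked from the submit path. */
int toy_submit(struct toy_channel *ch, toy_cb cb, void *ctx)
{
    if (ch->count == TOY_POOL_SIZE) {
        return -ENOMEM;
    }
    ch->pending[ch->count].cb = cb;
    ch->pending[ch->count].ctx = ctx;
    ch->count++;
    return 0;
}

/* Completes every outstanding request with status 0. This is the only place
 * callbacks run. Callbacks must not resubmit in this toy model.
 * Returns the number of requests completed. */
unsigned toy_poll(struct toy_channel *ch)
{
    unsigned n = ch->count;
    for (unsigned i = 0; i < n; i++) {
        ch->pending[i].cb(ch->pending[i].ctx, 0);
    }
    ch->count = 0;
    return n;
}

/* Caller-side handling: on -ENOMEM, poll and retry. In this toy model a poll
 * always frees the pool; a real device may have nothing to reap yet, so a
 * real application would keep polling or do other work in between. */
int toy_submit_or_poll(struct toy_channel *ch, toy_cb cb, void *ctx)
{
    int rc = toy_submit(ch, cb, ctx);
    while (rc == -ENOMEM) {
        toy_poll(ch);
        rc = toy_submit(ch, cb, ctx);
    }
    return rc;
}

/* Example callback: counts successful completions in the int behind ctx. */
void count_completion(void *ctx, int status)
{
    if (status == 0) {
        (*(int *)ctx)++;
    }
}
```

The point of the return code is that the caller learns about ENOMEM at the call
site and can decide what to do there, rather than having to untangle it inside
a completion callback.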

> The case in question is, apparently, a corner one as application must check
> for completions if bdev returns busy status. One cannot run an unlimited rate
> client atop a rate-limited server w/o a poll enforced at some point.
> 
> It might also be helpful to add a parameter to the poll call specifying the
> minimum number of completions to reap before returning control to the app, to
> deal with deadlocks like this one.

There already is a parameter that limits the number of completions reaped in a
single poll call. Even if you don't specify a limit, the drivers enforce
sensible limits by default.
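
A minimal sketch of that behavior in C, using made-up names (sketch_queue,
sketch_poll) and an assumed default cap of 128; none of these are SPDK
identifiers or real driver values. In the NVMe driver the corresponding knob
is, to my knowledge, the max_completions argument of
spdk_nvme_qpair_process_completions(), where 0 means the caller sets no limit.

```c
#include <stdint.h>

#define SKETCH_DEFAULT_BATCH 128u /* assumed driver-side cap, illustrative */

struct sketch_queue {
    uint32_t ready; /* completions waiting to be reaped */
};

/* Reaps at most `max_completions` completions, where 0 means "no caller
 * limit". The driver-side cap applies either way. Never waits: if nothing
 * is ready, it returns 0. Returns the number of completions reaped. */
uint32_t sketch_poll(struct sketch_queue *q, uint32_t max_completions)
{
    uint32_t limit = max_completions;

    if (limit == 0 || limit > SKETCH_DEFAULT_BATCH) {
        limit = SKETCH_DEFAULT_BATCH;
    }

    uint32_t reaped = q->ready < limit ? q->ready : limit;
    q->ready -= reaped;
    return reaped;
}
```

Note that the parameter is an upper bound. A poll that finds nothing ready
returns immediately instead of waiting for some minimum count.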

> 
> Regards,
> Andrey
> 
> > _______________________________________________
> > SPDK mailing list
> > SPDK(a)lists.01.org
> > https://lists.01.org/mailman/listinfo/spdk
