From: Andrey Kuzmin <andrey.v.kuzmin at gmail.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being called sometimes
Date: Wed, 06 Jul 2016 21:47:05 +0000
Message-ID: <CANvN+enZPuZL0Xk+AVhwEemkTeANr4ef7nMyA=eyVXAhvC6R+w@mail.gmail.com>
In-Reply-To: 1467841201.5999.222.camel@intel.com


Will,

The issue I was referring to was resolved by the following commit:
https://github.com/spdk/spdk/commit/eb555b139e56b3c0b7f508906bcfde5bda844592

On Thu, Jul 7, 2016, 00:40 Walker, Benjamin <benjamin.walker(a)intel.com> wrote:

> On Wed, 2016-07-06 at 21:00 +0000, Will Del Genio wrote:
> > Ben,
> >
> > Thanks, you explained that very well.  I’m working with a random 4k
> > read-only workload at queue depth 256.
>
> Can you try capping your queue depth at 128? That's the maximum number of
> I/Os we allow outstanding at the hardware level. The NVMe driver should be
> queueing anything beyond that in software automatically, but this data
> point will help narrow down the problem.
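>
> For illustration, capping in-flight I/O on the application side can be as
> simple as the sketch below. The names MAX_QD, outstanding, and
> submit_one_read are made up for the example; only spdk_nvme_ns_cmd_read and
> the completion callback type are SPDK API.
>
>     #include <stdint.h>
>     #include <spdk/nvme.h>
>
>     #define MAX_QD 128              /* stay at or below the hardware queue depth */
>
>     static uint32_t outstanding;    /* reads submitted but not yet completed */
>
>     static void read_done(void *cb_arg, const struct spdk_nvme_cpl *cpl)
>     {
>         outstanding--;              /* completion callback: one less in flight */
>     }
>
>     static void submit_one_read(struct spdk_nvme_ns *ns,
>                                 struct spdk_nvme_qpair *qpair,
>                                 void *buf, uint64_t lba, uint32_t lba_count)
>     {
>         if (outstanding >= MAX_QD) {
>             return;                 /* at the cap: wait for completions first */
>         }
>         /* a failed submission never invokes the callback, so only count
>          * successful submissions */
>         if (spdk_nvme_ns_cmd_read(ns, qpair, buf, lba, lba_count,
>                                   read_done, NULL, 0) == 0) {
>             outstanding++;
>         }
>     }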
>
> > I’m using 4 drives and one thread per drive.  If it’s true that there are
> > just no more completions to handle, then I will recheck the code I wrote
> > to keep track of the number of outstanding read requests.
> >
> > Here is the qpair:
> > (gdb) p *qpair
> > $1 = {sq_tdbl = 0x7ffff7ff2008, cq_hdbl = 0x7ffff7ff200c,
> >   cmd = 0x7fffdaae0000, cpl = 0x7fffdaade000, free_tr = {lh_first = 0x0},
> >   outstanding_tr = {lh_first = 0x7fffdaad8000}, tr = 0x7fffdaa5d000,
> >   queued_req = {stqh_first = 0x7fffea9a1780, stqh_last = 0x7fffea9cdf40},
> >   id = 1, num_entries = 256, sq_tail = 249, cq_head = 121,
> >   phase = 0 '\000', is_enabled = true, sq_in_cmb = false,
>
> Note how sq_tail - cq_head is 128 (249 - 121), meaning the driver believes
> there are 128 commands outstanding. The driver's view of the world
> (commands outstanding) doesn't line up with us not getting any NVMe
> completions - there is definitely a problem here.
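>
> With wraparound of the circular queue taken into account, that bookkeeping
> is just the following - shown purely as an illustration of the arithmetic,
> not driver code:
>
>     uint16_t sq_tail = 249, cq_head = 121, num_entries = 256;  /* values from the dump */
>     /* commands the driver believes are still in flight on this qpair */
>     uint16_t inflight = (sq_tail - cq_head + num_entries) % num_entries;   /* == 128 */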
>
> >   qprio = 0 '\000', ctrlr = 0x7fffddca6740,
> >   tailq = {tqe_next = 0x0, tqe_prev = 0x7fffddca7b70},
> >   cmd_bus_addr = 35430203392, cpl_bus_addr = 35430195200}
> > (gdb) p qpair->phase
> > $2 = 0 '\000'
> > (gdb) p qpair->cpl[qpair->cq_head]
> > $3 = {cdw0 = 0, rsvd1 = 0, sqhd = 132, sqid = 1, cid = 112,
> >   status = {p = 1, sc = 0, sct = 0, rsvd2 = 0, m = 0, dnr = 0}}
>
> Can you print out the following 3 things:
> - qpair->cpl[qpair->cq_head - 1]
> - qpair->sq_tdbl
> - qpair->cq_hdbl
>
> >
> > -Will
> >
> > On 7/6/16, 3:50 PM, "SPDK on behalf of Walker, Benjamin"
> > <spdk-bounces(a)lists.01.org on behalf of benjamin.walker(a)intel.com> wrote:
> >
> > On Wed, 2016-07-06 at 20:33 +0000, Will Del Genio wrote:
> > > Andrey,
> > >
> > > I was able to step into the spdk_nvme_qpair_process_completions()
> > > function with gdb and found that the reason it isn't returning any
> > > completions is that this check is failing at line 469:
> > > if (cpl->status.p != qpair->phase)
> > >
> > > Relevant gdb info here:
> > > Thread 4 "xg:nvmeIo:9" hit Breakpoint 5, spdk_nvme_qpair_process_completions
> > > (qpair=0x7ffff0a5baa0, max_completions=0) at nvme_qpair.c:463
> > > 463   in nvme_qpair.c
> > > (gdb) p qpair->cpl[qpair->cq_head]
> > > $11 = {cdw0 = 0, rsvd1 = 0, sqhd = 51, sqid = 1, cid = 27,
> > >   status = {p = 0, sc = 0, sct = 0, rsvd2 = 0, m = 0, dnr = 0}}
> > > (gdb) p qpair->phase
> > > $12 = 1 '\001'
> > >
> > > What does this mean?  Does this information help at all?
> >
> > The NVMe hardware queue pairs consist of two arrays - one of commands and
> > the other of responses - and a set of head and tail indices. The arrays
> > are circular, so you can loop back around to the beginning. Each entry in
> > the array contains a phase bit, which is either 1 or 0. On the first pass
> > through the array, new entries in the queue are marked by setting their
> > phase to 1. On the next pass through the array, new elements are marked
> > by setting their phase bit to 0, and so on. The current iteration's
> > expected phase value is stored in qpair->phase. So the code you are
> > looking at on lines 467-470 is basically saying:
> >
> > 1) Grab the completion entry at the head of the queue
> > 2) Check its phase bit. If it hasn't toggled, there is no new completion,
> >    so exit (see the sketch below)
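> >
> > Roughly, in code (a simplified sketch of that logic, not the exact
> > nvme_qpair.c source):
> >
> >     struct spdk_nvme_cpl *cpl = &qpair->cpl[qpair->cq_head];
> >
> >     if (cpl->status.p != qpair->phase) {
> >         /* The phase bit at cq_head hasn't toggled, so no new completion
> >          * has arrived; there is nothing to process on this call. */
> >         return 0;
> >     }
> >
> >     /* ...otherwise: invoke the callback for the completed command,
> >      * advance cq_head, and flip qpair->phase whenever cq_head wraps
> >      * back around to 0. */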
> >
> > All that means is that there are no completions outstanding for any
> > commands according to the SSD, which doesn't narrow much down. At the
> > point where you are broken in, can you just dump out the whole qpair
> > structure? Something like "p *qpair" should do it. That way I can see if
> > there are any commands actually pending at the device and what state the
> > device is in. My expectation is that there aren't any commands
> > outstanding.
> >
> > Can you also provide some background as to what kind of I/O you're
> > submitting (read or write, size, queue depth, etc.) when this occurs?
> >
> > > -Will
> > >
> > > From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
> > > Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
> > > Date: Wednesday, July 6, 2016 at 2:35 PM
> > > To: Storage Performance Development Kit <spdk(a)lists.01.org>
> > > Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being called sometimes
> > >
> > > On Wed, Jul 6, 2016, 20:56 Will Del Genio <wdelgenio(a)xeograph.com> wrote:
> > >
> > > Andrey,
> > > That sounds exactly like what we are experiencing; however, we’re working
> > > off the spdk codebase that was current as of last week and are still
> > > experiencing the issue.  Do you know what the resource allocation fault
> > > was and how we might be able to determine if that is still occurring?
> > >
> > >
> > >
> > > I'll take a look at the commit logs, both SPDK's and mine, and will get
> > > back to you.
> > >
> > >
> > >
> > > Regards,
> > >
> > > Andrey
> > >
> > > Ben,
> > > We’re ASSERTing that the result of spdk_nvme_ns_cmd_read() == 0.  If I
> > > set our queue depth high enough it will fail that assertion, as would be
> > > expected.  Whatever other failure we’re experiencing does not seem to be
> > > causing spdk_nvme_ns_cmd_read() to return an error code.
> > >
> > > Also, I performed some tests with the spdk perf tool and was not able to
> > > replicate our problem.  It ran fine at various queue depths and core
> > > masks.  When the qd was set too high, it failed gracefully with an error
> > > message.  This is all as expected.
> > >
> > > I’d like to continue down the path of investigating whether some resource
> > > allocation or something else is failing silently for us.  Any specific
> > > ideas?
> > >
> > > Thanks!
> > > Will
> > >
> > > From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
> > > Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
> > > Date: Wednesday, July 6, 2016 at 12:01 PM
> > > To: Storage Performance Development Kit <spdk(a)lists.01.org>
> > > Subject: Re: [SPDK] Callback passed to spdk_nvme_ns_cmd_read not being called sometimes
> > >
> > > On Wed, Jul 6, 2016 at 6:35 PM, Walker, Benjamin <benjamin.walker(a)intel.com> wrote:
> > >
> > > Hi Will,
> > >
> > >
> > >
> > > Since I can't see the code for your application, I'd like to try to
> > > reproduce the problem with code that I have some visibility into. Are
> > > you able to reproduce the problem using our perf tool
> > > (examples/nvme/perf)? If you aren't, this is likely a problem with your
> > > test application and not SPDK.
> > >
> > > I had been witnessing a similar issue with an earlier SPDK release, back
> > > around February, where the submit call was failing due to a resource
> > > allocation fault and neither returning an error nor invoking the
> > > callback, but my issue has been fixed in a recent release (I can't recall
> > > the actual commit, but there definitely was one dealing exactly with this
> > > cause).
> > >
> > > Based on the symptoms, my best guess is that your memory pool ran out of
> > > request objects. The first thing to check is whether
> > > spdk_nvme_ns_cmd_read failed. If it fails, it won't call the callback.
> > > You can check for failure by looking at the return value - see the
> > > documentation here. Your application allocates this memory pool up front
> > > - all of our examples allocate 8k requests (see line 1097 in
> > > examples/nvme/perf/perf.c). You need to allocate a large enough pool to
> > > handle the maximum number of outstanding requests you plan to have. We
> > > recently added a "hello_world" style example for the NVMe driver at
> > > https://github.com/spdk/spdk/tree/master/examples/nvme/hello_world
> > > with tons of comments. One of the comments explains this memory pool in
> > > detail.
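> > >
> > > For reference, a rough sketch of that up-front allocation, along the
> > > lines of what the examples do (the 8192 count and 128 cache size here
> > > are illustrative, not required values - size the pool to your own
> > > maximum outstanding I/O):
> > >
> > >     #include <rte_config.h>
> > >     #include <rte_memory.h>
> > >     #include <rte_mempool.h>
> > >     #include <spdk/nvme.h>
> > >
> > >     /* The driver draws its request objects from this pool, so it must
> > >      * hold at least as many entries as the maximum total outstanding
> > >      * I/O across all queue pairs. */
> > >     struct rte_mempool *request_mempool;
> > >
> > >     static int alloc_request_pool(void)
> > >     {
> > >         request_mempool = rte_mempool_create("nvme_request", 8192,
> > >                                              spdk_nvme_request_size(),
> > >                                              128, 0, NULL, NULL, NULL,
> > >                                              NULL, SOCKET_ID_ANY, 0);
> > >         return request_mempool != NULL ? 0 : -1;
> > >     }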
> > >
> > >
> > >
> > > That memory pool allocation is a bit of a wart on our otherwise clean
> > > API. We're looking at different strategies to clean that up. Let me know
> > > what the result of the debugging is and I'll shoot you some more ideas
> > > to try if necessary.
> > >
> > > Are there any plans regarding the global request pool rework?
> > >
> > >
> > >
> > > Regards,
> > >
> > > Andrey
> > >
> > > Thanks,
> > >
> > > Ben
> > >
> > >
> > >
> > > On Tue, 2016-07-05 at 21:03 +0000, Will Del Genio wrote:
> > >
> > > Hello,
> > > We have written a test application that is utilizing the spdk library to
> > > benchmark a set of 3 Intel P3700 drives and a single 750 drive
> > > (concurrently).  We’ve done some testing using fio and the kernel nvme
> > > drivers and have had no problem achieving the claimed IOPS (4k random
> > > read) of all drives on our system.
> > >
> > > What we have found during our testing is that spdk will sometimes start
> > > to silently fail to call the callback passed to spdk_nvme_ns_cmd_read in
> > > the following situations:
> > > 1. Testing a single drive and passing in 0 for max_completions to
> > >    spdk_nvme_qpair_process_completions().  We haven’t seen any issues
> > >    with single-drive testing when max_completions was > 0.
> > > 2. Testing all four drives at once will result in one drive failing to
> > >    receive callbacks, seemingly regardless of what number we pass for
> > >    max_completions (1 through 128).
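> > >
> > > (A minimal sketch of the kind of per-drive polling loop in question -
> > > this is not our actual code, and submit_reads_up_to_qd is a made-up
> > > helper named only for illustration; the second argument to
> > > spdk_nvme_qpair_process_completions is max_completions, where 0 means
> > > "process everything currently available":)
> > >
> > >     #include <stdbool.h>
> > >     #include <spdk/nvme.h>
> > >
> > >     /* made-up helper: top the qpair back up to the target queue depth */
> > >     void submit_reads_up_to_qd(struct spdk_nvme_ns *ns,
> > >                                struct spdk_nvme_qpair *qpair);
> > >
> > >     static void poll_drive(struct spdk_nvme_ns *ns,
> > >                            struct spdk_nvme_qpair *qpair,
> > >                            volatile bool *done)
> > >     {
> > >         while (!*done) {
> > >             submit_reads_up_to_qd(ns, qpair);
> > >             /* reap completions; 0 = no cap on how many per call */
> > >             spdk_nvme_qpair_process_completions(qpair, 0);
> > >         }
> > >     }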
> > >
> > > Here are other observations we’ve made:
> > > - When the callbacks fail to be called for a drive, they fail to be
> > >   called for the remaining duration of the test.
> > > - The drive that ‘fails’ when testing 4 drives concurrently varies from
> > >   test to test.
> > > - ‘Failure’ of a drive seems to be correlated with the number of
> > >   outstanding read operations, though it is not a strict correlation.
> > >
> > > Our system is a dual-socket E5-2630 v3.  One drive is on a PCI slot for
> > > CPU 0 and the other 3 are on PCI slots on CPU 1.  The master/slave
> > > threads are on the same cpu socket as the nvme device they are talking
> > > to.
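> > >
> > > (As an aside, the socket a given NVMe device hangs off can be read from
> > > sysfs - a small illustrative helper, not part of our application:)
> > >
> > >     #include <stdio.h>
> > >
> > >     /* Returns the NUMA node of a PCI device (e.g. "0000:81:00.0"), or
> > >      * -1 if it can't be determined. The polling thread and its buffers
> > >      * should live on the same node. */
> > >     static int pci_numa_node(const char *bdf)
> > >     {
> > >         char path[256];
> > >         int node = -1;
> > >         snprintf(path, sizeof(path),
> > >                  "/sys/bus/pci/devices/%s/numa_node", bdf);
> > >         FILE *f = fopen(path, "r");
> > >         if (f != NULL) {
> > >             if (fscanf(f, "%d", &node) != 1) {
> > >                 node = -1;
> > >             }
> > >             fclose(f);
> > >         }
> > >         return node;
> > >     }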
> > >
> > > We’d like to know what is causing this issue and what we can do to help
> > > investigate the problem.  What other information can we provide?  Is
> > > there some part of the spdk code that we can look at to help determine
> > > the cause?
> > >
> > > Thanks,
> > > Will
> > >
> > > --
> > > Regards,
> > > Andrey
> >
>
-- 

Regards,
Andrey
