From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org, hch@lst.de
Subject: Re: nvme tcp receive errors
Date: Mon, 3 May 2021 13:25:45 -0700
Message-ID: <20210503202545.GB910455@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <96afc178-f5db-dd56-59a6-467e032c619a@grimberg.me>

On Mon, May 03, 2021 at 12:58:23PM -0700, Sagi Grimberg wrote:
> 
> > > > The driver tracepoints captured millions of IOs where everything
> > > > happened as expected, so I really think something got confused and
> > > > mucked with the wrong request. I've added more trace points to increase
> > > > visibility because I frankly didn't find how that could happen just from
> > > > code inspection. We will also incorporate your patch below for the next
> > > > recreate.
> > > 
> > > Keith, does the issue still happen after eliminating the network send
> > > from .queue_rq()?
> > 
> > This patch is successful at resolving the observed r2t issues after the
> > weekend test run, which is much longer than it could have run
> > previously. I'm happy we're narrowing this down, but I'm not seeing how
> > this addresses the problem. It looks like the mutex single-threads the
> > critical parts, but maybe I'm missing something. Any ideas?
> 
> Not yet, but note that the send part is mutually exclusive but the
> receive context is where we handle the r2t, validate length/offset
> and (re)queue the request for sending a h2cdata pdu back to the
> controller.
>
> The network send was an optimization for latency, and then I modified
> the queueing in the driver such that a request would first go to llist
> and then the sending context (either io_work or .queue_rq) would reap it
> to a local send_list. This helps the driver get a better understanding of
> what is in flight so that it can better set network msg flags for EOR/MORE.
> 
> My assumption is that maybe somehow we send the initial command
> pdu to the controller from queue_rq, receive the r2t back before the
> .queue_rq context has completed and something may not be coherent.

Interesting. The network traces look correct, so my thoughts jumped to
possibly incorrect usage of PCIe relaxed ordering, but that appears to
be disabled. I'll keep looking for other possibilities.
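
For readers following the thread, the receive-side length/offset check
mentioned in the quoted text can be pictured with a small userspace
sketch. This is not the driver's code: struct req_state, its fields, and
r2t_is_valid() are hypothetical names, and the exact rules (reject an
offset behind what was already sent, reject a range past the end of the
payload) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-request state; field names are illustrative only. */
struct req_state {
    uint32_t data_len;   /* total bytes the write command carries */
    uint32_t data_sent;  /* bytes already sent to the controller */
};

/*
 * Return true if an R2T asking for r2t_length bytes starting at
 * r2t_offset is consistent with the request's progress so far.
 * The rules below are assumptions for this sketch.
 */
static bool r2t_is_valid(const struct req_state *req,
                         uint32_t r2t_offset, uint32_t r2t_length)
{
    if (r2t_length == 0)
        return false;   /* nothing requested */
    if (r2t_offset < req->data_sent)
        return false;   /* asks again for data already sent */
    /* Widen to 64 bits so offset + length cannot wrap around. */
    if ((uint64_t)r2t_offset + r2t_length > req->data_len)
        return false;   /* range runs past the end of the payload */
    return true;
}
```

If a check of this kind fails on a request that another context is still
updating, the R2T would look bogus even though the bytes on the wire are
fine, which is one way the traces could look correct while the host
still reports an error.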

> Side question, are you running with a fully preemptible kernel? Or
> fewer NVMe queues than CPUs?

Voluntary preempt. This test is using the kernel config from Ubuntu
20.04.

There are 16 CPUs in this setup with just 7 IO queues.
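
With more CPUs than queues, several submitters can feed the same queue,
which is where the staging-list scheme described in the quoted text
comes in. The following is a rough userspace sketch of that idea using
C11 atomics as a stand-in for the kernel's llist. struct req,
struct queue, queue_stage() and queue_reap() are hypothetical names, and
the "more" flag is only a placeholder for the EOR/MORE hint; none of
this is taken from the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request; not the driver's request structure. */
struct req {
    int id;
    bool more;          /* placeholder for a "more data follows" hint */
    struct req *next;
};

/* Hypothetical queue: lock-free staging list plus a private send list. */
struct queue {
    _Atomic(struct req *) staged;   /* any submitter may push here */
    struct req *send_list;          /* touched only by the sending context */
};

/* Submitter side: push one request onto the staging list, no lock taken. */
static void queue_stage(struct queue *q, struct req *r)
{
    struct req *head = atomic_load(&q->staged);

    do {
        r->next = head;
    } while (!atomic_compare_exchange_weak(&q->staged, &head, r));
}

/*
 * Sender side: detach everything staged so far, restore submission
 * order, append it to send_list, and flag every request except the last
 * as having more to follow. Returns the number of requests reaped.
 */
static size_t queue_reap(struct queue *q)
{
    struct req *lifo = atomic_exchange(&q->staged, NULL);
    struct req *fifo = NULL;
    struct req **tail = &q->send_list;
    size_t n = 0;

    while (lifo) {                  /* pushes are LIFO; reverse to FIFO */
        struct req *next = lifo->next;

        lifo->next = fifo;
        fifo = lifo;
        lifo = next;
        n++;
    }
    while (*tail)                   /* find the end of the send list */
        tail = &(*tail)->next;
    *tail = fifo;

    for (struct req *r = q->send_list; r; r = r->next)
        r->more = (r->next != NULL);
    return n;
}
```

The sketch assumes a single sending context at a time owns send_list,
matching the "send part is mutually exclusive" remark above; it says
nothing about how the receive path looks at the same request.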

