linux-nvme.lists.infradead.org archive mirror
From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org, hch@lst.de
Subject: Re: nvme tcp receive errors
Date: Tue, 4 May 2021 12:14:41 -0700	[thread overview]
Message-ID: <20210504191441.GA911866@dhcp-10-100-145-180.wdc.com> (raw)
In-Reply-To: <76a715f5-6a37-8535-3fbe-1aa0f3a54dbc@grimberg.me>

On Tue, May 04, 2021 at 11:15:28AM -0700, Sagi Grimberg wrote:
> > > > > > I've looked again at the code, and I'm not convinced that the patch
> > > > > > is needed at all anymore, I'm now surprised that it actually changed
> > > > > > anything (disregarding data digest).
> > > > > > 
> > > > > > The driver does not track the received bytes by definition, it relies
> > > > > > on the controller to send it a completion, or set the success flag in
> > > > > > the _last_ c2hdata pdu. Does your target set
> > > > > > NVME_TCP_F_DATA_SUCCESS on any of the c2hdata pdus?
> > > > > 
> > > > > Perhaps you can also run this patch instead?
> > > > 
> > > > Thanks, will give this a shot.
> > > 
> > > Still would be beneficial to look at the traces and check if
> > > the success flag happens to be set. If this flag is set, the
> > > driver _will_ complete the request without checking the bytes
> > > received thus far (similar to how pci and rdma don't and can't
> > > check dma byte count).
> > 
> > I realized this patch is the same as one you'd sent earlier. We hit the
> > BUG_ON(), and then proceeded to use your follow-up patch, which appeared
> > to fix the data receive problem, but introduced data digest problems.
> > 
> > So, are you saying that hitting this BUG_ON means that the driver has
> > observed the completion out-of-order from the expected data?
> 
> If you hit the BUG_ON it means that the host spotted a c2hdata
> PDU that has the success flag set before all the request data
> was received:
> --
> @@ -759,6 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>                         queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
>                 } else {
>                         if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
> +                               BUG_ON(req->data_received != req->data_len);
>                                 nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
>                                 queue->nr_cqe++;
>                         }
> --

I apologize for the confusion. There is a subtle difference between your
most recent patch and the previous one: here the BUG_ON() is inside the
DATA_SUCCESS branch, and we hadn't actually run with that version. The
BUG_ON() we hit was in the first version, and looking at it now, I
suspect this new location is where you intended it to go.

We'll retest, but I don't think we'll hit the BUG: none of the headers
have the DATA_SUCCESS flag set in the tcp dumps I've seen.

I also see your point about how the original patch shouldn't be needed at
all, and I likewise can't explain why it changed the observed behavior
when data digest was disabled.

Thank you for your patience on this issue. I will get back to you with
more info after circling back with the test group.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply	other threads:[~2021-05-04 19:15 UTC|newest]

Thread overview: 50+ messages
2021-03-31 16:18 nvme tcp receive errors Keith Busch
2021-03-31 19:10 ` Sagi Grimberg
2021-03-31 20:49   ` Keith Busch
2021-03-31 22:16     ` Sagi Grimberg
2021-03-31 22:26       ` Keith Busch
2021-03-31 22:45         ` Sagi Grimberg
2021-04-02 17:11     ` Keith Busch
2021-04-02 17:27       ` Sagi Grimberg
2021-04-05 14:37         ` Keith Busch
2021-04-07 19:53           ` Keith Busch
2021-04-09 21:38             ` Sagi Grimberg
2021-04-27 23:39               ` Keith Busch
2021-04-27 23:55                 ` Sagi Grimberg
2021-04-28 15:58                   ` Keith Busch
2021-04-28 17:42                     ` Sagi Grimberg
2021-04-28 18:01                       ` Keith Busch
2021-04-28 23:06                         ` Sagi Grimberg
2021-04-29  3:33                           ` Keith Busch
2021-04-29  4:52                             ` Sagi Grimberg
2021-05-03 18:51                               ` Keith Busch
2021-05-03 19:58                                 ` Sagi Grimberg
2021-05-03 20:25                                   ` Keith Busch
2021-05-04 19:29                                     ` Sagi Grimberg
2021-04-09 18:04           ` Sagi Grimberg
2021-04-14  0:29             ` Keith Busch
2021-04-21  5:33               ` Sagi Grimberg
2021-04-21 14:28                 ` Keith Busch
2021-04-21 16:59                   ` Sagi Grimberg
2021-04-26 15:31                 ` Keith Busch
2021-04-27  3:10                   ` Sagi Grimberg
2021-04-27 18:12                     ` Keith Busch
2021-04-27 23:58                       ` Sagi Grimberg
2021-04-30 23:42                         ` Sagi Grimberg
2021-05-03 14:28                           ` Keith Busch
2021-05-03 19:36                             ` Sagi Grimberg
2021-05-03 19:38                               ` Sagi Grimberg
2021-05-03 19:44                                 ` Keith Busch
2021-05-03 20:00                                   ` Sagi Grimberg
2021-05-04 14:36                                     ` Keith Busch
2021-05-04 18:15                                       ` Sagi Grimberg
2021-05-04 19:14                                         ` Keith Busch [this message]
2021-05-10 18:06                                           ` Keith Busch
2021-05-10 18:18                                             ` Sagi Grimberg
2021-05-10 18:30                                               ` Keith Busch
2021-05-10 21:07                                                 ` Sagi Grimberg
2021-05-11  3:00                                                   ` Keith Busch
2021-05-11 17:17                                                     ` Sagi Grimberg
2021-05-13 15:48                                                       ` Keith Busch
2021-05-13 19:53                                                         ` Sagi Grimberg
2021-05-17 20:48                                                           ` Keith Busch
