From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org, hch@lst.de
Subject: Re: nvme tcp receive errors
Date: Tue, 4 May 2021 12:14:41 -0700
Message-ID: <20210504191441.GA911866@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <76a715f5-6a37-8535-3fbe-1aa0f3a54dbc@grimberg.me>

On Tue, May 04, 2021 at 11:15:28AM -0700, Sagi Grimberg wrote:
> > > > > > I've looked again at the code, and I'm not convinced that the patch
> > > > > > is needed at all anymore, I'm now surprised that it actually changed
> > > > > > anything (disregarding data digest).
> > > > > >
> > > > > > The driver does not track the received bytes by definition, it relies
> > > > > > on the controller to send it a completion, or set the success flag in
> > > > > > the _last_ c2hdata pdu. Does your target set
> > > > > > NVME_TCP_F_DATA_SUCCESS on any of the c2hdata pdus?
> > > > >
> > > > > Perhaps you can also run this patch instead?
> > > >
> > > > Thanks, will give this a shot.
> > >
> > > Still would be beneficial to look at the traces and check if
> > > the success flag happens to be set. If this flag is set, the
> > > driver _will_ complete the request without checking the bytes
> > > received thus far (similar to how pci and rdma don't and can't
> > > check dma byte count).
> >
> > I realized this patch is the same as one you'd sent earlier. We hit the
> > BUG_ON(), and then proceeded to use your follow-up patch, which appeared
> > to fix the data receive problem, but introduced data digest problems.
> >
> > So, are you saying that hitting this BUG_ON means that the driver has
> > observed the completion out-of-order from the expected data?
>
> If you hit the BUG_ON it means that the host spotted a c2hdata
> PDU that has the success flag set before all the request data
> was received:
> --
> @@ -759,6 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>  			queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
>  		} else {
>  			if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
> +				BUG_ON(req->data_received != req->data_len);
>  				nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
>  				queue->nr_cqe++;
>  			}
> --

I apologize for the confusion. There is a subtle difference in your most
recent patch request vs. the previous one: the BUG_ON() is within the
DATA_SUCCESS section, and we hadn't actually run with that. We did hit
the BUG_ON() in the first version, and looking at it now, I suspect you
intended to put it in this new location.

We'll retest, but I don't think we'll hit the BUG: none of the headers
have the DATA_SUCCESS flag set in the tcp dumps I've seen.
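
As a rough illustration of what "has the DATA_SUCCESS flag set" means for a
captured header, here is a minimal sketch. It assumes the PDU type and flag
values from the kernel's include/linux/nvme-tcp.h and that the first two bytes
of the common header are the type and flags. The helper names are hypothetical
and are not driver code; the second helper only models the condition the
BUG_ON() above tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/linux/nvme-tcp.h. */
#define NVME_TCP_C2H_DATA        0x07      /* PDU type: controller-to-host data */
#define NVME_TCP_F_DATA_LAST     (1 << 2)
#define NVME_TCP_F_DATA_SUCCESS  (1 << 3)

/*
 * Hypothetical helper: report whether a captured PDU header is a c2hdata
 * PDU with the success flag set. Byte 0 of the common header is the PDU
 * type, byte 1 is the flags field.
 */
static bool c2hdata_has_success(const uint8_t *hdr, size_t len)
{
	assert(hdr != NULL);

	if (len < 2 || hdr[0] != NVME_TCP_C2H_DATA)
		return false;
	return (hdr[1] & NVME_TCP_F_DATA_SUCCESS) != 0;
}

/*
 * Hypothetical model of the BUG_ON() condition: the success flag was seen
 * while the bytes received so far do not match the request length.
 */
static bool success_before_all_data(const uint8_t *hdr, size_t len,
				    uint32_t data_received, uint32_t data_len)
{
	return c2hdata_has_success(hdr, len) && data_received != data_len;
}
```

If c2hdata_has_success() is false for every c2hdata header in a capture, the
host is expected to complete those requests from a separate completion rather
than from the data PDU, and the BUG_ON() in the DATA_SUCCESS branch cannot
trigger.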

And also I see your point about how the original patch shouldn't be
needed at all, and I also don't see why it could have changed the
observation without data digest.

Thank you for your patience on this issue. I will get back to you with
more info after circling back with the test group.