From: Sagi Grimberg <sagi@grimberg.me>
To: Keith Busch <kbusch@kernel.org>
Cc: linux-nvme@lists.infradead.org, hch@lst.de
Subject: Re: nvme tcp receive errors
Date: Tue, 4 May 2021 11:15:28 -0700
Message-ID: <76a715f5-6a37-8535-3fbe-1aa0f3a54dbc@grimberg.me>
In-Reply-To: <20210504143633.GC910455@dhcp-10-100-145-180.wdc.com>


>>>>>>> Hey Keith,
>>>>>>>
>>>>>>> Did this resolve the issues?
>>>>>>
>>>>>> We're unfortunately still observing data digest issues even with this.
>>>>>> Most of the testing has shifted to the r2t error, so I don't have any
>>>>>> additional details on the data digest problem.
>>>>>
>>>>> I've looked again at the code, and I'm not convinced that the patch
>>>>> is needed at all anymore; I'm now surprised that it actually changed
>>>>> anything (disregarding data digest).
>>>>>
>>>>> The driver does not track the received bytes by definition; it relies
>>>>> on the controller to send it a completion, or to set the success flag in
>>>>> the _last_ c2hdata PDU. Does your target set
>>>>> NVME_TCP_F_DATA_SUCCESS on any of the c2hdata PDUs?
>>>>
>>>> Perhaps you can also run this patch instead?
>>>
>>> Thanks, will give this a shot.
>>
>> It would still be beneficial to look at the traces and check whether
>> the success flag happens to be set. If this flag is set, the
>> driver _will_ complete the request without checking the bytes
>> received thus far (similar to how pci and rdma don't, and can't,
>> check the dma byte count).
> 
> I realized this patch is the same as one you'd sent earlier. We hit the
> BUG_ON(), and then proceeded to use your follow-up patch, which appeared
> to fix the data receive problem, but introduced data digest problems.
> 
> So, are you saying that hitting this BUG_ON means that the driver has
> observed the completion out-of-order from the expected data?

If you hit the BUG_ON, it means that the host spotted a c2hdata
PDU that has the success flag set before all of the request data
was received:
--
@@ -759,6 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
                         queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
                 } else {
                         if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
+                               BUG_ON(req->data_received != req->data_len);
                                 nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
                                 queue->nr_cqe++;
                         }
--

This means that the host completes the request immediately, relying on
the controller having sent all the required data and knowing that a
completion response capsule will not be sent.

From the spec:
C2HData PDUs contain a LAST_PDU flag that is set to ‘1’ in the last PDU
of a command data transfer and is cleared to ‘0’ in all other C2HData
PDUs associated with the command. C2HData PDUs also contain a SUCCESS
flag that may be set to ‘1’ in the last C2HData PDU of a command data
transfer to indicate that the command has completed successfully. In
this case, no Response Capsule is sent by the controller for the command 
and the host synthesizes a completion queue entry for the command with
the Command Specific field and the Status Field both cleared to 0h. If
the SUCCESS flag is cleared to ‘0’ in the last C2HData PDU of a command,
then the controller shall send a Response Capsule for the command to the
host. The SUCCESS flag shall be cleared to ‘0’ in all C2HData PDUs that
are not the last C2HData PDU for a command. The SUCCESS flag may be set
to ‘1’ in the last C2HData PDU only if the controller supports disabling
submission queue head pointer updates.
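
In pseudo-code, the rule the host applies to the last C2HData PDU of a
transfer is roughly the following (an illustrative sketch only, not the
actual nvme-tcp driver code; the flag names come from
include/linux/nvme-tcp.h and the helper name is made up):
--
/*
 * Sketch of the spec rule above, not the real driver code path.
 */
static void host_handle_last_c2h_data(struct nvme_tcp_data_pdu *pdu,
				      struct request *rq)
{
	if (!(pdu->hdr.flags & NVME_TCP_F_DATA_LAST))
		return;	/* not the last C2HData PDU of this command */

	if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
		/*
		 * The controller will not send a response capsule, so
		 * the host synthesizes a CQE with the status field
		 * cleared to 0h and completes the request now.
		 */
		nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
	}
	/*
	 * Otherwise the request stays pending until the controller's
	 * response capsule arrives on this queue.
	 */
}
--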

Hence my question: does the controller set NVME_TCP_F_DATA_SUCCESS in
any of the c2hdata PDUs that is not the last one? Does it set it in the
last one and omit the cqe response capsule, as the host expects?
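
If it helps, something like the untested sketch below can scan the
controller-to-host byte stream extracted from a capture (e.g. wireshark
"Follow TCP Stream", saved as raw) and report every C2HData PDU that
carries the SUCCESS flag. It assumes no header/data digests were
negotiated and that the file starts on a PDU boundary; the opcode and
flag values follow the NVMe/TCP spec and include/linux/nvme-tcp.h:
--
#include <stdio.h>
#include <stdint.h>

#define NVME_TCP_C2H_DATA	0x07		/* nvme_tcp_c2h_data */
#define NVME_TCP_F_DATA_LAST	(1 << 2)
#define NVME_TCP_F_DATA_SUCCESS	(1 << 3)

int main(int argc, char **argv)
{
	uint8_t hdr[8];	/* common PDU hdr: type, flags, hlen, pdo, plen */
	long off = 0;
	FILE *f;

	if (argc < 2 || !(f = fopen(argv[1], "rb"))) {
		fprintf(stderr, "usage: %s <raw-c2h-byte-stream>\n", argv[0]);
		return 1;
	}

	while (fread(hdr, 1, sizeof(hdr), f) == sizeof(hdr)) {
		/* plen is little-endian and covers the whole PDU */
		uint32_t plen = hdr[4] | hdr[5] << 8 | hdr[6] << 16 |
				(uint32_t)hdr[7] << 24;

		if (hdr[0] == NVME_TCP_C2H_DATA &&
		    (hdr[1] & NVME_TCP_F_DATA_SUCCESS))
			printf("off %ld: C2HData with SUCCESS, LAST=%d\n",
			       off, !!(hdr[1] & NVME_TCP_F_DATA_LAST));

		/* jump to the next PDU; bail out on a bogus length */
		if (plen < sizeof(hdr) || fseek(f, off + plen, SEEK_SET))
			break;
		off += plen;
	}
	fclose(f);
	return 0;
}
--
A hit with LAST=0 would be a spec violation; a hit with LAST=1 means the
host will complete the command without waiting for a response capsule.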
