From: Boris Pismenny <borispismenny@gmail.com>
To: Sagi Grimberg <sagi@grimberg.me>,
Boris Pismenny <borisp@mellanox.com>,
kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com,
hch@lst.de, axboe@fb.com, kbusch@kernel.org,
viro@zeniv.linux.org.uk, edumazet@google.com
Cc: Yoray Zack <yorayz@mellanox.com>,
Ben Ben-Ishay <benishay@mellanox.com>,
boris.pismenny@gmail.com, linux-nvme@lists.infradead.org,
netdev@vger.kernel.org, Or Gerlitz <ogerlitz@mellanox.com>
Subject: Re: [PATCH net-next RFC v1 07/10] nvme-tcp : Recalculate crc in the end of the capsule
Date: Sun, 8 Nov 2020 16:46:45 +0200 [thread overview]
Message-ID: <d080bd0c-ca1d-42a6-bee7-e6aa4bcb6896@gmail.com> (raw)
In-Reply-To: <a17cf1ca-4183-8f6c-8470-9d45febb755b@grimberg.me>
On 09/10/2020 1:44, Sagi Grimberg wrote:
>> crc offload of the nvme capsule. Check if all the skb bits
>> are on, and if not recalculate the crc in SW and check it.
> Can you clarify in the patch description that this is only
> for pdu data digest and not header digest?
Will do
>
>> This patch reworks the receive-side crc calculation to always
>> run at the end, so as to keep a single flow for both offload
>> and non-offload. This change simplifies the code, but it may degrade
>> performance for non-offload crc calculation.
> ??
>
> From my scan it doesn't look like you do that... Am I missing something?
> Can you explain?
The performance of the CRC data digest calculation in the offload's fallback path may be worse than computing the CRC with skb_copy_and_hash_datagram_iter, which folds the digest into the copy in a single pass.
To be clear, the fallback path occurs when `queue->data_digest && test_bit(NVME_TCP_Q_OFF_CRC_RX, &queue->flags)` holds, but we receive skbs with `skb->ddp_crc = 0`.
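For context, here is a minimal sketch of how the fallback condition could be tracked per PDU; the helper names follow the quoted diff, but the bodies and the `ddgst_valid` field are illustrative assumptions, not the patch's actual implementation:
```
/*
 * Illustrative sketch: accumulate the device CRC verdict across all
 * skbs of a PDU. If any skb arrives with ddp_crc clear, the HW did
 * not verify the data digest, so it must be recalculated in SW.
 */
static inline void nvme_tcp_device_ddgst_update(struct nvme_tcp_queue *queue,
						struct sk_buff *skb)
{
	if (!skb->ddp_crc)
		queue->ddgst_valid = false;	/* hypothetical per-queue flag */
}

static inline bool nvme_tcp_device_ddgst_ok(struct nvme_tcp_queue *queue)
{
	return queue->ddgst_valid;
}
```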
>
>> rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
>> if (!rq) {
>> dev_err(queue->ctrl->ctrl.device,
>> @@ -992,7 +1031,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>> recv_len = min_t(size_t, recv_len,
>> iov_iter_count(&req->iter));
>>
>> - if (queue->data_digest)
>> + if (queue->data_digest && !test_bit(NVME_TCP_Q_OFFLOADS, &queue->flags))
>> ret = skb_copy_and_hash_datagram_iter(skb, *offset,
>> &req->iter, recv_len, queue->rcv_hash);
> This is the skb copy and hash, not clear why you say that you move this
> to the end...
See the offload fallback path below
>
>> else
>> @@ -1012,7 +1051,6 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>>
>> if (!queue->data_remaining) {
>> if (queue->data_digest) {
>> - nvme_tcp_ddgst_final(queue->rcv_hash, &queue->exp_ddgst);
> If I instead do:
> if (!test_bit(NVME_TCP_Q_OFFLOADS,
> &queue->flags))
> nvme_tcp_ddgst_final(queue->rcv_hash,
> &queue->exp_ddgst);
>
> Does that help the mess in nvme_tcp_recv_ddgst?
Not really, as the code path there takes care of the fallback path, i.e. offload was requested but did not succeed.
>
>> queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
>> } else {
>> if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
>> @@ -1033,8 +1071,11 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
>> char *ddgst = (char *)&queue->recv_ddgst;
>> size_t recv_len = min_t(size_t, *len, queue->ddgst_remaining);
>> off_t off = NVME_TCP_DIGEST_LENGTH - queue->ddgst_remaining;
>> + bool ddgst_offload_fail;
>> int ret;
>>
>> + if (test_bit(NVME_TCP_Q_OFFLOADS, &queue->flags))
>> + nvme_tcp_device_ddgst_update(queue, skb);
>> ret = skb_copy_bits(skb, *offset, &ddgst[off], recv_len);
>> if (unlikely(ret))
>> return ret;
>> @@ -1045,12 +1086,21 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
>> if (queue->ddgst_remaining)
>> return 0;
>>
>> - if (queue->recv_ddgst != queue->exp_ddgst) {
>> - dev_err(queue->ctrl->ctrl.device,
>> - "data digest error: recv %#x expected %#x\n",
>> - le32_to_cpu(queue->recv_ddgst),
>> - le32_to_cpu(queue->exp_ddgst));
>> - return -EIO;
>> + ddgst_offload_fail = !nvme_tcp_device_ddgst_ok(queue);
>> + if (!test_bit(NVME_TCP_Q_OFFLOADS, &queue->flags) ||
>> + ddgst_offload_fail) {
>> + if (test_bit(NVME_TCP_Q_OFFLOADS, &queue->flags) &&
>> + ddgst_offload_fail)
>> + nvme_tcp_crc_recalculate(queue, pdu);
>> +
>> + nvme_tcp_ddgst_final(queue->rcv_hash, &queue->exp_ddgst);
>> + if (queue->recv_ddgst != queue->exp_ddgst) {
>> + dev_err(queue->ctrl->ctrl.device,
>> + "data digest error: recv %#x expected %#x\n",
>> + le32_to_cpu(queue->recv_ddgst),
>> + le32_to_cpu(queue->exp_ddgst));
>> + return -EIO;
> This gets convoluted here...
Will try to simplify. The general idea is that there are three paths with common code:
1. non-offload
2. offload failed
3. offload success
(1) and (2) share the code for finalizing and checking the data digest, while (3) skips this entirely.
In other words, how about this:
```
offload_fail = !nvme_tcp_ddp_ddgst_ok(queue);
offload = test_bit(NVME_TCP_Q_OFF_CRC_RX, &queue->flags);
if (!offload || offload_fail) {
	if (offload && offload_fail) /* software fallback */
		nvme_tcp_ddp_ddgst_recalc(queue, pdu);

	nvme_tcp_ddgst_final(queue->rcv_hash, &queue->exp_ddgst);
	if (queue->recv_ddgst != queue->exp_ddgst) {
		dev_err(queue->ctrl->ctrl.device,
			"data digest error: recv %#x expected %#x\n",
			le32_to_cpu(queue->recv_ddgst),
			le32_to_cpu(queue->exp_ddgst));
		return -EIO;
	}
}
```
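With this structure, the offload-success path (3) never touches queue->rcv_hash at all, while paths (1) and (2) converge on the existing SW verification, so there is a single place that finalizes the digest and reports mismatches.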
>
>> + }
>> }
>>
>> if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
>>