From: David Ahern <dsahern@gmail.com>
To: Boris Pismenny <borisp@mellanox.com>,
	kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com,
	hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org,
	viro@zeniv.linux.org.uk, edumazet@google.com, smalin@marvell.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org,
	netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com,
	yorayz@nvidia.com, Ben Ben-Ishay <benishay@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Yoray Zack <yorayz@mellanox.com>
Subject: Re: [PATCH v2 net-next 07/21] nvme-tcp: Add DDP data-path
Date: Mon, 18 Jan 2021 21:18:44 -0700
Message-ID: <84cc2af1-22e8-abf5-07da-bc7b4a2b6b12@gmail.com>
In-Reply-To: <20210114151033.13020-8-borisp@mellanox.com>

On 1/14/21 8:10 AM, Boris Pismenny wrote:
> @@ -664,8 +753,15 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
>  		return -EINVAL;
>  	}
>  
> -	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> -		nvme_complete_rq(rq);
> +	req = blk_mq_rq_to_pdu(rq);
> +	if (req->offloaded) {
> +		req->status = cqe->status;
> +		req->result = cqe->result;
> +		nvme_tcp_teardown_ddp(queue, cqe->command_id, rq);
> +	} else {
> +		if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> +			nvme_complete_rq(rq);
> +	}
>  	queue->nr_cqe++;
>  
>  	return 0;
> @@ -859,9 +955,18 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>  static inline void nvme_tcp_end_request(struct request *rq, u16 status)
>  {
>  	union nvme_result res = {};
> +	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +	struct nvme_tcp_queue *queue = req->queue;
> +	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
>  
> -	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
> -		nvme_complete_rq(rq);
> +	if (req->offloaded) {
> +		req->status = cpu_to_le16(status << 1);
> +		req->result = res;
> +		nvme_tcp_teardown_ddp(queue, pdu->command_id, rq);
> +	} else {
> +		if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
> +			nvme_complete_rq(rq);
> +	}
>  }
>  
>  static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,


The req->offloaded checks assume the data was offloaded by the expected
offload_netdev, but you do not verify that the data actually arrived that
way. You might get lucky if both netdevs belong to the same PCI device
(assuming the h/w handles it a certain way), but you will not if the
netdevs belong to different devices.

Consider a system with 2 network cards -- even if both are mlx5 based
devices. One setup can have the system using a bond with 1 port from
each PCI device. The tx path picks a leg based on the hash of the ntuple
and that (with Tariq's bond patches) becomes the expected offload
device. A similar example holds for a pure routing setup with ECMP. In
both cases there is full redundancy in the network - separate NICs
connected to separate TORs to have independent network paths.
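
To make the asymmetry concrete, here is a toy user-space model of leg
selection by flow hash. It is only a sketch: toy_hash(), pick_leg(),
reverse_flow() and struct flow_tuple are made-up names, and the hash is
an FNV-style mix, not what the bonding driver or any switch really uses.

```c
#include <stdint.h>

/* Hypothetical flow key; not a kernel structure. */
struct flow_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Toy FNV-1a style mix over the tuple. The seed stands in for whatever
 * per-device hash configuration the sender happens to use. */
static uint32_t toy_hash(const struct flow_tuple *t, uint32_t seed)
{
	uint32_t words[3] = {
		t->saddr,
		t->daddr,
		((uint32_t)t->sport << 16) | t->dport,
	};
	uint32_t h = 2166136261u ^ seed;

	for (int i = 0; i < 3; i++) {
		h ^= words[i];
		h *= 16777619u;
	}
	return h;
}

/* Leg index chosen by whoever transmits the packet. */
static unsigned int pick_leg(const struct flow_tuple *t, uint32_t seed,
			     unsigned int nlegs)
{
	return toy_hash(t, seed) % nlegs;
}

/* The same connection as seen by the sender of the return traffic. */
static struct flow_tuple reverse_flow(const struct flow_tuple *t)
{
	struct flow_tuple r = { t->daddr, t->saddr, t->dport, t->sport };

	return r;
}
```

The host computes pick_leg() on its own tuple with its own seed for tx.
The leg that return traffic lands on is decided by the network, which
hashes reverse_flow() of that tuple with a configuration the host never
sees, so nothing ties the two results together.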

Now a packet arrives on the *other* netdevice - you have *no* control over
the Rx path. Your current checks will conclude the packet arrived with DDP
when it did not.
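
As a rough illustration of the kind of check I mean, here is a toy
user-space model. It is not a proposed fix: struct toy_request,
struct toy_rx_info, the hw_placed flag and data_was_placed() are all
hypothetical names, and how the stack would really learn the arrival
device or whether the h/w placed the data is exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical per-request state; not struct nvme_tcp_request. */
struct toy_request {
	bool offloaded;		/* DDP was set up for this request */
	int offload_ifindex;	/* device the DDP context was installed on */
};

/* Hypothetical per-packet receive information. */
struct toy_rx_info {
	int rx_ifindex;		/* device the data actually arrived on */
	bool hw_placed;		/* assumed flag: h/w reports it placed the data */
};

/* Trust direct placement only if the request was offloaded AND the data
 * arrived on the offload device AND the h/w says it placed it. Checking
 * req->offloaded alone covers only the first condition. */
static bool data_was_placed(const struct toy_request *req,
			    const struct toy_rx_info *rx)
{
	return req->offloaded &&
	       rx->hw_placed &&
	       rx->rx_ifindex == req->offload_ifindex;
}
```

With a request offloaded on ifindex 2 and data arriving on ifindex 3,
data_was_placed() is false and the data still has to be copied, whereas
a check on the offloaded flag alone would skip the copy.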
