From: Boris Pismenny <borispismenny@gmail.com>
To: David Ahern <dsahern@gmail.com>,
	Boris Pismenny <borisp@mellanox.com>,
	kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com,
	hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org,
	viro@zeniv.linux.org.uk, edumazet@google.com, smalin@marvell.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org,
	netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com,
	yorayz@nvidia.com, Ben Ben-Ishay <benishay@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Yoray Zack <yorayz@mellanox.com>
Subject: Re: [PATCH v2 net-next 07/21] nvme-tcp: Add DDP data-path
Date: Sun, 31 Jan 2021 10:44:12 +0200
Message-ID: <419231da-615c-10c5-7c98-7e049ac54ee7@gmail.com>
In-Reply-To: <84cc2af1-22e8-abf5-07da-bc7b4a2b6b12@gmail.com>



On 19/01/2021 6:18, David Ahern wrote:
> On 1/14/21 8:10 AM, Boris Pismenny wrote:
>> @@ -664,8 +753,15 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
>>  		return -EINVAL;
>>  	}
>>  
>> -	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
>> -		nvme_complete_rq(rq);
>> +	req = blk_mq_rq_to_pdu(rq);
>> +	if (req->offloaded) {
>> +		req->status = cqe->status;
>> +		req->result = cqe->result;
>> +		nvme_tcp_teardown_ddp(queue, cqe->command_id, rq);
>> +	} else {
>> +		if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
>> +			nvme_complete_rq(rq);
>> +	}
>>  	queue->nr_cqe++;
>>  
>>  	return 0;
>> @@ -859,9 +955,18 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>>  static inline void nvme_tcp_end_request(struct request *rq, u16 status)
>>  {
>>  	union nvme_result res = {};
>> +	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
>> +	struct nvme_tcp_queue *queue = req->queue;
>> +	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
>>  
>> -	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
>> -		nvme_complete_rq(rq);
>> +	if (req->offloaded) {
>> +		req->status = cpu_to_le16(status << 1);
>> +		req->result = res;
>> +		nvme_tcp_teardown_ddp(queue, pdu->command_id, rq);
>> +	} else {
>> +		if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
>> +			nvme_complete_rq(rq);
>> +	}
>>  }
>>  
>>  static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
> 
> 
> The req->offload checks assume the offload is to the expected
> offload_netdev, but you do not verify the data arrived as expected. You
> might get lucky if both netdev's belong to the same PCI device (assuming
> the h/w handles it a certain way), but it will not if the netdev's
> belong to different devices.
> 
> Consider a system with 2 network cards -- even if it is 2 mlx5 based
> devices. One setup can have the system using a bond with 1 port from
> each PCI device. The tx path picks a leg based on the hash of the ntuple
> and that (with Tariq's bond patches) becomes the expected offload
> device. A similar example holds for a pure routing setup with ECMP. For
> both there is full redundancy in the network - separate NIC cards
> connected to separate TORs to have independent network paths.
> 
> A packet arrives on the *other* netdevice - you have *no* control over
> the Rx path. Your current checks will think the packet arrived with DDP
> but it did not.
> 

There's no problem if another (non-offload) netdevice receives traffic
that arrives here: that other device will never set the SKB frag pages
to point to the final destination buffers, so copy offload will not
take place for those packets.

The req->offloaded indication is mainly for matching ddp_setup with
ddp_teardown calls. It does not control copy/crc offload; those are
controlled per-skb, using frags for copy and skb bits for crc.
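
To make the per-skb point concrete, here is a minimal user-space sketch
of the copy decision. It is not the code from this series: struct
rx_frag and place_frag() are hypothetical names invented for the
illustration, and a plain pointer comparison stands in for checking
whether the frag page already is the destination buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one skb frag: where the NIC left the payload. */
struct rx_frag {
	const void *data;
	size_t len;
};

/*
 * Place one received fragment into the request's destination buffer.
 *
 * Returns true if the bytes were already in place (the offload device
 * wrote them directly to dst), false if a CPU copy was needed.
 * The decision depends only on the fragment itself, not on any
 * per-request "offloaded" flag.
 */
static bool place_frag(void *dst, const struct rx_frag *frag)
{
	if (frag->data == dst)
		return true;	/* DDP took effect: skip the copy */

	memcpy(dst, frag->data, frag->len);
	return false;		/* regular path, e.g. packet from another NIC */
}
```

A packet received by a non-offload device takes the memcpy branch
because its frag never points at dst, regardless of how the request
was set up.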

