From: zhenwei pi <pizhenwei@bytedance.com>
To: Parav Pandit <parav@nvidia.com>, Stefan Hajnoczi <stefanha@redhat.com>
Cc: "mst@redhat.com" <mst@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"virtio-comment@lists.oasis-open.org"
	<virtio-comment@lists.oasis-open.org>,
	"houp@yusur.tech" <houp@yusur.tech>,
	"helei.sig11@bytedance.com" <helei.sig11@bytedance.com>,
	"xinhao.kong@duke.edu" <xinhao.kong@duke.edu>
Subject: [virtio-comment] Re: RE: Re: Re: Re: [PATCH v2 06/11] transport-fabrics: introduce command set
Date: Fri, 9 Jun 2023 09:39:19 +0800
Message-ID: <d3455f74-c0a8-5285-0b07-caa99e17ce44@bytedance.com>
In-Reply-To: <PH0PR12MB5481C447A5FCA25A6D5B28E0DC50A@PH0PR12MB5481.namprd12.prod.outlook.com>



On 6/9/23 01:01, Parav Pandit wrote:
> 
>> From: Stefan Hajnoczi <stefanha@redhat.com>
>> Sent: Thursday, June 8, 2023 12:41 PM
> 
>>>>> For stream protocol, it always work fine.
>>>>> For keyed protocol, for example RDMA, the target side needs to use
>>>>> ibv_post_recv to receive a large size(sizeof
>>>>> virtio_of_command_connect + sizeof virtio_of_connect). If the
>>>>> target uses ibv_post_recv to receive
>>>>> sizeof(CMD) + sizeof(DESC) * 1, the initiator fails in RDMA SEND.
>>>>
>>>> I read that "A RC connection is very similar to a TCP connection" in
>>>> the NVIDIA documentation
>>>> (https://docs.nvidia.com/networking/display/RDMAAwareProgrammingv17/
>>>> Transport+Modes) and expected SOCK_STREAM semantics for RDMA SEND.
>>>>
>>>> Are you saying ibv_post_send() fails when the receiver's work
>>>> request sg_list size is smaller (fewer bytes) than the sender's?
>>>>
>>>
>>> Yes, it will fail.
>>> The receiver get a CQE with status 'IBV_WC_LOC_LEN_ERR', see
>>> https://www.rdmamojo.com/2013/02/15/ibv_poll_cq/
>>
>> Parav: Can you confirm that this is expected?
>>
> Ibv_post_send() will not fail because it is a queuing interface.
> But the send operation itself will fail via send (requester) side completion moving the QP to error.
> Receive q also moves to error.
> 
>> This makes it hard to inline payloads as I was suggesting before :(.
> 
> What I was suggesting in other thread, is if we want to inline the payload, we should do following.
> RDMA write followed by RDMA send. So, a Block write commands actual data can be placed directly in say 4K memory of target.
> 
> This way, sender and receiver works with constant size buffers in send and receive queue.
> RDMA is message based and not byte stream based.
> 
> Inline RDMA write is often called eager buffer, similar to PCIe write combine buffer.
> 
> Both doesn't likely work at scale as the buffer sharing becomes difficult across multiple connections.
> It is memory vs perf trade off.
> But doable.
> 
> We should start with first establishing the data transfer model covering 512B to 1M context and take up the optimizations as extensions.
> 
> 

Hi Parav,

What do you think about the other RDMA inline proposal, in
'[PATCH v2 11/11] transport-fabrics: support inline data for keyed
transmission'?

1. Use the feature command to get the target's max recv buffer size, for
   example 16k.
2. Use the feature command to set the initiator's max recv buffer size, for
   example 16k.

If the payload, together with the command, is smaller than the max recv
buffer size, a single RDMA SEND is enough. For example, a virtio-blk write
of 8k: 16 + 8192 < 16384, so a single RDMA SEND is fine.
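To make the size check concrete, here is a minimal sketch in C. The names
VOF_CMD_SIZE and vof_fits_single_send are hypothetical and not taken from the
patch series. It assumes the 16 in the example is the command size, and that
a message exactly as large as the recv buffer still fits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant: the 16 bytes that precede the payload in the
 * example above (assumed here to be the command size). */
#define VOF_CMD_SIZE 16u

/*
 * Returns true when command + payload fit into one receive buffer of
 * max_recv_size bytes, i.e. when a single RDMA SEND is enough.
 * max_recv_size is the value negotiated through the feature command.
 * Assumption: a message exactly equal to the buffer size still fits.
 */
static bool vof_fits_single_send(size_t payload_len, size_t max_recv_size)
{
    if (max_recv_size < VOF_CMD_SIZE)
        return false;
    /* Subtract on the buffer side so the sum cannot overflow. */
    return payload_len <= max_recv_size - VOF_CMD_SIZE;
}
```

With a 16k recv buffer, vof_fits_single_send(8192, 16384) is true, matching
the virtio-blk 8k write above, while a 16k payload does not fit and would
need another transfer method.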

-- 
zhenwei pi


