From: Heng Qi <hengqi@linux.alibaba.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtio-dev@lists.oasis-open.org, jasowang@redhat.com,
	xuanzhuo@linux.alibaba.com, kangjie.xu@linux.alibaba.com
Subject: Re: [virtio-dev] Re: [PATCH v7] virtio_net: support split header
Date: Fri, 9 Sep 2022 18:22:59 +0800
Message-ID: <20220909102259.GA64822@h68b04307.sqa.eu95>
In-Reply-To: <20220904162337-mutt-send-email-mst@kernel.org>

On Sun, Sep 04, 2022 at 04:27:38PM -0400, Michael S. Tsirkin wrote:
> On Fri, Sep 02, 2022 at 03:36:25PM +0800, Heng Qi wrote:
> > We need to clarify that the purpose of header splitting is to let every payload
> > be placed independently in a page, which is beneficial for the zerocopy
> > implemented by the upper layer.
> 
> absolutely, pls add motivation.
> 
> > If the driver does not enforce that the buffers submitted to the receiveq MUST
> > be composed of at least two descriptors, then header splitting will become meaningless,
> > or the VIRTIO_NET_F_SPLIT_TRANSPORT_HEADER feature should not be negotiated at this time.
> > 
> > 
> > Thanks.
> > 
> > 
> 
> 
> This seems very narrow and unnecessarily wasteful of descriptors.
> What is wrong in this:
> 
> <header>...<padding>... <beginning of page><data>
> 
> seems to achieve the goal of data in a separate page without
> using extra descriptors.
> 
> thus my proposal to replace the requirement of a separate
> descriptor with an offset of data from beginning of
> buffer that driver sets.
>


We have carefully considered your suggestion. 

Let's summarize the schemes we've considered before and now.

1. Scheme A (see spec v7)

We refer to spec v7 and earlier as scheme A for short. Scheme A is laid out as follows:
|                         receive buffer                            |
|              0th descriptor                      | 1st descriptor |
| virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->|      payload   |

When allocating the receive buffer, we use a buffer plus a separate page.
This ensures that every payload is placed independently in a page,
which is very beneficial for the zerocopy implemented by the upper layer.

Scheme A handles headroom, tailroom and memory waste well, but as you
said, it relies on descriptor chains.
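
To make the scheme A layout concrete, here is a minimal userspace sketch of
a receive buffer built from two chained descriptors. All names in it
(struct rx_desc, struct rx_buf_a, rx_buf_a_alloc, HDR_BUF_BYTES) are
hypothetical and the sizes are assumptions; it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size */
#define HDR_BUF_BYTES    256u  /* assumed size of the header area */

/* Hypothetical stand-in for one descriptor: address plus length. */
struct rx_desc {
    void  *addr;
    size_t len;
};

/* Hypothetical scheme A receive buffer: exactly two chained descriptors. */
struct rx_buf_a {
    struct rx_desc desc[2];
};

/* Allocate the header area and a page-aligned payload page.
 * Returns 0 on success, -1 on allocation failure. */
static int rx_buf_a_alloc(struct rx_buf_a *buf)
{
    void *hdr;
    void *page;

    assert(buf != NULL);
    hdr = malloc(HDR_BUF_BYTES);
    page = aligned_alloc(PAGE_SIZE_BYTES, PAGE_SIZE_BYTES);
    if (hdr == NULL || page == NULL) {
        free(hdr);
        free(page);
        return -1;
    }
    buf->desc[0].addr = hdr;   /* 0th descriptor: virtnet hdr + mac/ip/tcp */
    buf->desc[0].len  = HDR_BUF_BYTES;
    buf->desc[1].addr = page;  /* 1st descriptor: payload only */
    buf->desc[1].len  = PAGE_SIZE_BYTES;
    return 0;
}

static void rx_buf_a_free(struct rx_buf_a *buf)
{
    free(buf->desc[0].addr);
    free(buf->desc[1].addr);
}
```

Because the payload descriptor points at the start of its own page, the
payload is page-aligned regardless of how long the headers are.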

2. Scheme B (see your suggestion)

The approach we reconsidered no longer depends on descriptor chains.

We refer to your proposed offset-based scheme as scheme B.
As you suggested, scheme B gives the device a single buffer and uses an
offset to indicate where to place the payload, like this:

<header>...<padding>... <beginning of page><data> 

But how should this buffer be allocated?
Since we want the payload to be placed on a separate page, the method
we consider is to have the driver allocate two pages of contiguous memory directly.

The beginning of this contiguous memory is used as the headroom,
and the contiguous memory after the headroom is handed over to the device,
as in the following:

[------------------ receive buffer(2 pages) ------------------------------] 
[<------------first page -------------------><------ second page -------->] 
[<-----><virtnet hdr> <mac,ip,tcp>..<padding><       payload             >] 
   ^    ^
   |    |
   |    pointer to device
   |
   |
   Reserved by the driver (headroom); the part after it is filled by the device
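
The address arithmetic behind the scheme B layout can be sketched as
follows. This is a userspace illustration under assumed sizes; struct
rx_buf_b, rx_buf_b_alloc and the field names are hypothetical, and
aligned_alloc() stands in for whatever page allocator the driver uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size */

/* Hypothetical scheme B receive buffer: two contiguous pages. */
struct rx_buf_b {
    void  *base;         /* start of the two-page allocation */
    void  *dev_addr;     /* pointer handed to the device */
    size_t dev_len;      /* bytes the device may write */
    size_t payload_off;  /* offset from dev_addr where the payload starts */
};

/* Reserve `headroom` bytes for the driver and hand the rest to the device.
 * Returns 0 on success, -1 on allocation failure. */
static int rx_buf_b_alloc(struct rx_buf_b *buf, size_t headroom)
{
    unsigned char *base;

    assert(buf != NULL);
    assert(headroom < PAGE_SIZE_BYTES);
    base = aligned_alloc(PAGE_SIZE_BYTES, 2u * PAGE_SIZE_BYTES);
    if (base == NULL)
        return -1;
    buf->base = base;
    buf->dev_addr = base + headroom;
    buf->dev_len = 2u * PAGE_SIZE_BYTES - headroom;
    /* The payload must land at the start of the second page. */
    buf->payload_off = PAGE_SIZE_BYTES - headroom;
    return 0;
}
```

With a headroom of 256 bytes, for example, the device sees a buffer of
7936 bytes and is asked to place the payload 3840 bytes into it, which is
exactly the beginning of the second page.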

3. Scheme C (we sent this scheme to you on September 7th; you may have missed it)

Based on your previous suggestion, we also considered a new scheme C.
This scheme builds on mergeable buffers, with each buffer filling a separate page.

If the split header feature is negotiated and the device can successfully split the packet,
the device needs to find at least two buffers, namely two pages: one for the virtio-net header
and transport header, and the other for the payload, as follows:

|                       receive buffer1(page)      |     receive buffer2 (page)   | 
| virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->|         payload              | 

In addition, if XDP is considered, the device needs to leave headroom at the
beginning of receive buffer1 when receiving packets, so that the driver can run
programs such as XDP.

To solve this problem, scheme C introduces an offset, which requires the device
to start writing data at that offset within receive buffer1, as follows:

|                   receive buffer (page)                                 | receive buffer (page) | 
| <-- offset(hold) --> | virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->|         payload       | 
^
|
pointer to device
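
As an illustration of the placement rule in scheme C, the following sketch
simulates what the device would do with two page-sized mergeable buffers.
The function split_place and its parameters are hypothetical, `hdr` stands
for the virtio-net header plus the protocol headers, and the page size is
an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed size of each mergeable buffer */

/* Simulated device-side placement for scheme C.
 * Headers go into buf1 starting at `offset` (the driver's headroom);
 * the payload goes to the very start of buf2.
 * Returns 0 if the packet was split, -1 if it does not fit. */
static int split_place(unsigned char *buf1, unsigned char *buf2, size_t offset,
                       const unsigned char *hdr, size_t hdr_len,
                       const unsigned char *payload, size_t payload_len)
{
    assert(buf1 != NULL && buf2 != NULL);
    if (offset > PAGE_SIZE_BYTES || hdr_len > PAGE_SIZE_BYTES - offset)
        return -1;  /* headers do not fit after the headroom */
    if (payload_len > PAGE_SIZE_BYTES)
        return -1;  /* payload does not fit in one page */
    memcpy(buf1 + offset, hdr, hdr_len);
    memcpy(buf2, payload, payload_len);
    return 0;
}
```

The bytes before `offset` in buf1 are never touched by the device, so the
driver can use them as headroom.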
					   


4. Summary

Below is a brief comparison of the advantages and disadvantages of scheme A (spec v7),
scheme B (offset buffer, 2 pages) and scheme C (based on mergeable buffers):

1). Descriptor chain:
         i)  A depends on descriptor chains;
         ii) B and C do not depend on descriptor chains.

2). Page allocation:
         i) B fills with two consecutive pages, which wastes a great deal of memory
                for small packets such as ARP;
         ii) C fills with a single page, slightly better than B.

3). Memory waste:
         i) The memory waste of scheme A is mainly the 0th descriptor
                that is skipped by the device;
         ii) When scheme B and scheme C successfully split the header,
                most of the first page is wasted, but the driver can
                release the first page quickly by copying its contents.

4). Headroom:
         i) In scheme A and scheme B, the headroom is reserved;
         ii) Scheme C requires the driver to set an offset so that the device
                skips that many bytes when using receive buffer1.
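
To give a rough sense of the first-page waste mentioned in 3) ii), here is
a small arithmetic sketch. The function name and the numbers are
assumptions for illustration only: a 4096-byte page, 256 bytes of
headroom or offset, and 66 bytes of headers (a 12-byte virtio-net header,
14-byte Ethernet, 20-byte IPv4 and 20-byte TCP header, all without
options).

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size */

/* Bytes of the first page left unused after a successful split in
 * scheme B or scheme C. `reserved` is the headroom (B) or offset (C);
 * `hdr_len` is the virtio-net header plus the protocol headers. */
static size_t first_page_unused(size_t reserved, size_t hdr_len)
{
    assert(reserved <= PAGE_SIZE_BYTES);
    assert(hdr_len <= PAGE_SIZE_BYTES - reserved);
    return PAGE_SIZE_BYTES - reserved - hdr_len;
}
```

Under these assumptions, first_page_unused(256, 66) gives 3774 bytes, so
about 92% of the first page carries no data until the driver copies the
headers out and releases it.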

Which scheme do you prefer?

Thanks. 



> 
> -- 
> MST
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org