All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [virtio-dev] packed ring layout proposal v2
Date: Wed, 1 Mar 2017 06:14:21 +0200	[thread overview]
Message-ID: <20170301060913-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20170301035715.GP18844@yliu-dev.sh.intel.com>

On Wed, Mar 01, 2017 at 11:57:15AM +0800, Yuanhan Liu wrote:
> On Wed, Mar 01, 2017 at 03:02:29AM +0200, Michael S. Tsirkin wrote:
> > > > * Descriptor ring:
> > > > 
> > > > Guest adds descriptors with unique index values and DESC_HW set in flags.
> > > > Host overwrites used descriptors with correct len, index, and DESC_HW
> > > > clear.  Flags are always set/cleared last.
> > > 
> > > May I know what's the index intended for? Back referencing a pkt buffer?
> > 
> > Yes and generally identify which descriptor completed. Recall that
> > even though vhost net completes in order at the moment,
> > virtio rings serve devices (e.g. storage) that complete out of order.
> 
> I see, and thanks.
> 
> > > That said, if it's "16: 16" and if we use only 8 bits for batch, we
> > > could still have another 8 bit for anything else, say the number of
> > > desc for a single pkt. With that, the num_buffers of mergeable Rx
> > > header could be replaced.  More importantly, we could reduce a cache
> > > line write if non offload are supported in mergeable Rx path. 
> > 
> > Why do you bother with mergeable Rx without offloads?
> 
> Oh, my bad. I actually meant "without offloads __being used__". Just
> assume the pkt size is 64B and no offloads are used. When mergeable
> Rx is negotiated (which is the default case), num_buffers has to be
> set 1. That means an extra cache line write. For the case of non
> mergeable, the cache line write could be avoid by a trick like what
> the following patch did:
> 
>    http://dpdk.org/browse/dpdk/commit/?id=c9ea670c1dc7e3f111d8139f915082b60c9c1ffe
> 
> It basically tries to avoid writing 0 if the value is already 0:
> the case when no offloads are used.
> So while writing this email, I was thinking maybe we could not set
> num_buffers to 1 when there is only one desc (let it be 0 and let
> num_buffers == 0 imply num_buffers = 1). I'm not quite sure we can
> do that now, thus I checked the DPDK code and found it's Okay.
> 
>     896                 seg_num = header->num_buffers;
>     897
>     898                 if (seg_num == 0)
>     899                         seg_num = 1;
> 
> 
> I then also checked linux kernel code, and found it's not okay as
> the code depends on the value being set correctly:
> 
> ==> 365         u16 num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
>     366         struct page *page = virt_to_head_page(buf);
>     367         int offset = buf - page_address(page);
>     368         unsigned int truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));
>     369
>     370         struct sk_buff *head_skb = page_to_skb(vi, rq, page, offset, len,
>     371                                                truesize);
>     372         struct sk_buff *curr_skb = head_skb;
>     373
>     374         if (unlikely(!curr_skb))
>     375                 goto err_skb;
> ==> 376         while (--num_buf) {
> 
> That means, if we want to do that, it needs an extra feature flag
> (either a global feature flag or a desc flag), something like
> ALLOW_ZERO_NUM_BUFFERS. Or even, make it allowable in virtio 1.1
> (virtio 0.95/1.0 won't benifit from it though).
> 
> Does it make sense to you?

Right and then we could use a descriptor flag "header is all 0s".
For virtio 1.0 we could put these in the used ring instead.

> 
> > Make each buffer
> > MTU sized and it'll fit without merging.  Linux used not to, it only
> > started doing this to save memory aggressively. I don't think
> > DPDK cares about this.
> > 
> > > 
> > > 	struct desc {
> > > 	        __le64 addr;
> > > 	        __le16 len;
> > > 	        __le8  batch;
> > > 	        __le8  num_buffers;
> > > 	        __le16 index;
> > > 	        __le16 flags;
> > > 	};
> > 
> > Interesting. How about a benchmark for these ideas?
> 
> Sure, I would like to benchmark it. It won't take long to me. But
> currently, I was still focusing on analysising the performance behaviour
> of virtio 0.95/1.0 (when I get some time), to see what's not good for
> performance and what's can be improved.
> 
> Besides that, as said, I often got interrupted. Moreoever, DPDK v17.05
> code freeze deadline is coming. So it's just a remind that I may don't
> have time for it recently. Sorry for that.
> 
> > > > * Device specific descriptor flags
> > > > We have a lot of unused space in the descriptor.  This can be put to
> > > > good use by reserving some flag bits for device use.
> > > > For example, network device can set a bit to request
> > > > that header in the descriptor is suppressed
> > > > (in case it's all 0s anyway). This reduces cache utilization.
> > > 
> > > Good proposal. But I think it could only work with Tx, where the driver
> > > knows whether the headers are all 0s while filling the desc. But for
> > > Rx, whereas the driver has to pre-fill the desc, it doesn't know. Thus
> > > it still requires filling an header desc for storing it.
> > 
> > I don't see why - I don't think drivers pre-fill buffers in header for RX
> > right now. Why would they start?
> 
> Again, my bad, I meant "prepare" but not "pre-fill". Let me try to explain
> it again. I'm thinking:
> 
> - For Tx, when the header is all 0s, the header could be discarded. We
>   could use one desc for transfering a packet (with a flag NO_HEADER
>   or HEADER_ALL_ZERO bit set)
> 
> - For Rx, the header is filled in the device (or vhost) side. And the
>   driver has to prepare the header desc for each pkt, because the Rx
>   driver has no idea whether it will be all 0s.
> 
>   That means, the header could not be discarded.
> 
> If such a global feature is negotiated, we could also discard the header
> desc as well.
> 
> 	--yliu

Right and again, flags could be added to the used ring to pass extra
info.

> > > Maybe we could introduce a global feature? When that's negotiated, no
> > > header desc need filled and processed? I'm thinking this could also
> > > help the vector implementation I mentioned in another email.
> > 
> > It's possible of course - it's a subset of what I said.
> > Though it makes it less useful in the general case.
> > 
> > > > Note: this feature can be supported in virtio 1.0 as well,
> > > > as we have unused bits in both descriptor and used ring there.
> > > 
> > > Agreed.
> > > 
> > > 	--yliu
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org

  parent reply	other threads:[~2017-03-01  4:14 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-10  5:06 [virtio-dev] packed ring layout proposal v3 Michael S. Tsirkin
2017-02-08 13:37 ` packed ring layout proposal v2 Christian Borntraeger
2017-02-09 17:43   ` Michael S. Tsirkin
     [not found]   ` <20170209181955-mutt-send-email-mst@kernel.org>
2017-02-09 18:27     ` Christian Borntraeger
2017-02-08 17:41 ` [virtio-dev] " Paolo Bonzini
2017-02-08 19:59   ` Michael S. Tsirkin
     [not found]   ` <20170208214435-mutt-send-email-mst@kernel.org>
2017-02-09 15:48     ` Paolo Bonzini
2017-02-09 16:11       ` Cornelia Huck
2017-02-09 18:24       ` Michael S. Tsirkin
     [not found]       ` <20170209202203-mutt-send-email-mst@kernel.org>
2017-02-10 11:32         ` Paolo Bonzini
     [not found]         ` <c229269b-1702-ffec-62e8-002c7c142904@redhat.com>
2017-02-10 15:20           ` Michael S. Tsirkin
2017-02-10 16:17             ` Paolo Bonzini
     [not found]       ` <20170209171105.075a9d9c.cornelia.huck@de.ibm.com>
2017-02-22 16:43         ` Michael S. Tsirkin
     [not found]         ` <20170222181333-mutt-send-email-mst@kernel.org>
2017-03-07 15:53           ` Cornelia Huck
2017-03-07 20:33             ` Michael S. Tsirkin
     [not found]             ` <20170307223057-mutt-send-email-mst@kernel.org>
2017-07-10 16:27               ` Amnon Ilan
2017-07-10 16:27               ` Amnon Ilan
2017-02-22  4:27 ` packed ring layout proposal - todo list Michael S. Tsirkin
     [not found] ` <20170222054336-mutt-send-email-mst@kernel.org>
2017-02-22  9:19   ` [virtio-dev] " Gray, Mark D
     [not found]   ` <738D45BC1F695740A983F43CFE1B7EA94E93CA7E@IRSMSX108.ger.corp.intel.com>
2017-02-22 15:13     ` Michael S. Tsirkin
2017-02-28  4:29   ` Yuanhan Liu
     [not found]   ` <20170228042943.GH18844@yliu-dev.sh.intel.com>
2017-03-01  1:07     ` Michael S. Tsirkin
2017-03-08  7:09       ` Yuanhan Liu
     [not found]       ` <20170308070948.GC18844@yliu-dev.sh.intel.com>
2017-03-08  7:56         ` Yuanhan Liu
     [not found]         ` <20170308075624.GF18844@yliu-dev.sh.intel.com>
2017-03-29 12:39           ` Michael S. Tsirkin
2017-04-01  7:30             ` Yuanhan Liu
2017-02-22 14:46 ` [virtio-dev] packed ring layout proposal v2 Chien, Roger S
2017-02-28  5:02 ` Yuanhan Liu
2017-02-28  5:47 ` [RFC] packed (virtio-net) headers Yuanhan Liu
     [not found] ` <20170228050218.GI18844@yliu-dev.sh.intel.com>
2017-03-01  1:02   ` [virtio-dev] packed ring layout proposal v2 Michael S. Tsirkin
     [not found]   ` <20170301024951-mutt-send-email-mst@kernel.org>
2017-03-01  3:57     ` Yuanhan Liu
     [not found]     ` <20170301035715.GP18844@yliu-dev.sh.intel.com>
2017-03-01  4:14       ` Michael S. Tsirkin [this message]
2017-03-01  4:57         ` Yuanhan Liu
     [not found] ` <20170228054719.GJ18844@yliu-dev.sh.intel.com>
2017-03-01  1:28   ` [RFC] packed (virtio-net) headers Michael S. Tsirkin
2017-07-16  6:00 ` [virtio-dev] packed ring layout proposal v2 Lior Narkis
2017-07-18 16:23   ` Michael S. Tsirkin
2017-07-18 16:23     ` Michael S. Tsirkin
2017-07-19  7:41     ` Lior Narkis
2017-07-20 13:06       ` Michael S. Tsirkin
2017-07-20 13:06         ` Michael S. Tsirkin
2017-07-19  7:41     ` Lior Narkis
2017-07-16  6:00 ` Lior Narkis
2017-09-11  7:47 ` [virtio-dev] Re: packed ring layout proposal v3 Jason Wang
2017-09-12 16:23   ` Willem de Bruijn
2017-09-13  1:26     ` Jason Wang
2017-09-13  1:26     ` Jason Wang
2017-09-12 16:23   ` Willem de Bruijn
2017-09-11  7:47 ` Jason Wang
2017-09-12 16:20 ` [virtio-dev] " Willem de Bruijn
2017-09-12 16:20 ` Willem de Bruijn
2017-09-14  8:23 ` Ilya Lesokhin
2017-09-20  9:11 ` [virtio-dev] " Liang, Cunming
2017-09-20  9:11   ` Liang, Cunming
2017-09-25 22:24   ` Michael S. Tsirkin
2017-09-25 22:24   ` Michael S. Tsirkin
2017-09-26 23:38     ` Steven Luong (sluong)
2017-09-27 23:49       ` Michael S. Tsirkin
2017-09-27 23:49         ` Michael S. Tsirkin
2017-09-28  9:44         ` Liang, Cunming
2017-10-01  4:08           ` Michael S. Tsirkin
2017-10-01  4:08             ` Michael S. Tsirkin
2017-10-04 12:39             ` Jens Freimann
2017-10-04 12:58               ` Michael S. Tsirkin
2017-10-04 12:58               ` Michael S. Tsirkin
2017-10-10  9:56                 ` Liang, Cunming
2017-10-10  9:56                 ` Liang, Cunming
2017-10-11 12:22                   ` Jens Freimann
2017-09-28  9:44         ` Liang, Cunming
2017-09-28 21:13         ` Michael S. Tsirkin
2017-09-28 21:13         ` Michael S. Tsirkin
2017-09-26 23:38     ` Steven Luong (sluong)
2017-09-21 13:36 ` Liang, Cunming
2017-09-21 13:36   ` Liang, Cunming
2017-09-28 21:27   ` Michael S. Tsirkin
2017-09-28 21:27   ` Michael S. Tsirkin
2017-10-08  6:16 ` Ilya Lesokhin
2017-10-08  6:16 ` [virtio-dev] " Ilya Lesokhin
2017-10-25 16:20   ` Michael S. Tsirkin
2017-10-25 16:20     ` [virtio-dev] " Michael S. Tsirkin
2017-10-29  9:05     ` Ilya Lesokhin
2017-10-29  9:05     ` [virtio-dev] " Ilya Lesokhin
2017-10-29 14:21       ` Michael S. Tsirkin
2017-10-29 14:21         ` [virtio-dev] " Michael S. Tsirkin
2017-10-29 14:34         ` Ilya Lesokhin
2017-10-29 14:34         ` [virtio-dev] " Ilya Lesokhin
2017-10-30  2:08           ` Michael S. Tsirkin
2017-10-30  2:08             ` [virtio-dev] " Michael S. Tsirkin
2017-10-30  6:30             ` [virtio-dev] " Ilya Lesokhin
2017-10-30 16:30               ` Michael S. Tsirkin
2017-10-30 16:30                 ` [virtio-dev] " Michael S. Tsirkin
2017-10-30  6:30             ` Ilya Lesokhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170301060913-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.