linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tiwei Bie <tiwei.bie@intel.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
	wexu@redhat.com, virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	jfreimann@redhat.com
Subject: Re: [RFC v2] virtio: support packed ring
Date: Tue, 24 Apr 2018 09:49:21 +0800	[thread overview]
Message-ID: <20180424014921.2zwfo3pucjbmx2o4@debian> (raw)
In-Reply-To: <20180424044250-mutt-send-email-mst@kernel.org>

On Tue, Apr 24, 2018 at 04:43:22AM +0300, Michael S. Tsirkin wrote:
> On Tue, Apr 24, 2018 at 09:37:47AM +0800, Tiwei Bie wrote:
> > On Tue, Apr 24, 2018 at 04:29:51AM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Apr 24, 2018 at 09:16:38AM +0800, Tiwei Bie wrote:
> > > > On Tue, Apr 24, 2018 at 04:05:07AM +0300, Michael S. Tsirkin wrote:
> > > > > On Tue, Apr 24, 2018 at 08:54:52AM +0800, Jason Wang wrote:
> > > > > > 
> > > > > > 
> > > > > > On 2018年04月23日 17:29, Tiwei Bie wrote:
> > > > > > > On Mon, Apr 23, 2018 at 01:42:14PM +0800, Jason Wang wrote:
> > > > > > > > On 2018年04月01日 22:12, Tiwei Bie wrote:
> > > > > > > > > Hello everyone,
> > > > > > > > > 
> > > > > > > > > This RFC implements packed ring support for virtio driver.
> > > > > > > > > 
> > > > > > > > > The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
> > > > > > > > > by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
> > > > > > > > > Minor changes are needed for the vhost code, e.g. to kick the guest.
> > > > > > > > > 
> > > > > > > > > TODO:
> > > > > > > > > - Refinements and bug fixes;
> > > > > > > > > - Split into small patches;
> > > > > > > > > - Test indirect descriptor support;
> > > > > > > > > - Test/fix event suppression support;
> > > > > > > > > - Test devices other than net;
> > > > > > > > > 
> > > > > > > > > RFC v1 -> RFC v2:
> > > > > > > > > - Add indirect descriptor support - compile test only;
> > > > > > > > > - Add event suppression supprt - compile test only;
> > > > > > > > > - Move vring_packed_init() out of uapi (Jason, MST);
> > > > > > > > > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > > > > > > > > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > > > > > > > > - Avoid using '%' operator (Jason);
> > > > > > > > > - Rename free_head -> next_avail_idx (Jason);
> > > > > > > > > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > > > > > > > > - Some other refinements and bug fixes;
> > > > > > > > > 
> > > > > > > > > Thanks!
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > > ---
> > > > > > > > >    drivers/virtio/virtio_ring.c       | 1094 +++++++++++++++++++++++++++++-------
> > > > > > > > >    include/linux/virtio_ring.h        |    8 +-
> > > > > > > > >    include/uapi/linux/virtio_config.h |   12 +-
> > > > > > > > >    include/uapi/linux/virtio_ring.h   |   61 ++
> > > > > > > > >    4 files changed, 980 insertions(+), 195 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > index 71458f493cf8..0515dca34d77 100644
> > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > @@ -58,14 +58,15 @@
> > > > > > > > [...]
> > > > > > > > 
> > > > > > > > > +
> > > > > > > > > +	if (vq->indirect) {
> > > > > > > > > +		u32 len;
> > > > > > > > > +
> > > > > > > > > +		desc = vq->desc_state[head].indir_desc;
> > > > > > > > > +		/* Free the indirect table, if any, now that it's unmapped. */
> > > > > > > > > +		if (!desc)
> > > > > > > > > +			goto out;
> > > > > > > > > +
> > > > > > > > > +		len = virtio32_to_cpu(vq->vq.vdev,
> > > > > > > > > +				      vq->vring_packed.desc[head].len);
> > > > > > > > > +
> > > > > > > > > +		BUG_ON(!(vq->vring_packed.desc[head].flags &
> > > > > > > > > +			 cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
> > > > > > > > It looks to me spec does not force to keep VRING_DESC_F_INDIRECT here. So we
> > > > > > > > can safely remove this BUG_ON() here.
> > > > > > > > 
> > > > > > > > > +		BUG_ON(len == 0 || len % sizeof(struct vring_packed_desc));
> > > > > > > > Len could be ignored for used descriptor according to the spec, so we need
> > > > > > > > remove this BUG_ON() too.
> > > > > > > Yeah, you're right! The BUG_ON() isn't right. I'll remove it.
> > > > > > > And I think something related to this in the spec isn't very
> > > > > > > clear currently.
> > > > > > > 
> > > > > > > In the spec, there are below words:
> > > > > > > 
> > > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L272
> > > > > > > """
> > > > > > > In descriptors with VIRTQ_DESC_F_INDIRECT set VIRTQ_DESC_F_WRITE
> > > > > > > is reserved and is ignored by the device.
> > > > > > > """
> > > > > > > 
> > > > > > > So when device writes back an used descriptor in this case,
> > > > > > > device may not set the VIRTQ_DESC_F_WRITE flag as the flag
> > > > > > > is reserved and should be ignored.
> > > > > > > 
> > > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L170
> > > > > > > """
> > > > > > > Element Length is reserved for used descriptors without the
> > > > > > > VIRTQ_DESC_F_WRITE flag, and is ignored by drivers.
> > > > > > > """
> > > > > > > 
> > > > > > > And this is the way how driver ignores the `len` in an used
> > > > > > > descriptor.
> > > > > > > 
> > > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L241
> > > > > > > """
> > > > > > > To increase ring capacity the driver can store a (read-only
> > > > > > > by the device) table of indirect descriptors anywhere in memory,
> > > > > > > and insert a descriptor in the main virtqueue (with \field{Flags}
> > > > > > > bit VIRTQ_DESC_F_INDIRECT on) that refers to a buffer element
> > > > > > > containing this indirect descriptor table;
> > > > > > > """
> > > > > > > 
> > > > > > > So the indirect descriptors in the table are read-only by
> > > > > > > the device. And the only descriptor which is writeable by
> > > > > > > the device is the descriptor in the main virtqueue (with
> > > > > > > Flags bit VIRTQ_DESC_F_INDIRECT on). So if we ignore the
> > > > > > > `len` in this descriptor, we won't be able to get the
> > > > > > > length of the data written by the device.
> > > > > > > 
> > > > > > > So I think the `len` in this descriptor will carry the
> > > > > > > length of the data written by the device (if the buffers
> > > > > > > are writable to the device) even if the VIRTQ_DESC_F_WRITE
> > > > > > > isn't set by the device. How do you think?
> > > > > > 
> > > > > > Yes I think so. But we'd better need clarification from Michael.
> > > > > 
> > > > > I think if you use a descriptor, and you want to supply len
> > > > > to guest, you set VIRTQ_DESC_F_WRITE in the used descriptor.
> > > > > Spec also says you must not set VIRTQ_DESC_F_INDIRECT then.
> > > > > If that's a problem we could look at relaxing that last requirement -
> > > > > does driver want INDIRECT in used descriptor to match
> > > > > the value in the avail descriptor for some reason?
> > > > 
> > > > For indirect, driver needs some way to get the length
> > > > of the data written by the driver. And the descriptors
> > > > in the indirect table is read-only, so the only place
> > > > device could put this value is the descriptor with the
> > > > VIRTQ_DESC_F_INDIRECT flag set.
> > > 
> > > when writing out used descriptor, device should set VIRTQ_DESC_F_WRITE
> > > (and clear VIRTQ_DESC_F_INDIRECT).
> > 
> > So the spec allows device to set VIRTQ_DESC_F_WRITE bit
> > when writing out an used descriptor even if the corresponding
> > descriptors it just parsed don't have the VIRTQ_DESC_F_WRITE
> > bit set?
> > 
> > Best regards,
> > Tiwei Bie
> 
> I think so. In a used descriptor, VIRTQ_DESC_F_WRITE just means length
> is valid.

Got it. It's very neat. Thanks! :)

Best regards,
Tiwei Bie

> 
> -- 
> MST

      reply	other threads:[~2018-04-24  1:49 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-01 14:12 [RFC v2] virtio: support packed ring Tiwei Bie
2018-04-10  2:55 ` Jason Wang
2018-04-10  3:21   ` Tiwei Bie
2018-04-13  4:30 ` Jason Wang
2018-04-13  7:15   ` Tiwei Bie
2018-04-17  2:11     ` Jason Wang
2018-04-17  2:17       ` Michael S. Tsirkin
2018-04-17  2:24         ` Jason Wang
2018-04-17  2:37           ` Michael S. Tsirkin
2018-04-17  2:51       ` Tiwei Bie
2018-04-17 12:17         ` Michael S. Tsirkin
2018-04-17 12:47           ` Tiwei Bie
2018-04-17 14:04             ` Michael S. Tsirkin
2018-04-17 14:56               ` Tiwei Bie
2018-04-17 15:54                 ` Michael S. Tsirkin
2018-04-18  1:17                   ` Tiwei Bie
2018-04-13 15:22 ` Michael S. Tsirkin
2018-04-14 11:22   ` Tiwei Bie
2018-04-23  5:42 ` Jason Wang
2018-04-23  9:29   ` Tiwei Bie
2018-04-24  0:54     ` Jason Wang
2018-04-24  1:05       ` Michael S. Tsirkin
2018-04-24  1:14         ` Jason Wang
2018-04-24  1:16         ` Tiwei Bie
2018-04-24  1:29           ` Michael S. Tsirkin
2018-04-24  1:37             ` Tiwei Bie
2018-04-24  1:43               ` Michael S. Tsirkin
2018-04-24  1:49                 ` Tiwei Bie [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180424014921.2zwfo3pucjbmx2o4@debian \
    --to=tiwei.bie@intel.com \
    --cc=jasowang@redhat.com \
    --cc=jfreimann@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=wexu@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).