linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: mst@redhat.com, wexu@redhat.com,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	jfreimann@redhat.com
Subject: Re: [RFC v2] virtio: support packed ring
Date: Tue, 24 Apr 2018 08:54:52 +0800	[thread overview]
Message-ID: <e08c55df-6ea2-cf7f-d7c9-91e55913ab16@redhat.com> (raw)
In-Reply-To: <20180423092908.77rii3gi7dcaf7o6@debian>



On 2018年04月23日 17:29, Tiwei Bie wrote:
> On Mon, Apr 23, 2018 at 01:42:14PM +0800, Jason Wang wrote:
>> On 2018年04月01日 22:12, Tiwei Bie wrote:
>>> Hello everyone,
>>>
>>> This RFC implements packed ring support for virtio driver.
>>>
>>> The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
>>> by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
>>> Minor changes are needed for the vhost code, e.g. to kick the guest.
>>>
>>> TODO:
>>> - Refinements and bug fixes;
>>> - Split into small patches;
>>> - Test indirect descriptor support;
>>> - Test/fix event suppression support;
>>> - Test devices other than net;
>>>
>>> RFC v1 -> RFC v2:
>>> - Add indirect descriptor support - compile test only;
>>> - Add event suppression supprt - compile test only;
>>> - Move vring_packed_init() out of uapi (Jason, MST);
>>> - Merge two loops into one in virtqueue_add_packed() (Jason);
>>> - Split vring_unmap_one() for packed ring and split ring (Jason);
>>> - Avoid using '%' operator (Jason);
>>> - Rename free_head -> next_avail_idx (Jason);
>>> - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
>>> - Some other refinements and bug fixes;
>>>
>>> Thanks!
>>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>> ---
>>>    drivers/virtio/virtio_ring.c       | 1094 +++++++++++++++++++++++++++++-------
>>>    include/linux/virtio_ring.h        |    8 +-
>>>    include/uapi/linux/virtio_config.h |   12 +-
>>>    include/uapi/linux/virtio_ring.h   |   61 ++
>>>    4 files changed, 980 insertions(+), 195 deletions(-)
>>>
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index 71458f493cf8..0515dca34d77 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -58,14 +58,15 @@
>> [...]
>>
>>> +
>>> +	if (vq->indirect) {
>>> +		u32 len;
>>> +
>>> +		desc = vq->desc_state[head].indir_desc;
>>> +		/* Free the indirect table, if any, now that it's unmapped. */
>>> +		if (!desc)
>>> +			goto out;
>>> +
>>> +		len = virtio32_to_cpu(vq->vq.vdev,
>>> +				      vq->vring_packed.desc[head].len);
>>> +
>>> +		BUG_ON(!(vq->vring_packed.desc[head].flags &
>>> +			 cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
>> It looks to me spec does not force to keep VRING_DESC_F_INDIRECT here. So we
>> can safely remove this BUG_ON() here.
>>
>>> +		BUG_ON(len == 0 || len % sizeof(struct vring_packed_desc));
>> Len could be ignored for used descriptor according to the spec, so we need
>> remove this BUG_ON() too.
> Yeah, you're right! The BUG_ON() isn't right. I'll remove it.
> And I think something related to this in the spec isn't very
> clear currently.
>
> In the spec, there are below words:
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L272
> """
> In descriptors with VIRTQ_DESC_F_INDIRECT set VIRTQ_DESC_F_WRITE
> is reserved and is ignored by the device.
> """
>
> So when device writes back an used descriptor in this case,
> device may not set the VIRTQ_DESC_F_WRITE flag as the flag
> is reserved and should be ignored.
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L170
> """
> Element Length is reserved for used descriptors without the
> VIRTQ_DESC_F_WRITE flag, and is ignored by drivers.
> """
>
> And this is the way how driver ignores the `len` in an used
> descriptor.
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L241
> """
> To increase ring capacity the driver can store a (read-only
> by the device) table of indirect descriptors anywhere in memory,
> and insert a descriptor in the main virtqueue (with \field{Flags}
> bit VIRTQ_DESC_F_INDIRECT on) that refers to a buffer element
> containing this indirect descriptor table;
> """
>
> So the indirect descriptors in the table are read-only by
> the device. And the only descriptor which is writeable by
> the device is the descriptor in the main virtqueue (with
> Flags bit VIRTQ_DESC_F_INDIRECT on). So if we ignore the
> `len` in this descriptor, we won't be able to get the
> length of the data written by the device.
>
> So I think the `len` in this descriptor will carry the
> length of the data written by the device (if the buffers
> are writable to the device) even if the VIRTQ_DESC_F_WRITE
> isn't set by the device. How do you think?

Yes I think so. But we'd better need clarification from Michael.

>
>
>> The reason is we don't touch descriptor ring in the case of split, so
>> BUG_ON()s may help there.
>>
>>> +
>>> +		for (j = 0; j < len / sizeof(struct vring_packed_desc); j++)
>>> +			vring_unmap_one_packed(vq, &desc[j]);
>>> +
>>> +		kfree(desc);
>>> +		vq->desc_state[head].indir_desc = NULL;
>>> +	} else if (ctx) {
>>> +		*ctx = vq->desc_state[head].indir_desc;
>>> +	}
>>> +
>>> +out:
>>> +	return vq->desc_state[head].num;
>>> +}
>>> +
>>> +static inline bool more_used_split(const struct vring_virtqueue *vq)
>>>    {
>>>    	return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
>>>    }
>>> +static inline bool more_used_packed(const struct vring_virtqueue *vq)
>>> +{
>>> +	u16 last_used, flags;
>>> +	bool avail, used;
>>> +
>>> +	if (vq->vq.num_free == vq->vring_packed.num)
>>> +		return false;
>>> +
>>> +	last_used = vq->last_used_idx;
>>> +	flags = virtio16_to_cpu(vq->vq.vdev,
>>> +				vq->vring_packed.desc[last_used].flags);
>>> +	avail = flags & VRING_DESC_F_AVAIL(1);
>>> +	used = flags & VRING_DESC_F_USED(1);
>>> +
>>> +	return avail == used;
>>> +}
>> This looks interesting, spec said:
>>
>> "
>> Thus VIRTQ_DESC_F_AVAIL and VIRTQ_DESC_F_USED bits are different for an
>> available descriptor and
>> equal for a used descriptor.
>> Note that this observation is mostly useful for sanity-checking as these are
>> necessary but not sufficient
>> conditions - for example, all descriptors are zero-initialized. To detect
>> used and available descriptors it is
>> possible for drivers and devices to keep track of the last observed value of
>> VIRTQ_DESC_F_USED/VIRTQ_-
>> DESC_F_AVAIL. Other techniques to detect
>> VIRTQ_DESC_F_AVAIL/VIRTQ_DESC_F_USED bit changes
>> might also be possible.
>> "
>>
>> So it looks to me it was not sufficient, looking at the example codes in
>> spec, do we need to track last seen used_wrap_counter here?
> I don't think we have to track used_wrap_counter in
> driver. There was a discussion on this:
>
> https://lists.oasis-open.org/archives/virtio-dev/201802/msg00177.html
>
> And after that, below sentence was added (it's also
> in the above words you quoted):
>
> """
> Other techniques to detect
> VIRTQ_DESC_F_AVAIL/VIRTQ_DESC_F_USED bit changes
> might also be possible.
> """
>
> Best regards,
> Tiwei Bie

I see, the extra condition "if (vq->vq.num_free == 
vq->vring_packed.num)" help in this case.

Thanks

>
>> Thanks

  reply	other threads:[~2018-04-24  0:55 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-01 14:12 [RFC v2] virtio: support packed ring Tiwei Bie
2018-04-10  2:55 ` Jason Wang
2018-04-10  3:21   ` Tiwei Bie
2018-04-13  4:30 ` Jason Wang
2018-04-13  7:15   ` Tiwei Bie
2018-04-17  2:11     ` Jason Wang
2018-04-17  2:17       ` Michael S. Tsirkin
2018-04-17  2:24         ` Jason Wang
2018-04-17  2:37           ` Michael S. Tsirkin
2018-04-17  2:51       ` Tiwei Bie
2018-04-17 12:17         ` Michael S. Tsirkin
2018-04-17 12:47           ` Tiwei Bie
2018-04-17 14:04             ` Michael S. Tsirkin
2018-04-17 14:56               ` Tiwei Bie
2018-04-17 15:54                 ` Michael S. Tsirkin
2018-04-18  1:17                   ` Tiwei Bie
2018-04-13 15:22 ` Michael S. Tsirkin
2018-04-14 11:22   ` Tiwei Bie
2018-04-23  5:42 ` Jason Wang
2018-04-23  9:29   ` Tiwei Bie
2018-04-24  0:54     ` Jason Wang [this message]
2018-04-24  1:05       ` Michael S. Tsirkin
2018-04-24  1:14         ` Jason Wang
2018-04-24  1:16         ` Tiwei Bie
2018-04-24  1:29           ` Michael S. Tsirkin
2018-04-24  1:37             ` Tiwei Bie
2018-04-24  1:43               ` Michael S. Tsirkin
2018-04-24  1:49                 ` Tiwei Bie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e08c55df-6ea2-cf7f-d7c9-91e55913ab16@redhat.com \
    --to=jasowang@redhat.com \
    --cc=jfreimann@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=tiwei.bie@intel.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=wexu@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).