All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonah Palmer <jonah.palmer@oracle.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, raphael@enfabrica.net,
	kwolf@redhat.com, hreitz@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, fam@euphon.net, stefanha@redhat.com,
	qemu-block@nongnu.org, schalla@marvell.com, leiyang@redhat.com,
	virtio-fs@lists.linux.dev, si-wei.liu@oracle.com,
	boris.ostrovsky@oracle.com
Subject: Re: [RFC v2 1/5] virtio: Initialize sequence variables
Date: Thu, 4 Apr 2024 10:41:26 -0400	[thread overview]
Message-ID: <42b8e8bf-ecfb-4633-b1b6-a33435f81cdc@oracle.com> (raw)
In-Reply-To: <CAJaqyWe3bw6WML9MLEhnpNuWL0_HUQJ4WrSyr1jnGHs6Jwn0-Q@mail.gmail.com>



On 4/4/24 7:35 AM, Eugenio Perez Martin wrote:
> On Wed, Apr 3, 2024 at 6:51 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
>>
>>
>>
>> On 4/3/24 6:18 AM, Eugenio Perez Martin wrote:
>>> On Thu, Mar 28, 2024 at 5:22 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
>>>>
>>>> Initialize sequence variables for VirtQueue and VirtQueueElement
>>>> structures. A VirtQueue's sequence variables are initialized when a
>>>> VirtQueue is being created or reset. A VirtQueueElement's sequence
>>>> variable is initialized when a VirtQueueElement is being initialized.
>>>> These variables will be used to support the VIRTIO_F_IN_ORDER feature.
>>>>
>>>> A VirtQueue's used_seq_idx represents the next expected index in a
>>>> sequence of VirtQueueElements to be processed (put on the used ring).
>>>> The next VirtQueueElement added to the used ring must match this
>>>> sequence number before additional elements can be safely added to the
>>>> used ring. It's also particularly useful for helping find the number of
>>>> new elements added to the used ring.
>>>>
>>>> A VirtQueue's current_seq_idx represents the current sequence index.
>>>> This value is essentially a counter where the value is assigned to a new
>>>> VirtQueueElement and then incremented. Given its uint16_t type, this
>>>> sequence number can be between 0 and 65,535.
>>>>
>>>> A VirtQueueElement's seq_idx represents the sequence number assigned to
>>>> the VirtQueueElement when it was created. This value must match with the
>>>> VirtQueue's used_seq_idx before the element can be put on the used ring
>>>> by the device.
>>>>
>>>> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
>>>> ---
>>>>    hw/virtio/virtio.c         | 18 ++++++++++++++++++
>>>>    include/hw/virtio/virtio.h |  1 +
>>>>    2 files changed, 19 insertions(+)
>>>>
>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>> index fb6b4ccd83..069d96df99 100644
>>>> --- a/hw/virtio/virtio.c
>>>> +++ b/hw/virtio/virtio.c
>>>> @@ -132,6 +132,10 @@ struct VirtQueue
>>>>        uint16_t used_idx;
>>>>        bool used_wrap_counter;
>>>>
>>>> +    /* In-Order sequence indices */
>>>> +    uint16_t used_seq_idx;
>>>> +    uint16_t current_seq_idx;
>>>> +
>>>
>>> I'm having a hard time understanding the difference between these and
>>> last_avail_idx and used_idx. It seems to me if we replace them
>>> everything will work? What am I missing?
>>>
>>
>> For used_seq_idx, it does work like used_idx except the difference is
>> when their values get updated, specifically for the split VQ case.
>>
>> As you know, for the split VQ case, the used_idx is updated during
>> virtqueue_split_flush. However, imagine a batch of elements coming in
>> where virtqueue_split_fill is called multiple times before
>> virtqueue_split_flush. We want to make sure we write these elements to
>> the used ring in-order and we'll know its order based on used_seq_idx.
>>
>> Alternatively, I thought about replicating the logic for the packed VQ
>> case (where this used_seq_idx isn't used) where we start looking at
>> vq->used_elems[vq->used_idx] and iterate through until we find a used
>> element, but I wasn't sure how to handle the case where elements get
>> used (written to the used ring) and new elements get put in used_elems
>> before the used_idx is updated. Since this search would require us to
>> always start at index vq->used_idx.
>>
>> For example, say, of three elements getting filled (elem0 - elem2),
>> elem1 and elem0 come back first (vq->used_idx = 0):
>>
>> elem1 - not in-order
>> elem0 - in-order, vq->used_elems[vq->used_idx + 1] (elem1) also now
>>           in-order, write elem0 and elem1 to used ring, mark elements as
>>           used
>>
>> Then elem2 comes back, but vq->used_idx is still 0, so how do we know to
>> ignore the used elements at vq->used_idx (elem0) and vq->used_idx + 1
>> (elem1) and iterate to vq->used_idx + 2 (elem2)?
>>
>> Hmm... now that I'm thinking about it, maybe for the split VQ case we
>> could continue looking through the vq->used_elems array until we find an
>> unused element... but then again how would we (1) know if the element is
>> in-order and (2) know when to stop searching?
>>
> 
> Ok I think I understand the problem now. It is aggravated if we add
> chained descriptors to the mix.
> 
> We know that the order of used descriptors must be the exact same as
> the order they were made available, leaving out in order batching.
> What if vq->used_elems at virtqueue_pop and then virtqueue_push just
> marks them as used somehow? Two booleans (or flag) would do for a
> first iteration.
> 
> If we go with this approach I think used_elems should be renamed actually.
> 

If I'm understanding correctly, I don't think adding newly created 
elements to vq->used_elems at virtqueue_pop will do much for us. We 
could just keep adding processed elements to vq->used_elems at 
virtqueue_fill but instead of:

vq->used_elems[seq_idx].in_num = elem->in_num;
vq->used_elems[seq_idx].out_num = elem->out_num;

We could do:

vq->used_elems[seq_idx].in_num = 1;
vq->used_elems[seq_idx].out_num = 1;

We'd use in_num and out_num as separate flags. in_num could indicate if 
this element has been written to the used ring while out_num could 
indicate if this element has been flushed (1 for no, 0 for yes). In 
other words, when we go to write to the used ring, start at index 
vq->used_idx and iterate through the used elements.

If a used element's in_num and out_num == 0, then this element is 
invalid (not yet processed) and we stop the search.

If a used element's in_num and out_num == 1, then this element is valid, 
written to the used ring, in_num is set to 0, and the search continues.

Lastly, if a used element's in_num == 0 but out_num == 1, then this 
element has already been written to the used ring but not yet flushed, 
so ignore this element and continue searching.

There should never be a case where in_num == 1 and out_num == 0.

However, this would probably mean that before (or right after) we 
actually perform the flush we'll have to iterate through the used_elems 
array one more time and set their out_num's to 0 to indicate the element 
has been flushed.

Again, this is just for the batched split VQ case where we have to keep 
track of elements that have been written but not flushed and elements 
that have been written and flushed, given that we don't know which 
elements have actually been written to the used ring until the used_idx 
is updated.

This approach appears more costly though if we're really trying to avoid 
having this new used_seq_idx VirtQueue member.

>> In any case, the use of this variable could be seen as an optimization
>> as its value will tell us where to start looking in vq->used_elems
>> instead of always starting at vq->used_idx.
>>
>> If this is like a one-shot scenario where one element gets written and
>> then flushed after, then yes in this case used_seq_idx == used_idx.
>>
>> ------
>>
>> For current_seq_idx, this is pretty much just a counter. Every new
>> VirtQueueElement created from virtqueue_pop is given a number and the
>> counter is incremented. Like grabbing a ticket number and waiting for
>> your number to be called. The next person to grab a ticket number will
>> be your number + 1.
>>
> 
> So it's like last_avail_idx, isn't it?
> 

For the split VQ case, pretty much. Though if we hit this case in 
virtqueue_split_pop, we may get into some trouble:

if (!virtqueue_get_head(vq, vq->last_avail_idx++, &head)) {
     goto done;
}

However for the packed VQ case, last_avail_idx might not work in the way 
we'd need it to for this implementation. In virtqueue_packed_pop, we see 
this:

elem->ndescs = (desc_cache == &indirect_desc_cache) ? 1 : elem_entries;
vq->last_avail_idx += elem->ndescs;

It would appear as though last_avail_idx is incremented by total number 
of descriptors associated with the element, which can be greater than 1. 
This implementation increments by 1 for each element.

Actually... It's good you mentioned this because I think my packed VQ 
implementation is wrong. For packed VQs, vq->used_idx is incremented by 
the total number of descriptors in the flushed elements and not 
necessarily the number of elements being flushed like in the split VQ 
case. I'm adding elements to vq->used_elems in a per-element sequence 
rather than going by the number of descriptors an element holds, which 
should be the case for packed VQs.

>> Let me know if I'm making any sense. Thanks :)
>>
>> Jonah
>>
>>>>        /* Last used index value we have signalled on */
>>>>        uint16_t signalled_used;
>>>>
>>>> @@ -1621,6 +1625,11 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t sz)
>>>>            elem->in_sg[i] = iov[out_num + i];
>>>>        }
>>>>
>>>> +    /* Assign sequence index for in-order processing */
>>>> +    if (virtio_vdev_has_feature(vdev, VIRTIO_F_IN_ORDER)) {
>>>> +        elem->seq_idx = vq->current_seq_idx++;
>>>> +    }
>>>> +
>>>>        vq->inuse++;
>>>>
>>>>        trace_virtqueue_pop(vq, elem, elem->in_num, elem->out_num);
>>>> @@ -1760,6 +1769,11 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t sz)
>>>>        vq->shadow_avail_idx = vq->last_avail_idx;
>>>>        vq->shadow_avail_wrap_counter = vq->last_avail_wrap_counter;
>>>>
>>>> +    /* Assign sequence index for in-order processing */
>>>> +    if (virtio_vdev_has_feature(vdev, VIRTIO_F_IN_ORDER)) {
>>>> +        elem->seq_idx = vq->current_seq_idx++;
>>>> +    }
>>>> +
>>>>        trace_virtqueue_pop(vq, elem, elem->in_num, elem->out_num);
>>>>    done:
>>>>        address_space_cache_destroy(&indirect_desc_cache);
>>>> @@ -2087,6 +2101,8 @@ static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
>>>>        vdev->vq[i].notification = true;
>>>>        vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
>>>>        vdev->vq[i].inuse = 0;
>>>> +    vdev->vq[i].used_seq_idx = 0;
>>>> +    vdev->vq[i].current_seq_idx = 0;
>>>>        virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
>>>>    }
>>>>
>>>> @@ -2334,6 +2350,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
>>>>        vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
>>>>        vdev->vq[i].handle_output = handle_output;
>>>>        vdev->vq[i].used_elems = g_new0(VirtQueueElement, queue_size);
>>>> +    vdev->vq[i].used_seq_idx = 0;
>>>> +    vdev->vq[i].current_seq_idx = 0;
>>>>
>>>>        return &vdev->vq[i];
>>>>    }
>>>> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
>>>> index b3c74a1bca..910b2a3427 100644
>>>> --- a/include/hw/virtio/virtio.h
>>>> +++ b/include/hw/virtio/virtio.h
>>>> @@ -75,6 +75,7 @@ typedef struct VirtQueueElement
>>>>        hwaddr *out_addr;
>>>>        struct iovec *in_sg;
>>>>        struct iovec *out_sg;
>>>> +    uint16_t seq_idx;
>>>>    } VirtQueueElement;
>>>>
>>>>    #define VIRTIO_QUEUE_MAX 1024
>>>> --
>>>> 2.39.3
>>>>
>>>
>>
> 

  reply	other threads:[~2024-04-04 14:41 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 16:21 [RFC v2 0/5] virtio,vhost: Add VIRTIO_F_IN_ORDER support Jonah Palmer
2024-03-28 16:21 ` [RFC v2 1/5] virtio: Initialize sequence variables Jonah Palmer
2024-04-03 10:18   ` Eugenio Perez Martin
2024-04-03 16:51     ` Jonah Palmer
2024-04-04 11:35       ` Eugenio Perez Martin
2024-04-04 14:41         ` Jonah Palmer [this message]
2024-04-04 16:33           ` Eugenio Perez Martin
2024-04-05 13:58             ` Jonah Palmer
2024-04-05 15:04               ` Eugenio Perez Martin
2024-04-05 15:37                 ` Jonah Palmer
2024-03-28 16:22 ` [RFC v2 2/5] virtio: In-order support for split VQs Jonah Palmer
2024-03-28 16:22 ` [RFC v2 3/5] virtio: In-order support for packed VQs Jonah Palmer
2024-03-28 16:22 ` [RFC v2 4/5] vhost,vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits Jonah Palmer
2024-03-28 16:22   ` [RFC v2 4/5] vhost, vhost-user: " Jonah Palmer
2024-03-28 16:22 ` [RFC v2 5/5] virtio: Add VIRTIO_F_IN_ORDER property definition Jonah Palmer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42b8e8bf-ecfb-4633-b1b6-a33435f81cdc@oracle.com \
    --to=jonah.palmer@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=eperezma@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=leiyang@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=raphael@enfabrica.net \
    --cc=schalla@marvell.com \
    --cc=si-wei.liu@oracle.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-fs@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.