From: Eugenio Perez Martin <eperezma@redhat.com>
To: Guo Zhi <qtxuning1999@sjtu.edu.cn>
Cc: Jason Wang <jasowang@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Michael Tsirkin <mst@redhat.com>, netdev <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	kvm list <kvm@vger.kernel.org>,
	virtualization <virtualization@lists.linux-foundation.org>
Subject: Re: [RFC 1/5] vhost: reorder used descriptors in a batch
Date: Fri, 22 Jul 2022 09:07:17 +0200
Message-ID: <CAJaqyWcP3CQoqN=oQ2c3d9UbGPgSS+j18CA5NO5JGAW64Z+H-Q@mail.gmail.com>
In-Reply-To: <20220721084341.24183-2-qtxuning1999@sjtu.edu.cn>

On Thu, Jul 21, 2022 at 10:44 AM Guo Zhi <qtxuning1999@sjtu.edu.cn> wrote:
>
> Device may not use descriptors in order, for example, NIC and SCSI may
> not call __vhost_add_used_n with buffers in order.  It's the task of
> __vhost_add_used_n to order them.  This commit reorder the buffers using
> vq->heads, only the batch is begin from the expected start point and is
> continuous can the batch be exposed to driver.  And only writing out a
> single used ring for a batch of descriptors, according to VIRTIO 1.1
> spec.
>
> Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
> ---
>  drivers/vhost/vhost.c | 44 +++++++++++++++++++++++++++++++++++++++++--
>  drivers/vhost/vhost.h |  3 +++
>  2 files changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 40097826c..e2e77e29f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -317,6 +317,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
>         vq->used_flags = 0;
>         vq->log_used = false;
>         vq->log_addr = -1ull;
> +       vq->next_used_head_idx = 0;
>         vq->private_data = NULL;
>         vq->acked_features = 0;
>         vq->acked_backend_features = 0;
> @@ -398,6 +399,8 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
>                                           GFP_KERNEL);
>                 if (!vq->indirect || !vq->log || !vq->heads)
>                         goto err_nomem;
> +
> +               memset(vq->heads, 0, sizeof(*vq->heads) * dev->iov_limit);
>         }
>         return 0;
>
> @@ -2374,12 +2377,49 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
>                             unsigned count)
>  {
>         vring_used_elem_t __user *used;
> +       struct vring_desc desc;
>         u16 old, new;
>         int start;
> +       int begin, end, i;
> +       int copy_n = count;
> +
> +       if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
> +               /* calculate descriptor chain length for each used buffer */
> +               for (i = 0; i < count; i++) {
> +                       begin = heads[i].id;
> +                       end = begin;
> +                       vq->heads[begin].len = 0;
> +                       do {
> +                               vq->heads[begin].len += 1;
> +                               if (unlikely(vhost_get_desc(vq, &desc, end))) {
> +                                       vq_err(vq, "Failed to get descriptor: idx %d addr %p\n",
> +                                              end, vq->desc + end);
> +                                       return -EFAULT;
> +                               }
> +                       } while ((end = next_desc(vq, &desc)) != -1);
> +               }
> +
> +               count = 0;
> +               /* sort and batch continuous used ring entry */
> +               while (vq->heads[vq->next_used_head_idx].len != 0) {
> +                       count++;
> +                       i = vq->next_used_head_idx;
> +                       vq->next_used_head_idx = (vq->next_used_head_idx +
> +                                                 vq->heads[vq->next_used_head_idx].len)
> +                                                 % vq->num;
> +                       vq->heads[i].len = 0;
> +               }

You're iterating vq->heads with two different indexes here.

The first loop is working with indexes [0, count), which is fine if
heads is a "cache" and everything can be overwritten (as it used to be
before this patch).

The other loop relies on vq->next_used_head_idx, which is saved between calls.

So both uses are going to conflict with each other.

One way to check this is to fill the chains with incrementing data in
the virtio_test driver, and verify that they are returned properly.
For example, the first buffer in the chain holds the value N, the
second one N+1, and so on.
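
A minimal sketch of that check, in plain C. The helper names chain_fill
and chain_check are hypothetical and not part of virtio_test; the
buffers are modelled as a flat array of uint32_t, which is an
assumption made only to keep the example short:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: buffer i of the chain gets the value first + i. */
static void chain_fill(uint32_t *bufs, size_t n, uint32_t first)
{
	for (size_t i = 0; i < n; i++)
		bufs[i] = first + (uint32_t)i;
}

/* Hypothetical helper: true if the chain still holds first, first + 1, ... */
static bool chain_check(const uint32_t *bufs, size_t n, uint32_t first)
{
	for (size_t i = 0; i < n; i++)
		if (bufs[i] != first + (uint32_t)i)
			return false;
	return true;
}
```

The driver side would call chain_fill before making a chain available
and chain_check once it comes back as used; a mismatch would point at
chain bookkeeping being overwritten.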

Let's split saving chains into its own patch.


> +               /* only write out a single used ring entry with the id corresponding
> +                * to the head entry of the descriptor chain describing the last buffer
> +                * in the batch.
> +                */

Let's delay the batching for now; we can add it later as an
optimization on top, in the case of devices.

My proposal is to define a new struct vring_used_elem_inorder:

struct vring_used_elem_inorder {
    uint16_t written;
    uint16_t num;
};

Then create a per-vq array of them, of size vq->num. Let's call it
used_inorder, for example.

Every time the device uses a buffer chain of N buffers, with written
length L and first descriptor id D, it stores vq->used_inorder[D] =
{ .written = L, .num = N }. .num == 0 means the buffer has not been
used yet.

After storing that information, you have your next_used_head_idx. You
can check if vq->used_inorder[next_used_head_idx] is used (.num != 0).
If it is not, there is nothing to do for now.

If it is, you iterate vq->used_inorder. First you write
next_used_head_idx as used. After that, next_used_head_idx increments
by .num, and .num needs to be cleared. If
vq->used_inorder[vq->next_used_head_idx] is used too, repeat.

I think we could even squash vq->heads and vq->used_inorder with some
tricks, because a chain's length would always be greater than or equal
to the used descriptor one, but storing them in a different array
would be clearer.
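
To make the steps above concrete, here is a self-contained sketch in
userspace C. It is not vhost code: struct inorder_vq, VQ_NUM and
inorder_mark_used are hypothetical names, the ring size is fixed at 8
for brevity, and the used ring is modelled as two output arrays. It
also assumes the driver hands out descriptor ids sequentially, so the
next expected head is the current head plus .num, modulo the ring size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VQ_NUM 8u /* hypothetical ring size */

struct vring_used_elem_inorder {
	uint16_t written; /* length written by the device */
	uint16_t num;     /* descriptors in the chain; 0 = not used yet */
};

struct inorder_vq { /* hypothetical stand-in for the per-vq state */
	struct vring_used_elem_inorder used_inorder[VQ_NUM];
	uint16_t next_used_head_idx;
};

/*
 * Record that the chain with head id d, n descriptors and written
 * length l has been used, then flush every chain that is now in order.
 * Flushed head ids and lengths go to out_ids/out_lens (at most out_cap
 * entries). Returns the number of entries flushed.
 */
static size_t inorder_mark_used(struct inorder_vq *vq, uint16_t d, uint16_t n,
				uint16_t l, uint16_t *out_ids,
				uint16_t *out_lens, size_t out_cap)
{
	size_t flushed = 0;

	assert(d < VQ_NUM && n != 0 && n <= VQ_NUM);
	vq->used_inorder[d].written = l;
	vq->used_inorder[d].num = n;

	/* Nothing is exposed until the expected head itself has been used. */
	while (flushed < out_cap &&
	       vq->used_inorder[vq->next_used_head_idx].num != 0) {
		uint16_t head = vq->next_used_head_idx;
		struct vring_used_elem_inorder *e = &vq->used_inorder[head];

		out_ids[flushed] = head;
		out_lens[flushed] = e->written;
		flushed++;
		/* Advance by the chain length, then clear the slot. */
		vq->next_used_head_idx = (uint16_t)((head + e->num) % VQ_NUM);
		e->num = 0;
	}
	return flushed;
}
```

For example, if the chain at head 2 completes before the chain at head
0 (which spans two descriptors), the first call flushes nothing and the
second call flushes both, in order.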

> +               heads[0].id = i;
> +               copy_n = 1;

The device must not write anything to the used ring if the next
descriptor has not been used. I'm failing to trace how this works when
the second half of the batch in vhost/test.c is used here.

Thanks!


> +       }
>
>         start = vq->last_used_idx & (vq->num - 1);
>         used = vq->used->ring + start;
> -       if (vhost_put_used(vq, heads, start, count)) {
> +       if (vhost_put_used(vq, heads, start, copy_n)) {
>                 vq_err(vq, "Failed to write used");
>                 return -EFAULT;
>         }
> @@ -2410,7 +2450,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>
>         start = vq->last_used_idx & (vq->num - 1);
>         n = vq->num - start;
> -       if (n < count) {
> +       if (n < count && !vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
>                 r = __vhost_add_used_n(vq, heads, n);
>                 if (r < 0)
>                         return r;
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index d9109107a..7b2c0fbb5 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -107,6 +107,9 @@ struct vhost_virtqueue {
>         bool log_used;
>         u64 log_addr;
>
> +       /* Sort heads in order */
> +       u16 next_used_head_idx;
> +
>         struct iovec iov[UIO_MAXIOV];
>         struct iovec iotlb_iov[64];
>         struct iovec *indirect;
> --
> 2.17.1
>

