All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eugenio Perez Martin <eperezma@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	qemu-level <qemu-devel@nongnu.org>,
	Markus Armbruster <armbru@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Xiao W Wang <xiao.w.wang@intel.com>,
	Harpreet Singh Anand <hanand@xilinx.com>,
	Eli Cohen <eli@mellanox.com>, Paolo Bonzini <pbonzini@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	virtualization <virtualization@lists.linux-foundation.org>,
	Parav Pandit <parav@mellanox.com>
Subject: Re: [RFC PATCH v5 23/26] util: Add iova_tree_alloc
Date: Thu, 27 Jan 2022 11:09:44 +0100	[thread overview]
Message-ID: <CAJaqyWd52cXWHnLsgo=kP2Z7=VG6YKtxFGhTe7OamfYcZxhz6w@mail.gmail.com> (raw)
In-Reply-To: <YfJeZPn6nsCUxFiL@xz-m1.local>

On Thu, Jan 27, 2022 at 9:57 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Oct 29, 2021 at 08:35:22PM +0200, Eugenio Pérez wrote:
> > This iova tree function allows it to look for a hole in allocated
> > regions and return a totally new translation for a given translated
> > address.
> >
> > It's usage is mainly to allow devices to access qemu address space,
> > remapping guest's one into a new iova space where qemu can add chunks of
> > addresses.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  include/qemu/iova-tree.h |  17 +++++
> >  util/iova-tree.c         | 139 +++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 156 insertions(+)
> >
> > diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
> > index 8249edd764..33f9b2e13f 100644
> > --- a/include/qemu/iova-tree.h
> > +++ b/include/qemu/iova-tree.h
> > @@ -29,6 +29,7 @@
> >  #define  IOVA_OK           (0)
> >  #define  IOVA_ERR_INVALID  (-1) /* Invalid parameters */
> >  #define  IOVA_ERR_OVERLAP  (-2) /* IOVA range overlapped */
> > +#define  IOVA_ERR_NOMEM    (-3) /* Cannot allocate */
> >
> >  typedef struct IOVATree IOVATree;
> >  typedef struct DMAMap {
> > @@ -119,6 +120,22 @@ const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova);
> >   */
> >  void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator);
> >
> > +/**
> > + * iova_tree_alloc:
> > + *
> > + * @tree: the iova tree to allocate from
> > + * @map: the new map (as translated addr & size) to allocate in iova region
> > + * @iova_begin: the minimum address of the allocation
> > + * @iova_end: the maximum addressable direction of the allocation
> > + *
> > + * Allocates a new region of a given size, between iova_min and iova_max.
> > + *
> > + * Return: Same as iova_tree_insert, but cannot overlap and can be out of
> > + * free contiguous range. Caller can get the assigned iova in map->iova.
> > + */
> > +int iova_tree_alloc(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
> > +                    hwaddr iova_end);
> > +
> >  /**
> >   * iova_tree_destroy:
> >   *
> > diff --git a/util/iova-tree.c b/util/iova-tree.c
> > index 23ea35b7a4..27c921c4e2 100644
> > --- a/util/iova-tree.c
> > +++ b/util/iova-tree.c
> > @@ -16,6 +16,36 @@ struct IOVATree {
> >      GTree *tree;
> >  };
> >
> > +/* Args to pass to iova_tree_alloc foreach function. */
> > +struct IOVATreeAllocArgs {
> > +    /* Size of the desired allocation */
> > +    size_t new_size;
> > +
> > +    /* The minimum address allowed in the allocation */
> > +    hwaddr iova_begin;
> > +
> > +    /* The last addressable allowed in the allocation */
> > +    hwaddr iova_last;
> > +
> > +    /* Previously-to-last iterated map, can be NULL in the first node */
> > +    const DMAMap *hole_left;
> > +
> > +    /* Last iterated map */
> > +    const DMAMap *hole_right;
>
> I slightly prefer having two more fields to cache the result:
>
>        /* If found, we fill in the IOVA here */
>        hwaddr iova_result;
>        /* Whether have we found a valid IOVA */
>        bool   iova_found;
>
> IMHO they'll help on readability.  More below.
>

Sure, this avoids an extra call.

> > +};
> > +
> > +/**
> > + * Iterate args to tne next hole
> > + *
> > + * @args  The alloc arguments
> > + * @next  The next mapping in the tree. Can be NULL to signal the last one
> > + */
> > +static void iova_tree_alloc_args_iterate(struct IOVATreeAllocArgs *args,
> > +                                         const DMAMap *next) {
> > +    args->hole_left = args->hole_right;
> > +    args->hole_right = next;
> > +}
> > +
> >  static int iova_tree_compare(gconstpointer a, gconstpointer b, gpointer data)
> >  {
> >      const DMAMap *m1 = a, *m2 = b;
> > @@ -107,6 +137,115 @@ int iova_tree_remove(IOVATree *tree, const DMAMap *map)
> >      return IOVA_OK;
> >  }
> >
> > +/**
> > + * Try to accomodate a map of size ret->size in a hole between
> > + * max(end(hole_left), iova_start).
>
> I think this functions need the most comments, and above sentence is more or
> less not sounding correct... My try...
>
> /*
>  * Try to find an unallocated IOVA range between LEFT and RIGHT elements.
>  *
>  * There're three cases:
>  *
>  * (1) When LEFT==NULL, RIGHT must be non-NULL and it means we're iterating at
>  *     the 1st element.
>  *
>  * (2) When RIGHT==NULL, LEFT must be non-NULL and it means we're iterating at
>  *     the last element.
>  *
>  * (3) When both LEFT and RIGHT are non-NULL, this is the most common case,
>  *     we'll try to find a hole between LEFT and RIGHT mapping.
>  */
>

This is also called with left == NULL and right == NULL in the first
allocation with an empty tree. This allows iova_tree_alloc to have the
same code path both if the three is empty or not.

But I can add the use cases in the doc for sure.

> > + *
> > + * @args Arguments to allocation
> > + */
> > +static bool iova_tree_alloc_map_in_hole(const struct IOVATreeAllocArgs *args)
> > +{
> > +    const DMAMap *left = args->hole_left, *right = args->hole_right;
> > +    uint64_t hole_start, hole_last;
> > +
> > +    if (right && right->iova + right->size < args->iova_begin) {
> > +        return false;
> > +    }
> > +
> > +    if (left && left->iova > args->iova_last) {
> > +        return false;
> > +    }
> > +
> > +    hole_start = MAX(left ? left->iova + left->size + 1 : 0, args->iova_begin);
> > +    hole_last = MIN(right ? right->iova : HWADDR_MAX, args->iova_last);
>
> I assume these values should be always inclusive, hence
>
> s/right->iova/right->iova + 1/
>
> ?
>

Right, it is confusing the way it's written. But I think it should be
right->iova - 1 in any case to make it the hole last element, isn't
it?

Would it work better to rename variable hole_last to hole_end? If not,
we have a special case of the second allocation when iova_begin == 0:
- We successfully allocate a DMAMap of size N. By the way the
algorithm works,  it starts at 0, so [0, N] is allocated.
- We try to allocate a second one of size M. At the first iteration,
"right" is the previously allocated DMAMap.
Using the -1 trick we get hole_end == HWADDR_MAX.

> > +
> > +    if (hole_last - hole_start > args->new_size) {
> > +        /* We found a valid hole. */
>
> IMHO it's cleaner we simply set:
>
>            args->iova_result = hole_start;
>
> Here before stop the iterations.
>

I agree.

> > +        return true;
> > +    }
> > +
> > +    /* Keep iterating */
> > +    return false;
> > +}
> > +
> > +/**
> > + * Foreach dma node in the tree, compare if there is a hole wit its previous
> > + * node (or minimum iova address allowed) and the node.
> > + *
> > + * @key   Node iterating
> > + * @value Node iterating
> > + * @pargs Struct to communicate with the outside world
> > + *
> > + * Return: false to keep iterating, true if needs break.
> > + */
> > +static gboolean iova_tree_alloc_traverse(gpointer key, gpointer value,
> > +                                         gpointer pargs)
> > +{
> > +    struct IOVATreeAllocArgs *args = pargs;
> > +    DMAMap *node = value;
> > +
> > +    assert(key == value);
> > +
> > +    iova_tree_alloc_args_iterate(args, node);
> > +    if (args->hole_left && args->hole_left->iova > args->iova_last) {
>
> IMHO this check is redundant and can be dropped, as it's already done in
> iova_tree_alloc_map_in_hole().
>

Assuming we add "iova_found" to iova_tree_alloc_map_in_hole to
IOVATreeAllocArgs as you propose, it returns true if we are able to
allocate a DMAMap entry, so no more iterations are needed. But if it
returns false, it simply means that DMAMap cannot be allocated between
left (or iova_begin) and right (iova_end). It doesn't tell if you can
keep iterating or not. In other words, false == keep iterating if you
can.

This other check signals the end of the available hole, and to avoid
iterating beyond iova_last in the (unlikely?) case we have more nodes
to iterate beyond that.

I'll try to make it more explicit.

> > +        return true;
> > +    }
> > +
> > +    if (iova_tree_alloc_map_in_hole(args)) {
> > +        return true;
> > +    }
> > +
> > +    return false;
> > +}
> > +
> > +int iova_tree_alloc(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
> > +                    hwaddr iova_last)
> > +{
> > +    struct IOVATreeAllocArgs args = {
> > +        .new_size = map->size,
> > +        .iova_begin = iova_begin,
> > +        .iova_last = iova_last,
> > +    };
> > +
> > +    if (iova_begin == 0) {
> > +        /* Some devices does not like addr 0 */
> > +        iova_begin += qemu_real_host_page_size;
> > +    }
>
> (This should be dropped as the new version goes)
>

Agree.

> > +
> > +    assert(iova_begin < iova_last);
> > +
> > +    /*
> > +     * Find a valid hole for the mapping
> > +     *
> > +     * Assuming low iova_begin, so no need to do a binary search to
> > +     * locate the first node.
>
> We could also mention something like this here:
>
>         *
>         * The traversing will cover all the possible holes but except the last
>         * hole starting from the last element.  We need to handle it separately
>         * below.
>         *
>

Ok I will add the comment.

> > +     *
> > +     * TODO: We can improve the search speed if we save the beginning and the
> > +     * end of holes, so we don't iterate over the previous saved ones.
> > +     *
> > +     * TODO: Replace all this with g_tree_node_first/next/last when available
> > +     * (from glib since 2.68). To do it with g_tree_foreach complicates the
> > +     * code a lot.
> > +     *
> > +     */
> > +    g_tree_foreach(tree->tree, iova_tree_alloc_traverse, &args);
> > +    if (!iova_tree_alloc_map_in_hole(&args)) {
>
> With iova_found, here it could be (hopefully) more readable:
>
>        if (!args->iova_found) {
>            /* If we failed to find a hole in 0..N-1 entries, try the last one */
>            iova_tree_alloc_args_iterate(&args, NULL);
>            iova_tree_alloc_map_in_hole(&args);
>            if (!args->iova_found) {
>                return IOVA_ERR_NOMEM;
>            }
>        }
>
>        map->iova = args->iova_result;
>        ...
>

I also think it's more readable this way.

Thanks!

> Thanks,
>
> > +        /*
> > +         * 2nd try: Last iteration left args->right as the last DMAMap. But
> > +         * (right, end) hole needs to be checked too
> > +         */
> > +        iova_tree_alloc_args_iterate(&args, NULL);
> > +        if (!iova_tree_alloc_map_in_hole(&args)) {
> > +            return IOVA_ERR_NOMEM;
> > +        }
> > +    }
> > +
> > +    map->iova = MAX(iova_begin,
> > +                    args.hole_left ?
> > +                    args.hole_left->iova + args.hole_left->size + 1 : 0);
> > +    return iova_tree_insert(tree, map);
> > +}
> > +
> >  void iova_tree_destroy(IOVATree *tree)
> >  {
> >      g_tree_destroy(tree->tree);
> > --
> > 2.27.0
> >
>
> --
> Peter Xu
>



  reply	other threads:[~2022-01-27 10:26 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-29 18:34 [RFC PATCH v5 00/26] vDPA shadow virtqueue Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 01/26] util: Make some iova_tree parameters const Eugenio Pérez
2021-10-31 18:59   ` Juan Quintela
2021-10-31 18:59     ` Juan Quintela
2021-11-01  8:20     ` Eugenio Perez Martin
2021-10-29 18:35 ` [RFC PATCH v5 02/26] vhost: Fix last queue index of devices with no cvq Eugenio Pérez
2021-11-02  7:25   ` Juan Quintela
2021-11-02  7:25     ` Juan Quintela
2021-11-02  7:32     ` Michael S. Tsirkin
2021-11-02  7:32       ` Michael S. Tsirkin
2021-11-02  7:39       ` Juan Quintela
2021-11-02  7:39         ` Juan Quintela
2021-11-02  8:34     ` Eugenio Perez Martin
2021-11-02  7:40   ` Juan Quintela
2021-11-02  7:40     ` Juan Quintela
2021-10-29 18:35 ` [RFC PATCH v5 03/26] virtio: Add VIRTIO_F_QUEUE_STATE Eugenio Pérez
2021-11-02  4:57   ` Jason Wang
2021-11-02  4:57     ` Jason Wang
2021-10-29 18:35 ` [RFC PATCH v5 04/26] virtio-net: Honor VIRTIO_CONFIG_S_DEVICE_STOPPED Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 05/26] vhost: Add x-vhost-set-shadow-vq qmp Eugenio Pérez
2021-11-02  7:36   ` Juan Quintela
2021-11-02  7:36     ` Juan Quintela
2021-11-02  8:29     ` Eugenio Perez Martin
2021-10-29 18:35 ` [RFC PATCH v5 06/26] vhost: Add VhostShadowVirtqueue Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 07/26] vdpa: Save kick_fd in vhost-vdpa Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 08/26] vdpa: Add vhost_svq_get_dev_kick_notifier Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 09/26] vdpa: Add vhost_svq_set_svq_kick_fd Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 10/26] vhost: Add Shadow VirtQueue kick forwarding capabilities Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 11/26] vhost: Handle host notifiers in SVQ Eugenio Pérez
2021-11-02  7:54   ` Jason Wang
2021-11-02  7:54     ` Jason Wang
2021-11-02  8:46     ` Eugenio Perez Martin
     [not found]       ` <CACGkMEvOxUMo1WA4tUfDhw+FOJVW87JJGPw=U3JvUSQTU_ogWQ@mail.gmail.com>
     [not found]         ` <CAJaqyWd4DQwRSL5StCft+3-uq12TW5x1o4DN_YW97D0JzOr2XQ@mail.gmail.com>
2021-11-04  2:31           ` Jason Wang
2021-11-04  2:31             ` Jason Wang
2021-10-29 18:35 ` [RFC PATCH v5 12/26] vhost: Route guest->host notification through shadow virtqueue Eugenio Pérez
2021-11-02  5:36   ` Jason Wang
2021-11-02  5:36     ` Jason Wang
2021-11-02  7:35     ` Eugenio Perez Martin
2021-10-29 18:35 ` [RFC PATCH v5 13/26] Add vhost_svq_get_svq_call_notifier Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 14/26] Add vhost_svq_set_guest_call_notifier Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 15/26] vdpa: Save call_fd in vhost-vdpa Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 16/26] vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 17/26] vhost: Route host->guest notification through shadow virtqueue Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 18/26] virtio: Add vhost_shadow_vq_get_vring_addr Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 19/26] vdpa: ack VIRTIO_F_QUEUE_STATE if device supports it Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 20/26] vhost: Add vhost_svq_valid_device_features to shadow vq Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 21/26] vhost: Add vhost_svq_valid_guest_features " Eugenio Pérez
2021-11-02  5:25   ` Jason Wang
2021-11-02  5:25     ` Jason Wang
2021-11-02  8:09     ` Eugenio Perez Martin
2021-11-03  3:18       ` Jason Wang
2021-11-03  3:18         ` Jason Wang
2021-11-03  7:43         ` Eugenio Perez Martin
2021-11-04  2:34           ` Jason Wang
2021-11-04  2:34             ` Jason Wang
2021-10-29 18:35 ` [RFC PATCH v5 22/26] vhost: Shadow virtqueue buffers forwarding Eugenio Pérez
2021-11-02  7:59   ` Jason Wang
2021-11-02  7:59     ` Jason Wang
2021-11-02 10:22     ` Eugenio Perez Martin
2021-10-29 18:35 ` [RFC PATCH v5 23/26] util: Add iova_tree_alloc Eugenio Pérez
2021-11-02  6:35   ` Jason Wang
2021-11-02  6:35     ` Jason Wang
2021-11-02  8:28     ` Eugenio Perez Martin
2021-11-03  3:10       ` Jason Wang
2021-11-03  3:10         ` Jason Wang
2021-11-03  7:41         ` Eugenio Perez Martin
2021-11-23  6:56   ` Peter Xu
2021-11-23  6:56     ` Peter Xu
2021-11-23  7:08     ` Eugenio Perez Martin
2022-01-27  8:57   ` Peter Xu
2022-01-27  8:57     ` Peter Xu
2022-01-27 10:09     ` Eugenio Perez Martin [this message]
2022-01-27 11:25       ` Peter Xu
2022-01-27 11:25         ` Peter Xu
2022-01-27 11:45         ` Eugenio Perez Martin
2021-10-29 18:35 ` [RFC PATCH v5 24/26] vhost: Add VhostIOVATree Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 25/26] vhost: Use a tree to store memory mappings Eugenio Pérez
2021-10-29 18:35 ` [RFC PATCH v5 26/26] vdpa: Add custom IOTLB translations to SVQ Eugenio Pérez
2021-11-01  9:06 ` [RFC PATCH v5 00/26] vDPA shadow virtqueue Eugenio Perez Martin
2021-11-02  4:25 ` Jason Wang
2021-11-02  4:25   ` Jason Wang
2021-11-02 11:21   ` Eugenio Perez Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJaqyWd52cXWHnLsgo=kP2Z7=VG6YKtxFGhTe7OamfYcZxhz6w@mail.gmail.com' \
    --to=eperezma@redhat.com \
    --cc=armbru@redhat.com \
    --cc=eblake@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=eli@mellanox.com \
    --cc=hanand@xilinx.com \
    --cc=jasowang@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=mst@redhat.com \
    --cc=parav@mellanox.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=sgarzare@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xiao.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.