From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eugenio Perez Martin <eperezma@redhat.com>
Date: Tue, 1 Mar 2022 09:50:08 +0100
Subject: Re: [PATCH v2 10/14] vdpa: Add custom IOTLB translations to SVQ
To: Jason Wang
Cc: "Michael S. Tsirkin", qemu-level, Peter Xu, virtualization,
	Eli Cohen, Eric Blake, Parav Pandit, Cindy Lu, "Fangyi (Eric)",
	Markus Armbruster, yebiaoxiang@huawei.com, Liuxiangdong,
	Stefano Garzarella, Laurent Vivier, Eduardo Habkost,
	Richard Henderson, Gautam Dawar, Xiao W Wang, Stefan Hajnoczi,
	Juan Quintela, Harpreet Singh Anand, Paolo Bonzini, Lingshan

On Mon, Feb 28, 2022 at 8:37 AM Jason Wang wrote:
>
>
> On 2022/2/27 9:41 PM, Eugenio Pérez wrote:
> > Use translations added in VhostIOVATree in SVQ.
> >
> > Only introduce usage here, not allocation and deallocation. As with
> > previous patches, we use the dead code paths of shadow_vqs_enabled to
> > avoid committing too many changes at once. These paths are impossible
> > to take at the moment.
> >
> > Signed-off-by: Eugenio Pérez
> > ---
> >   hw/virtio/vhost-shadow-virtqueue.h |   6 +-
> >   include/hw/virtio/vhost-vdpa.h     |   3 +
> >   hw/virtio/vhost-shadow-virtqueue.c |  76 ++++++++++++++++-
> >   hw/virtio/vhost-vdpa.c             | 128 ++++++++++++++++++++++++-----
> >   4 files changed, 187 insertions(+), 26 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > index 04c67685fd..b2f722d101 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -13,6 +13,7 @@
> >   #include "qemu/event_notifier.h"
> >   #include "hw/virtio/virtio.h"
> >   #include "standard-headers/linux/vhost_types.h"
> > +#include "hw/virtio/vhost-iova-tree.h"
> >
> >   /* Shadow virtqueue to relay notifications */
> >   typedef struct VhostShadowVirtqueue {
> > @@ -43,6 +44,9 @@ typedef struct VhostShadowVirtqueue {
> >       /* Virtio device */
> >       VirtIODevice *vdev;
> >
> > +    /* IOVA mapping */
> > +    VhostIOVATree *iova_tree;
> > +
> >       /* Map for use the guest's descriptors */
> >       VirtQueueElement **ring_id_maps;
> >
> > @@ -78,7 +82,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
> >                        VirtQueue *vq);
> >   void vhost_svq_stop(VhostShadowVirtqueue *svq);
> >
> > -VhostShadowVirtqueue *vhost_svq_new(void);
> > +VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree);
> >
> >   void vhost_svq_free(gpointer vq);
> >   G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 009a9f3b6b..ee8e939ad0 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -14,6 +14,7 @@
> >
> >   #include <gmodule.h>
> >
> > +#include "hw/virtio/vhost-iova-tree.h"
> >   #include "hw/virtio/virtio.h"
> >   #include "standard-headers/linux/vhost_types.h"
> >
> > @@ -30,6 +31,8 @@ typedef struct vhost_vdpa {
> >       MemoryListener listener;
> >       struct vhost_vdpa_iova_range iova_range;
> >       bool shadow_vqs_enabled;
> > +    /* IOVA mapping used by the Shadow Virtqueue */
> > +    VhostIOVATree *iova_tree;
> >       GPtrArray *shadow_vqs;
> >       struct vhost_dev *dev;
> >       VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > index a38d313755..7e073773d1 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -11,6 +11,7 @@
> >   #include "hw/virtio/vhost-shadow-virtqueue.h"
> >
> >   #include "qemu/error-report.h"
> > +#include "qemu/log.h"
> >   #include "qemu/main-loop.h"
> >   #include "qemu/log.h"
> >   #include "linux-headers/linux/vhost.h"
> > @@ -84,7 +85,58 @@ static void vhost_svq_set_notification(VhostShadowVirtqueue *svq, bool enable)
> >       }
> >   }
> >
> > +/**
> > + * Translate addresses between qemu's virtual address space and the SVQ IOVA
> > + *
> > + * @svq    Shadow VirtQueue
> > + * @addrs  Destination array of translated IOVA addresses
> > + * @iovec  Source qemu VA addresses
> > + * @num    Length of iovec and minimum length of addrs
> > + */
> > +static bool vhost_svq_translate_addr(const VhostShadowVirtqueue *svq,
> > +                                     void **addrs, const struct iovec *iovec,
> > +                                     size_t num)
> > +{
> > +    if (num == 0) {
> > +        return true;
> > +    }
> > +
> > +    for (size_t i = 0; i < num; ++i) {
> > +        DMAMap needle = {
> > +            .translated_addr = (hwaddr)iovec[i].iov_base,
> > +            .size = iovec[i].iov_len,
> > +        };
> > +        size_t off;
> > +
> > +        const DMAMap *map = vhost_iova_tree_find_iova(svq->iova_tree,
> > +                                                      &needle);
> > +        /*
> > +         * Map cannot be NULL since iova map contains all guest space and
> > +         * qemu already has a physical address mapped
> > +         */
> > +        if (unlikely(!map)) {
> > +            qemu_log_mask(LOG_GUEST_ERROR,
> > +                          "Invalid address 0x%"HWADDR_PRIx" given by guest",
> > +                          needle.translated_addr);
> > +            return false;
> > +        }
> > +
> > +        off = needle.translated_addr - map->translated_addr;
> > +        addrs[i] = (void *)(map->iova + off);
> > +
> > +        if (unlikely(int128_gt(int128_add(needle.translated_addr,
> > +                                          iovec[i].iov_len),
> > +                               map->translated_addr + map->size))) {
> > +            qemu_log_mask(LOG_GUEST_ERROR,
> > +                          "Guest buffer expands over iova range");
> > +            return false;
> > +        }
> > +    }
> > +
> > +    return true;
> > +}
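For readers following the new function: the key step is rebasing the qemu
VA onto the containing map's IOVA once that map is found. A standalone
sketch of the offset math (the field names mirror the patch's DMAMap; the
concrete addresses are invented for illustration):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for one tree entry; fields mirror the patch's DMAMap. */
typedef struct {
    uint64_t iova;             /* base of the map in the device IOVA space */
    uint64_t translated_addr;  /* matching qemu virtual address */
    uint64_t size;             /* inclusive: last valid offset in the map */
} ExampleMap;

int main(void)
{
    ExampleMap map = {
        .iova = 0x1000,
        .translated_addr = 0x7f0000,
        .size = 0xfff,                        /* one 4 KiB map, inclusive size */
    };
    uint64_t vaddr = 0x7f0100;                /* qemu VA inside the map */

    uint64_t off = vaddr - map.translated_addr;   /* 0x100 */
    uint64_t iova = map.iova + off;               /* 0x1100 */

    printf("VA 0x%" PRIx64 " -> IOVA 0x%" PRIx64 "\n", vaddr, iova);
    return 0;
}

Compiled alone it prints "VA 0x7f0100 -> IOVA 0x1100": the address keeps
its offset within the mapping, only the base changes.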
> > +
> >   static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
> > +                                    void * const *vaddr_sg,
>
>
> Nit: it looks to me we are not passing vaddr but iova here, so it might
> be better to use "sg"?
>

Sure, this is a leftover from before the IOVA translations. I agree it's
better to write it as you say.

> >                                       const struct iovec *iovec,
> >                                       size_t num, bool more_descs, bool write)
> >   {
> > @@ -103,7 +155,7 @@ static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
> >           } else {
> >               descs[i].flags = flags;
> >           }
> > -        descs[i].addr = cpu_to_le64((hwaddr)iovec[n].iov_base);
> > +        descs[i].addr = cpu_to_le64((hwaddr)vaddr_sg[n]);
> >           descs[i].len = cpu_to_le32(iovec[n].iov_len);
> >
> >           last = i;
> > @@ -119,6 +171,8 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
> >   {
> >       unsigned avail_idx;
> >       vring_avail_t *avail = svq->vring.avail;
> > +    bool ok;
> > +    g_autofree void **sgs = g_new(void *, MAX(elem->out_num, elem->in_num));
> >
> >       *head = svq->free_head;
> >
> > @@ -129,9 +183,20 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
> >           return false;
> >       }
> >
> > -    vhost_vring_write_descs(svq, elem->out_sg, elem->out_num,
> > +    ok = vhost_svq_translate_addr(svq, sgs, elem->out_sg, elem->out_num);
> > +    if (unlikely(!ok)) {
> > +        return false;
> > +    }
> > +    vhost_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
> >                               elem->in_num > 0, false);
> > -    vhost_vring_write_descs(svq, elem->in_sg, elem->in_num, false, true);
> > +
> > +
> > +    ok = vhost_svq_translate_addr(svq, sgs, elem->in_sg, elem->in_num);
> > +    if (unlikely(!ok)) {
> > +        return false;
> > +    }
> > +
> > +    vhost_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false, true);
> >
> >       /*
> >        * Put the entry in the available array (but don't update avail->idx until
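A side note on the sgs scratch array above: it is used for the out chain
first and reused for the in chain second, so MAX(elem->out_num,
elem->in_num) slots suffice. A minimal standalone glib sketch of that
sizing choice (illustrative only, with invented chain lengths):

#include <glib.h>

int main(void)
{
    gsize out_num = 3, in_num = 5;

    /* One scratch array serves both translations in turn, so it only
     * needs to hold the larger of the two chains. */
    g_autofree void **sgs = g_new(void *, MAX(out_num, in_num));

    sgs[4] = NULL;   /* last slot of the larger (in) chain is valid */
    return 0;        /* g_autofree releases sgs on scope exit */
}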
> > @@ -514,11 +579,13 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
> >    * Creates vhost shadow virtqueue, and instructs the vhost device to use the
> >    * shadow methods and file descriptors.
> >    *
> > + * @iova_tree Tree to perform descriptor translations
> > + *
> >    * Returns the new virtqueue or NULL.
> >    *
> >    * In case of error, reason is reported through error_report.
> >    */
> > -VhostShadowVirtqueue *vhost_svq_new(void)
> > +VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
> >   {
> >       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> >       int r;
> > @@ -539,6 +606,7 @@ VhostShadowVirtqueue *vhost_svq_new(void)
> >
> >       event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
> >       event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
> > +    svq->iova_tree = iova_tree;
> >       return g_steal_pointer(&svq);
> >
> >   err_init_hdev_call:
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 435b9c2e9e..56f9f125cd 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -209,6 +209,21 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                            vaddr, section->readonly);
> >
> >       llsize = int128_sub(llend, int128_make64(iova));
> > +    if (v->shadow_vqs_enabled) {
> > +        DMAMap mem_region = {
> > +            .translated_addr = (hwaddr)vaddr,
> > +            .size = int128_get64(llsize) - 1,
> > +            .perm = IOMMU_ACCESS_FLAG(true, section->readonly),
> > +        };
> > +
> > +        int r = vhost_iova_tree_map_alloc(v->iova_tree, &mem_region);
> > +        if (unlikely(r != IOVA_OK)) {
> > +            error_report("Can't allocate a mapping (%d)", r);
> > +            goto fail;
> > +        }
> > +
> > +        iova = mem_region.iova;
> > +    }
> >
> >       vhost_vdpa_iotlb_batch_begin_once(v);
> >       ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> > @@ -261,6 +276,20 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >
> >       llsize = int128_sub(llend, int128_make64(iova));
> >
> > +    if (v->shadow_vqs_enabled) {
> > +        const DMAMap *result;
> > +        const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> > +            section->offset_within_region +
> > +            (iova - section->offset_within_address_space);
> > +        DMAMap mem_region = {
> > +            .translated_addr = (hwaddr)vaddr,
> > +            .size = int128_get64(llsize) - 1,
> > +        };
> > +
> > +        result = vhost_iova_tree_find_iova(v->iova_tree, &mem_region);
> > +        iova = result->iova;
> > +        vhost_iova_tree_remove(v->iova_tree, &mem_region);
> > +    }
> >       vhost_vdpa_iotlb_batch_begin_once(v);
> >       ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> >       if (ret) {
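To make the two listener hunks concrete: with shadow_vqs_enabled, the
device is no longer handed guest physical addresses. region_add allocates
an IOVA from the tree (which records the IOVA -> qemu VA relation), and
region_del looks that entry up again to remove it. A toy standalone
sketch, with a trivial bump allocator standing in for
vhost_iova_tree_map_alloc and all addresses invented:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for vhost_iova_tree_map_alloc: hand out IOVA linearly.
 * The real allocator searches the tree for a free hole instead. */
static uint64_t next_iova = 0x1000;

static uint64_t toy_iova_alloc(uint64_t size)
{
    uint64_t iova = next_iova;
    next_iova += size;
    return iova;
}

int main(void)
{
    uint64_t gpa  = 0x40000000;      /* guest physical address (invented) */
    uint64_t hva  = 0x7f1200000000;  /* qemu VA backing that RAM (invented) */
    uint64_t size = 0x200000;

    /* Without SVQ the device would be given the GPA; with SVQ it gets a
     * freshly allocated IOVA, and the tree remembers IOVA -> HVA so
     * descriptors can be translated later. */
    uint64_t iova = toy_iova_alloc(size);
    printf("GPA 0x%" PRIx64 " (HVA 0x%" PRIx64 ") mapped at IOVA 0x%" PRIx64 "\n",
           gpa, hva, iova);
    return 0;
}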
> > @@ -383,7 +412,7 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >
> >       shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> >       for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > -        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new();
> > +        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new(v->iova_tree);
> >
> >           if (unlikely(!svq)) {
> >               error_setg(errp, "Cannot create svq %u", n);
> > @@ -834,37 +863,78 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
> >   /**
> >    * Unmap a SVQ area in the device
> >    */
> > -static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr iova,
> > -                                      hwaddr size)
> > +static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
> > +                                      const DMAMap *needle)
> >   {
> > +    const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, needle);
> > +    hwaddr size;
> >       int r;
> >
> > -    size = ROUND_UP(size, qemu_real_host_page_size);
> > -    r = vhost_vdpa_dma_unmap(v, iova, size);
> > +    if (unlikely(!result)) {
> > +        error_report("Unable to find SVQ address to unmap");
> > +        return false;
> > +    }
> > +
> > +    size = ROUND_UP(result->size, qemu_real_host_page_size);
> > +    r = vhost_vdpa_dma_unmap(v, result->iova, size);
> >       return r == 0;
> >   }
> >
> >   static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
> >                                          const VhostShadowVirtqueue *svq)
> >   {
> > +    DMAMap needle;
> >       struct vhost_vdpa *v = dev->opaque;
> >       struct vhost_vring_addr svq_addr;
> > -    size_t device_size = vhost_svq_device_area_size(svq);
> > -    size_t driver_size = vhost_svq_driver_area_size(svq);
> >       bool ok;
> >
> >       vhost_svq_get_vring_addr(svq, &svq_addr);
> >
> > -    ok = vhost_vdpa_svq_unmap_ring(v, svq_addr.desc_user_addr, driver_size);
> > +    needle = (DMAMap) {
> > +        .translated_addr = svq_addr.desc_user_addr,
> > +    };
>
>
> Let's simply initialize the member to zero at the start of this function,
> then we can use needle.translated_addr = XXX here.
>

Sure

>
> > +    ok = vhost_vdpa_svq_unmap_ring(v, &needle);
> >       if (unlikely(!ok)) {
> >           return false;
> >       }
> >
> > -    return vhost_vdpa_svq_unmap_ring(v, svq_addr.used_user_addr, device_size);
> > +    needle = (DMAMap) {
> > +        .translated_addr = svq_addr.used_user_addr,
> > +    };
> > +    return vhost_vdpa_svq_unmap_ring(v, &needle);
> > +}
> > +
> > +/**
> > + * Map the SVQ area in the device
> > + *
> > + * @v Vhost-vdpa device
> > + * @needle The area to search iova
> > + * @errp Error pointer
> > + */
> > +static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
> > +                                    Error **errp)
> > +{
> > +    int r;
> > +
> > +    r = vhost_iova_tree_map_alloc(v->iova_tree, needle);
> > +    if (unlikely(r != IOVA_OK)) {
> > +        error_setg(errp, "Cannot allocate iova (%d)", r);
> > +        return false;
> > +    }
> > +
> > +    r = vhost_vdpa_dma_map(v, needle->iova, needle->size,
> > +                           (void *)needle->translated_addr,
> > +                           !(needle->perm & IOMMU_ACCESS_FLAG(0, 1)));
>
>
> Let's simply use needle->perm == IOMMU_RO here?
>

The motivation for doing it this way is to be more resilient to future
changes, for example if a new flag is added.

But I'm totally fine with comparing against IOMMU_RO; I see that
scenario as unlikely at the moment.
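For reference, the two readonly checks discussed above agree on everything
except IOMMU_NONE, and they would also diverge for any future flag that
sets bits beyond the write bit, which is the resilience argument. A
standalone sketch that tabulates both (the enum values mirror QEMU's
IOMMUAccessFlags):

#include <stdbool.h>
#include <stdio.h>

/* Values mirror QEMU's IOMMUAccessFlags. */
enum { IOMMU_NONE = 0, IOMMU_RO = 1, IOMMU_WO = 2, IOMMU_RW = 3 };
#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))

int main(void)
{
    for (int perm = IOMMU_NONE; perm <= IOMMU_RW; perm++) {
        bool no_write_bit = !(perm & IOMMU_ACCESS_FLAG(0, 1)); /* the patch */
        bool exactly_ro   = perm == IOMMU_RO;                  /* the proposal */
        printf("perm=%d  !(perm & WO)=%d  perm == IOMMU_RO=%d\n",
               perm, no_write_bit, exactly_ro);
    }
    return 0;
}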
>
> > +    if (unlikely(r != 0)) {
> > +        error_setg_errno(errp, -r, "Cannot map region to device");
> > +        vhost_iova_tree_remove(v->iova_tree, needle);
> > +    }
> > +
> > +    return r == 0;
> >   }
> >
> >   /**
> > -  * Map shadow virtqueue rings in device
> > +  * Map the shadow virtqueue rings in the device
> >    *
> >    * @dev The vhost device
> >    * @svq The shadow virtqueue
> > @@ -876,28 +946,44 @@ static bool vhost_vdpa_svq_map_rings(struct vhost_dev *dev,
> >                                        struct vhost_vring_addr *addr,
> >                                        Error **errp)
> >   {
> > +    DMAMap device_region, driver_region;
> > +    struct vhost_vring_addr svq_addr;
> >       struct vhost_vdpa *v = dev->opaque;
> >       size_t device_size = vhost_svq_device_area_size(svq);
> >       size_t driver_size = vhost_svq_driver_area_size(svq);
> > -    int r;
> > +    size_t avail_offset;
> > +    bool ok;
> >
> >       ERRP_GUARD();
> > -    vhost_svq_get_vring_addr(svq, addr);
> > +    vhost_svq_get_vring_addr(svq, &svq_addr);
> >
> > -    r = vhost_vdpa_dma_map(v, addr->desc_user_addr, driver_size,
> > -                           (void *)addr->desc_user_addr, true);
> > -    if (unlikely(r != 0)) {
> > -        error_setg_errno(errp, -r, "Cannot create vq driver region: ");
> > +    driver_region = (DMAMap) {
> > +        .translated_addr = svq_addr.desc_user_addr,
> > +        .size = driver_size - 1,
>
>
> Any reason for the "-1" here? I see several places do things like that,
> it's probably a hint of a wrong API somewhere.
>

The "problem" is the API mismatch between _end and _last, i.e., whether
or not to include the last member in the size.

The IOVA tree needs to use _end so we can allocate the last page in case
the available range ends at (uint64_t)-1 [1].

But if we change vhost_svq_{device,driver}_area_size to make it
inclusive, we need to use "+1" in calls like qemu_memalign and memset
at vhost_svq_start, and probably in more places too.

QEMU's emulated Intel IOMMU code solves it by using the address mask as
the size, something that does not fit 100% with vhost devices, since
they can allocate an arbitrary address of arbitrary size when using a
vIOMMU. It's not a problem for vhost-vdpa at this moment, since we make
sure we expose aligned and whole pages with the vrings, but I feel it
would only move the problem somewhere else.

Thanks!

[1] There are alternatives: to use Int128, etc. But I think it's better
not to change that in this patch series.
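A standalone arithmetic sketch of the point above: with an exclusive end,
a range finishing at (uint64_t)-1 has no representable bound, while the
inclusive "last" convention behind the "-1" still works (the example
range is invented):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* A 4 KiB range at the very top of the 64-bit space. */
    uint64_t start = UINT64_MAX - 0xfff;
    uint64_t len   = 0x1000;

    uint64_t exclusive_end  = start + len;      /* wraps to 0: unusable bound */
    uint64_t inclusive_last = start + len - 1;  /* UINT64_MAX: representable */

    printf("start=0x%" PRIx64 "  end=0x%" PRIx64 "  last=0x%" PRIx64 "\n",
           start, exclusive_end, inclusive_last);
    return 0;
}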
> Thanks
>
>
> > +        .perm = IOMMU_RO,
> > +    };
> > +    ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);
> > +    if (unlikely(!ok)) {
> > +        error_prepend(errp, "Cannot create vq driver region: ");
> >           return false;
> >       }
> > +    addr->desc_user_addr = driver_region.iova;
> > +    avail_offset = svq_addr.avail_user_addr - svq_addr.desc_user_addr;
> > +    addr->avail_user_addr = driver_region.iova + avail_offset;
> >
> > -    r = vhost_vdpa_dma_map(v, addr->used_user_addr, device_size,
> > -                           (void *)addr->used_user_addr, false);
> > -    if (unlikely(r != 0)) {
> > -        error_setg_errno(errp, -r, "Cannot create vq device region: ");
> > +    device_region = (DMAMap) {
> > +        .translated_addr = svq_addr.used_user_addr,
> > +        .size = device_size - 1,
> > +        .perm = IOMMU_RW,
> > +    };
> > +    ok = vhost_vdpa_svq_map_ring(v, &device_region, errp);
> > +    if (unlikely(!ok)) {
> > +        error_prepend(errp, "Cannot create vq device region: ");
> > +        vhost_vdpa_svq_unmap_ring(v, &driver_region);
> >       }
> > +    addr->used_user_addr = device_region.iova;
> >
> > -    return r == 0;
> > +    return ok;
> >   }
> >
> >   static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
>
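Finally, a standalone sketch of the avail_offset computation in the last
hunk: desc and avail live in the same driver area, so remapping that area
to its newly allocated IOVA preserves only their relative offset (all
addresses are invented):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Example qemu VAs of the shadow vring: the avail ring sits at a
     * fixed offset inside the driver area. */
    uint64_t desc_user_addr  = 0x7f5500000000;
    uint64_t avail_user_addr = desc_user_addr + 0x100;
    uint64_t driver_iova     = 0x2000;   /* as returned by the IOVA tree */

    /* Only the relative offset survives the remapping. */
    uint64_t avail_offset = avail_user_addr - desc_user_addr;
    printf("device sees desc at 0x%" PRIx64 ", avail at 0x%" PRIx64 "\n",
           driver_iova, driver_iova + avail_offset);
    return 0;
}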