QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"tn@semihalf.com" <tn@semihalf.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"bharat.bhushan@nxp.com" <bharat.bhushan@nxp.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>
Subject: Re: [Qemu-devel] [PATCH for-4.2 v10 08/15] virtio-iommu: Implement map/unmap
Date: Wed, 4 Sep 2019 16:32:39 +0800
Message-ID: <20190904083239.GI30402@xz-x1> (raw)
In-Reply-To: <c8b8e80d-0b66-1a7c-28b8-0bd3b013a9e8@redhat.com>

On Wed, Sep 04, 2019 at 09:54:12AM +0200, Auger Eric wrote:
> Hi,
> 
> On 9/4/19 7:46 AM, Tian, Kevin wrote:
> >> From: Peter Xu [mailto:peterx@redhat.com]
> >> Sent: Wednesday, September 4, 2019 1:37 PM
> >>
> >> On Wed, Sep 04, 2019 at 04:23:50AM +0000, Tian, Kevin wrote:
> >>>> From: Peter Xu [mailto:peterx@redhat.com]
> >>>> Sent: Wednesday, September 4, 2019 9:44 AM
> >>>>
> >>>> On Tue, Sep 03, 2019 at 01:37:11PM +0200, Auger Eric wrote:
> >>>>> Hi Peter,
> >>>>>
> >>>>> On 8/19/19 10:11 AM, Peter Xu wrote:
> >>>>>> On Tue, Jul 30, 2019 at 07:21:30PM +0200, Eric Auger wrote:
> >>>>>>
> >>>>>> [...]
> >>>>>>
> >>>>>>> +    mapping = g_tree_lookup(domain->mappings,
> >> (gpointer)(&interval));
> >>>>>>> +
> >>>>>>> +    while (mapping) {
> >>>>>>> +        viommu_interval current;
> >>>>>>> +        uint64_t low  = mapping->virt_addr;
> >>>>>>> +        uint64_t high = mapping->virt_addr + mapping->size - 1;
> >>>>>>> +
> >>>>>>> +        current.low = low;
> >>>>>>> +        current.high = high;
> >>>>>>> +
> >>>>>>> +        if (low == interval.low && size >= mapping->size) {
> >>>>>>> +            g_tree_remove(domain->mappings, (gpointer)(&current));
> >>>>>>> +            interval.low = high + 1;
> >>>>>>> +            trace_virtio_iommu_unmap_left_interval(current.low,
> >>>> current.high,
> >>>>>>> +                interval.low, interval.high);
> >>>>>>> +        } else if (high == interval.high && size >= mapping->size) {
> >>>>>>> +            trace_virtio_iommu_unmap_right_interval(current.low,
> >>>> current.high,
> >>>>>>> +                interval.low, interval.high);
> >>>>>>> +            g_tree_remove(domain->mappings, (gpointer)(&current));
> >>>>>>> +            interval.high = low - 1;
> >>>>>>> +        } else if (low > interval.low && high < interval.high) {
> >>>>>>> +            trace_virtio_iommu_unmap_inc_interval(current.low,
> >>>> current.high);
> >>>>>>> +            g_tree_remove(domain->mappings, (gpointer)(&current));
> >>>>>>> +        } else {
> >>>>>>> +            break;
> >>>>>>> +        }
> >>>>>>> +        if (interval.low >= interval.high) {
> >>>>>>> +            return VIRTIO_IOMMU_S_OK;
> >>>>>>> +        } else {
> >>>>>>> +            mapping = g_tree_lookup(domain->mappings,
> >>>> (gpointer)(&interval));
> >>>>>>> +        }
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if (mapping) {
> >>>>>>> +        qemu_log_mask(LOG_GUEST_ERROR,
> >>>>>>> +                      "****** %s: Unmap 0x%"PRIx64" size=0x%"PRIx64
> >>>>>>> +                     " from 0x%"PRIx64" size=0x%"PRIx64" is not supported\n",
> >>>>>>> +                     __func__, interval.low, size,
> >>>>>>> +                     mapping->virt_addr, mapping->size);
> >>>>>>> +    } else {
> >>>>>>> +        return VIRTIO_IOMMU_S_OK;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    return VIRTIO_IOMMU_S_INVAL;
> >>>>>>
> >>>>>> Could the above chunk be simplified as something like below?
> >>>>>>
> >>>>>>   while ((mapping = g_tree_lookup(domain->mappings, &interval))) {
> >>>>>>     g_tree_remove(domain->mappings, mapping);
> >>>>>>   }
> >>>>> Indeed the code could be simplified. I only need to make sure I don't
> >>>>> split an existing mapping.
> >>>>
> >>>> Hmm... Do we need to still split an existing mapping if necessary?
> >>>> For example when with this mapping:
> >>>>
> >>>>   iova=0x1000, size=0x2000, phys=ADDR1, flags=FLAGS1
> >>>>
> >>>> And if we want to unmap the range (iova=0, size=0x2000), then we
> >>>> should split the existing mappping and leave this one:
> >>>>
> >>>>   iova=0x2000, size=0x1000, phys=(ADDR1+0x1000), flags=FLAGS1
> >>>>
> >>>> Right?
> >>>>
> >>>
> >>> virtio-iommu spec explicitly disallows partial unmap.
> >>>
> >>> 5.11.6.6.1 Driver Requirements: UNMAP request
> >>>
> >>> The first address of a range MUST either be the first address of a
> >>> mapping or be outside any mapping. The last address of a range
> >>> MUST either be the last address of a mapping or be outside any
> >>> mapping.
> >>>
> >>> 5.11.6.6.2 Device Requirements: UNMAP request
> >>>
> >>> If a mapping affected by the range is not covered in its entirety
> >>> by the range (the UNMAP request would split the mapping),
> >>> then the device SHOULD set the request status to VIRTIO_IOMMU
> >>> _S_RANGE, and SHOULD NOT remove any mapping.
> >>
> >> I see, thanks Kevin.
> >>
> >> Though why so strict?  (Sorry if I missed some discussions
> >> ... pointers welcomed...)
> >>
> >> What I'm thinking is when we want to allocate a bunch of buffers
> >> (e.g., 1M) while we will also need to be able to free them with
> >> smaller chunks (e.g., 4K), then it would be even better that we allow
> >> to allocate a whole 1M buffer within the guest and map it as a whole,
> >> then we can selectively unmap the pages after used.  If with the
> >> strict rule, we'll need to map one by one, that can be a total of
> >> 1M/4K roundtrips.
> >>
> > 
> > Sorry I forgot the original discussion. Need Jean to respond. :-)
> > 
> > A possible reason is that no such usage exists today, thus simplification
> > was made? 
> 
> In
> https://virtualization.linux-foundation.narkive.com/q6XOkO76/rfc-0-3-virtio-iommu-a-paravirtualized-iommu
> 
> I found
> 
> "
> (Note: the semantics of unmap are chosen to be compatible with VFIO's
> type1 v2 IOMMU API. This way a device serving as intermediary between
> guest and VFIO doesn't have to keep an internal tree of mappings. They are
> a bit tighter than VFIO, in that they don't allow unmap spilling outside
> mapped regions. Spilling is 'undefined' at the moment, because it should
> work in most cases but I don't know if it's worth the added complexity in
> devices that are not simply transmitting requests to VFIO. Splitting
> mappings won't ever be allowed, but see the relaxed proposal in 3/3 for
> more lenient semantics)
> "

Yes it makes sense to follow vfio type1v2 here.  Though I'm not sure
whether the maintainance of "an internal tree of mappings" could be
avoided by this, at least if using current QEMU IOMMU notifier
framework.  The problem is currently the IOMMU notifiers cannot fail
(e.g., the VFIO_IOMMU_MAP_DMA ioctl from vfio-pci device will assume
the messages delivered from vIOMMUs are always valid), so AFAICT the
vIOMMU needs to tell which mapping request from the guest driver is
valid before delivering the request to vfio.  It seems impossible to
do that if without the internal tree of mapping.  But that seems to be
another story.

In all cases, I think I'm fine with the approach for this patch.

Thanks!

-- 
Peter Xu


  reply index

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-30 17:21 [Qemu-devel] [PATCH for-4.2 v10 00/15] VIRTIO-IOMMU device Eric Auger
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 01/15] update-linux-headers: Import virtio_iommu.h Eric Auger
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 02/15] linux-headers: update against 5.3-rc2 Eric Auger
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 03/15] virtio-iommu: Add skeleton Eric Auger
2019-08-15 13:54   ` Peter Xu
2019-08-29 12:18     ` Auger Eric
2019-08-30  1:26       ` Peter Xu
2019-08-30  8:12         ` Auger Eric
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 04/15] virtio-iommu: Decode the command payload Eric Auger
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 05/15] virtio-iommu: Add the iommu regions Eric Auger
2019-08-16  4:00   ` Peter Xu
2019-08-29 12:51     ` Auger Eric
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 06/15] virtio-iommu: Endpoint and domains structs and helpers Eric Auger
2019-08-16  4:17   ` Peter Xu
2019-11-04 18:31   ` Jean-Philippe Brucker
2019-11-25 13:14     ` Auger Eric
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 07/15] virtio-iommu: Implement attach/detach command Eric Auger
2019-08-16  4:27   ` Peter Xu
2019-08-29 14:24     ` Auger Eric
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 08/15] virtio-iommu: Implement map/unmap Eric Auger
2019-08-19  8:11   ` Peter Xu
2019-09-03 11:37     ` Auger Eric
2019-09-04  1:44       ` Peter Xu
2019-09-04  4:23         ` Tian, Kevin
2019-09-04  5:37           ` Peter Xu
2019-09-04  5:46             ` Tian, Kevin
2019-09-04  7:54               ` Auger Eric
2019-09-04  8:32                 ` Peter Xu [this message]
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 09/15] virtio-iommu: Implement translate Eric Auger
2019-08-19  8:24   ` Peter Xu
2019-09-03 11:45     ` Auger Eric
2019-09-04  1:58       ` Peter Xu
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request Eric Auger
2019-08-19 12:08   ` Peter Xu
2019-09-03 12:23     ` Auger Eric
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 11/15] virtio-iommu: Expose the IOAPIC MSI reserved region when relevant Eric Auger
2019-07-30 19:38   ` Michael S. Tsirkin
2019-07-30 23:20     ` Tian, Kevin
2019-07-31  9:05       ` Auger Eric
2019-07-31 19:25       ` Michael S. Tsirkin
2019-07-31 19:44         ` Auger Eric
2019-07-31 23:23           ` Tian, Kevin
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 12/15] virtio-iommu: Implement fault reporting Eric Auger
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 13/15] virtio_iommu: Handle reserved regions in translation process Eric Auger
2019-08-19 12:44   ` Peter Xu
2019-09-01  6:38   ` Michael S. Tsirkin
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 14/15] virtio-iommu-pci: Add virtio iommu pci support Eric Auger
2019-07-30 19:35   ` Michael S. Tsirkin
2019-08-01 12:15     ` Auger Eric
2019-08-01 13:06       ` Michael S. Tsirkin
2019-08-01 13:49         ` Auger Eric
2019-09-01  6:40           ` Michael S. Tsirkin
2019-09-04 14:19             ` Auger Eric
2019-09-04 21:36               ` Michael S. Tsirkin
2019-07-30 17:21 ` [Qemu-devel] [PATCH for-4.2 v10 15/15] hw/arm/virt: Add the virtio-iommu device tree mappings Eric Auger

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190904083239.GI30402@xz-x1 \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bharat.bhushan@nxp.com \
    --cc=eric.auger.pro@gmail.com \
    --cc=eric.auger@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=kevin.tian@intel.com \
    --cc=mst@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=tn@semihalf.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git