From: Alex Williamson
Date: Mon, 6 Jun 2016 11:30:24 -0600
Subject: Re: [Qemu-devel] [PATCH v3 3/3] IOMMU: Integrate between VFIO and vIOMMU to support device assignment
To: Peter Xu
Cc: "Aviv B.D", qemu-devel@nongnu.org, "Michael S. Tsirkin", Jan Kiszka
Message-ID: <20160606113024.350e3d85@ul30vt.home>
In-Reply-To: <20160606073825.GH21254@pxdev.xzpeter.org>
References: <1463847590-22782-1-git-send-email-bd.aviv@gmail.com>
 <1463847590-22782-4-git-send-email-bd.aviv@gmail.com>
 <20160523115342.636a5164@ul30vt.home>
 <20160606073825.GH21254@pxdev.xzpeter.org>

On Mon, 6 Jun 2016 15:38:25 +0800
Peter Xu wrote:

> Some questions not quite related to this patch content but vfio...
>
> On Mon, May 23, 2016 at 11:53:42AM -0600, Alex Williamson wrote:
> > On Sat, 21 May 2016 19:19:50 +0300
> > "Aviv B.D" wrote:
>
> [...]
>
> > > +#if 0
> > >  static hwaddr vfio_container_granularity(VFIOContainer *container)
> > >  {
> > >      return (hwaddr)1 << ctz64(container->iova_pgsizes);
> > >  }
> > > -
> > > +#endif
>
> Here we are fetching the smallest page size that host IOMMU support,
> so even if host IOMMU support large pages, it will not be used as long
> as guest enabled vIOMMU, right?

Not using this replay mechanism, correct.
AFAIK, this replay code has only been tested on POWER where the window
is much, much smaller than the 64bit address space and hugepages are
not supported.  A replay callback into the iommu could not only walk
the address space more efficiently, but also attempt to map with
hugepages.  It would however need to be cautious not to coalesce
separate mappings by the guest into a single mapping through vfio, or
else we're going to have inconsistency for mapping vs unmapping that
vfio does not expect or support.

>
> > >
> > Clearly this is unacceptable, the code has a purpose.
> >
> > >  static void vfio_listener_region_add(MemoryListener *listener,
> > >                                       MemoryRegionSection *section)
> > >  {
> > > @@ -384,11 +387,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >          giommu->n.notify = vfio_iommu_map_notify;
> > >          QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
> > >
> > > +        vtd_register_giommu(giommu);
> >
> > vfio will not assume VT-d, this is why we register the notifier below.
> >
> > >          memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> > > +#if 0
> > >          memory_region_iommu_replay(giommu->iommu, &giommu->n,
> > >                                     vfio_container_granularity(container),
> > >                                     false);
>
> For memory_region_iommu_replay(), we are using
> vfio_container_granularity() as the granularity, which is the host
> IOMMU page size. However inside it:
>
> void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
>                                 hwaddr granularity, bool is_write)
> {
>     hwaddr addr;
>     IOMMUTLBEntry iotlb;
>
>     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
>         iotlb = mr->iommu_ops->translate(mr, addr, is_write);
>         if (iotlb.perm != IOMMU_NONE) {
>             n->notify(n, &iotlb);
>         }
>
>         /* if (2^64 - MR size) < granularity, it's possible to get an
>          * infinite loop here.
>          * This should catch such a wraparound */
>         if ((addr + granularity) < addr) {
>             break;
>         }
>     }
> }
>
> Is it possible that iotlb mapped to a large page (or any page that is
> not the same as granularity)? The above code should have assumed that
> host/guest IOMMU are having the same page size == granularity?

I think this is answered above.  This is not remotely efficient code
for a real 64bit IOMMU (BTW, VT-d does not support the full 64bit
address space either, I believe it's more like 48bits) and is not going
to replay hugepages, but it will give us sufficiently correct IOMMU
entries... eventually.  Thanks,

Alex
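
For illustration only, here is a minimal sketch of the two replay styles
discussed in this thread. It is not QEMU code: ToyMapping,
ToyReplayStats, toy_count_notify, toy_granule_probes and
toy_replay_by_mapping are hypothetical names invented for the example,
and the guest mappings are assumed to be available as a plain array
rather than through a real IOMMU translate callback. The first function
counts the probes made by a fixed-granularity walk with the same
wraparound check as the quoted loop; the second replays one notification
per guest mapping and deliberately never merges neighbours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one guest IOMMU mapping; not a QEMU type. */
typedef struct {
    uint64_t iova;  /* start of the guest I/O virtual range */
    uint64_t size;  /* length in bytes, exactly as the guest mapped it */
} ToyMapping;

/* Hypothetical accumulator used by the counting callback below. */
typedef struct {
    size_t calls;    /* number of notifications delivered */
    uint64_t bytes;  /* total bytes covered by those notifications */
} ToyReplayStats;

typedef void (*toy_notify_fn)(const ToyMapping *m, void *opaque);

/* Example callback: record that one mapping was replayed. */
static void toy_count_notify(const ToyMapping *m, void *opaque)
{
    ToyReplayStats *st = opaque;

    st->calls++;
    st->bytes += m->size;
}

/*
 * Count the probes a fixed-granularity walk performs, using the same
 * loop shape and wraparound check as the quoted replay function.
 */
static uint64_t toy_granule_probes(uint64_t region_size, uint64_t granularity)
{
    uint64_t addr, probes = 0;

    assert(granularity != 0);
    for (addr = 0; addr < region_size; addr += granularity) {
        probes++;
        /* Unsigned overflow wraps to a smaller value; stop, don't loop. */
        if ((addr + granularity) < addr) {
            break;
        }
    }
    return probes;
}

/*
 * Mapping-driven replay: one notification per guest mapping. Adjacent
 * entries are deliberately NOT merged, so a later unmap of either one
 * still matches a mapping that was made with the same iova and size.
 */
static size_t toy_replay_by_mapping(const ToyMapping *maps, size_t n,
                                    toy_notify_fn notify, void *opaque)
{
    size_t i, sent = 0;

    for (i = 0; i < n; i++) {
        if (maps[i].size == 0) {
            continue;  /* nothing mapped, nothing to replay */
        }
        notify(&maps[i], opaque);
        sent++;
    }
    return sent;
}
```

With two contiguous 4 KiB mappings at 0x1000 and 0x2000,
toy_replay_by_mapping() delivers two notifications rather than one 8 KiB
notification, which is the "do not coalesce" constraint above. For
scale, a fixed 4 KiB walk over a 48-bit address space would make 2^36
probes, most of them over unmapped ranges.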