From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 31 Mar 2017 13:01:01 +0800
From: Peter Xu
Message-ID: <20170331050101.GF3981@pxdev.xzpeter.org>
References: <1486456099-7345-1-git-send-email-peterx@redhat.com>
 <1486456099-7345-15-git-send-email-peterx@redhat.com>
 <20170327091208.GG11497@pxdev.xzpeter.org>
 <9c3cda64-b4a3-6f9b-f951-bf73f6613faa@redhat.com>
 <20170331025618.GB3981@pxdev.xzpeter.org>
 <23619b9a-b671-75ff-ffc5-01a61ea9d8c5@redhat.com>
In-Reply-To: <23619b9a-b671-75ff-ffc5-01a61ea9d8c5@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback
To: Jason Wang
Cc: "Liu, Yi L", "'alex.williamson@redhat.com'", "Lan, Tianyu",
 "Tian, Kevin", "'mst@redhat.com'", "'jan.kiszka@siemens.com'",
 "'bd.aviv@gmail.com'", 'David Gibson', "'qemu-devel@nongnu.org'"

On Fri, Mar 31, 2017 at 12:21:23PM +0800, Jason Wang wrote:
>
>
> On 2017-03-31 10:56, Peter Xu wrote:
> >>>Just come to mind that there may be a corner case here.
> >>>
> >>>Intel VT-d actually has a "pt" mode which allows device use physical address
> >>>even when VT-d is enabled. In kernel, there is a iommu_identity_mapping.
> >>>If a device is in this map, then it would use "pt" mode. So that IOMMU driver
> >>>would not build second-level page table for it.
> >>Yes, but qemu does not support ECAP_PT now, so guest will still have a page
> >>table in this case.
> >>
> >>>Back to the virtual IOVA implementation, if an assigned device is in the
> >>>iommu_identity_mapping(e.g. VGA controller), it uses GPA directly to do DMA.
> >>>So it demands a GPA->HPA mapping in host. However, the iommu->ops.replay
> >>>is not able to build it when guest SL page table is empty.
> >>>
> >>>So I think building an entire guest PA->HPA mapping before guest kernel boot
> >>>would be recommended. Any thoughts?
> >>We plan to add PT in 2.10, a possible rough idea is disabled iommu dmar
> >>region and use another region without iommu_ops. Then
> >>vfio_listener_region_add() will just do the correct mappings.
> >Even without any new region. With the patch 16/17 ("intel_iommu: allow
> >dynamic switch of IOMMU region"), we can just turn the IOMMU region
> >on/off, following the device's PT bit, maybe using the new
> >vtd_switch_address_space() interface. That should be enough.
>
> Right. For vhost it was probably need more works, e.g setting up static
> mappings during region_add().

Do we need to?

VFIO will need it for building up the shadow page table, even without a
vIOMMU. But imho that should not be needed by vhost, right?

--
peterx