From: Jason Wang
Date: Mon, 16 Jan 2017 16:03:22 +0800
Subject: Re: [Qemu-devel] [PATCH RFC v3 11/14] intel_iommu: provide its own replay() callback
To: Peter Xu
Cc: qemu-devel@nongnu.org, tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, alex.williamson@redhat.com, bd.aviv@gmail.com

On 2017-01-16 15:59, Peter Xu wrote:
> On Mon, Jan 16, 2017 at 03:47:08PM +0800, Jason Wang wrote:
>>
>> On 2017-01-16 15:31, Peter Xu wrote:
>>> On Fri, Jan 13, 2017 at 05:26:06PM +0800, Jason Wang wrote:
>>>> On 2017-01-13 11:06, Peter Xu wrote:
>>>>> The default replay() doesn't work for VT-d, since VT-d has a huge
>>>>> default memory region which covers the address range 0-(2^64-1).
>>>>> This will normally look like a dead loop when the guest starts.
>>>> I think it just takes too much time rather than being a dead loop?
>>> Hmm, I can touch up the commit message above to make it more precise.
>>>
>>>>> The solution is simple - we don't walk over all the regions.
>>>>> Instead, we jump over a region when we find that its page
>>>>> directories are empty. That greatly reduces the time needed to walk
>>>>> the whole range.
>>>> Yes. The problem is that memory_region_iommu_replay() is not smart,
>>>> because:
>>>>
>>>> - it doesn't understand large pages
>>>> - it tries to go over every possible IOVA
>>>>
>>>> So I'm thinking of introducing something like
>>>> iommu_ops->iova_iterate(), which
>>>>
>>>> 1) accepts a start IOVA and returns the next existing mapping
>>>> 2) understands large pages
>>>> 3) skips unmapped IOVA ranges
>>> I haven't tested with huge pages yet, but shouldn't this patch solve
>>> both of the issues above? I don't know whether you went over the page
>>> walk logic - it supports huge pages, and it skips unmapped IOVA
>>> ranges (at least that's my goal with this patch). In that case, it
>>> looks like this patch is solving the same problem? :)
>>> (though without introducing an iova_iterate() interface)
>>>
>>> Please correct me if I misunderstood it.
>> Kind of :) I'm fine with this patch, but I just want to:
>>
>> - reuse most of the code in the patch
>> - keep the current memory_region_iommu_replay() logic
>>
>> So what I'm suggesting is just a slight change of API that lets the
>> caller decide what to do with each range of IOVA. That way it could be
>> reused for things other than replaying.
>>
>> But if you'd like to keep this patch as is, I don't object to it.
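[A minimal sketch of the iova_iterate() idea above, for illustration
only - nothing here is from the patch, and names such as IOMMUIovaRange
and replay_all are hypothetical; QEMU's real IOMMUTLBEntry and
MemoryRegionIOMMUOps differ in detail.]

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* One existing mapping returned by the iterator. */
typedef struct IOMMUIovaRange {
    hwaddr iova;        /* start of the mapping */
    hwaddr size;        /* real page size: 4K, 2M or 1G */
    hwaddr translated;  /* translated output address */
    bool writable;
} IOMMUIovaRange;

/*
 * Fills *range with the next existing mapping at or above start and
 * returns true; skips unmapped holes entirely and reports the real
 * page size, so a large page comes back as a single range. Returns
 * false once no mapping remains.
 */
typedef bool (*iova_iterate_fn)(void *iommu, hwaddr start,
                                IOMMUIovaRange *range);

/* Example caller: visit every mapping without touching unmapped IOVA. */
static void replay_all(void *iommu, iova_iterate_fn iter)
{
    IOMMUIovaRange r;
    hwaddr next = 0;

    while (iter(iommu, next, &r)) {
        /* ...notify the listener about r here... */
        next = r.iova + r.size; /* jump straight past the mapping */
    }
}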
> I see. Then I understand what you mean here. I had the same thought
> before - that's why I exposed vtd_page_walk with a hook. If you check
> the page walk function's comment:
>
> /**
>  * vtd_page_walk - walk a specific IOVA range and call the hook
>  *
>  * @ce: context entry to walk upon
>  * @start: IOVA address to start the walk
>  * @end: IOVA range end address (start <= addr < end)
>  * @hook_fn: the hook to be called for each detected area
>  * @private: private data for the hook function
>  */
>
> So I didn't implement the notification in the page walk at all - it
> lives in the hook_fn. Any caller interested in doing something other
> than notification can simply have us export the page walk interface
> and provide their own hook_fn; it'll then be triggered for each valid
> page (no matter whether it's a huge or a small one).
>
> If we want a more general interface in the future - whether we call it
> iova_iterate() or something else (I'd prefer the hook approach, so
> maybe a common page walker with a hook function) - we can easily build
> it (at least for the Intel platform) on top of this vtd_page_walk.
>
> Thanks,
>
> -- peterx

Yes, but the problem is that hook_fn is only visible inside the Intel
IOMMU code.

Thanks
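[To make the visibility point concrete, here is a rough sketch of what
exporting the walker could look like - hypothetical names only, loosely
modelled on the vtd_page_walk comment quoted above; the real code in
intel_iommu.c uses QEMU's own types such as IOMMUTLBEntry and the
context entry structure.]

#include <stdint.h>

typedef uint64_t hwaddr;

/* What the walker knows about each valid page it detects. */
typedef struct VTDMappedPage {
    hwaddr iova;        /* guest IOVA of the page */
    hwaddr translated;  /* translated output address */
    hwaddr size;        /* 4K, 2M or 1G: a huge page is one call */
    int perm;           /* access permission bits */
} VTDMappedPage;

/* Caller-supplied hook, invoked once per detected mapping. */
typedef int (*vtd_page_walk_hook)(const VTDMappedPage *page,
                                  void *private);

/*
 * Hypothetical exported prototype: walk [start, end) under context
 * entry ce, skip empty page directories, and call hook_fn for each
 * existing mapping. Exporting this - rather than keeping hook_fn
 * private to intel_iommu.c - is what would let callers outside the
 * file reuse the walk for things other than replay notification.
 */
int vtd_page_walk(void *ce, hwaddr start, hwaddr end,
                  vtd_page_walk_hook hook_fn, void *private);

/* Example hook: count mappings instead of emitting MAP notifications. */
static int count_hook(const VTDMappedPage *page, void *private)
{
    uint64_t *count = private;
    (*count)++;
    return 0;
}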