From: Jason Wang
Date: Fri, 11 Nov 2016 11:26:12 +0800
Subject: Re: [Qemu-devel] [PATCH V2 07/11] virtio-pci: address space translation service (ATS) support
To: "Michael S. Tsirkin"
Cc: peterx@redhat.com, wexu@redhat.com, qemu-devel@nongnu.org, vkaplans@redhat.com, pbonzini@redhat.com, cornelia.huck@de.ibm.com

On 2016-11-11 01:32, Michael S. Tsirkin wrote:
> On Fri, Nov 04, 2016 at 02:48:20PM +0800, Jason Wang wrote:
>>
>> On 2016-11-04 03:49, Michael S. Tsirkin wrote:
>>> On Thu, Nov 03, 2016 at 05:27:19PM +0800, Jason Wang wrote:
>>>>> This patches enable the Address Translation Service support for virtio
>>>>> pci devices.
>>>>> This is needed for a guest visible Device IOTLB
>>>>> implementation and will be required by vhost device IOTLB API
>>>>> implementation for intel IOMMU.
>>>>>
>>>>> Cc: Michael S. Tsirkin
>>>>> Signed-off-by: Jason Wang
>>> I'd like to understand why do you think this is strictly required.
>>> Won't setting CM bit in the IOMMU do the trick.
>> ATS was chosen for performance. Since there're many problems for CM:
>>
>> - CM was slow (10%-20% slower on real hardware for things like netperf)
>> because of each transition between non-present and present mapping needs an
>> explicit invalidation. It may slow down the whole VM.
>> - Without ATS/Device IOTLB, IOMMU becomes a bottleneck because of contending
>> of IOTLB entries. (What we can do in this case is in fact userspace IOTLB
>> snooping, this could be done even without CM).
>> It was natural to think of ATS when designing interface between IOMMU and
>> device/remote IOTLBs. Do you see any drawbacks on ATS here?
>>
>> Thanks
> In fact at this point I'm confused. Any mapping needs to be programmed
> in the IOMMU. We need to implement this correctly.
> Once we do why do we need ATS?
> I think what you need is map/unmap notifiers that Aviv is working on.
> No?

Let me clarify: the device IOTLB API can work without ATS or CM. So there
are three ways to do it:

1) Without ATS or CM support, the function can be implemented through:
1.1: asking qemu for help when there's an IOTLB miss in vhost
1.2: snooping the userspace IOTLB invalidations (present to non-present
mappings) and updating the device IOTLB
2) With CM enabled, the only thing we can add is snooping the
non-present to present mappings and updating the device IOTLB. This is not
a requirement, since we can still get this by asking qemu for help (1.1).
3) With ATS enabled, the guest knows about the existence of the device
IOTLB, and device IOTLB entries need to be flushed explicitly by the guest.
In this case there's no need to snoop the ordinary IOTLB invalidations of
1.2. We just need to snoop the device-IOTLB-specific invalidation requests
from the guest.

All three methods above work very well, but let's have a look at the
performance impact:

- Method 1 (without CM or ATS): the performance is not the best, since
the guest does not know about the existence of the remote IOTLB, which
means device IOTLB entries cannot be flushed on demand. One example: some
IOMMU drivers (e.g. intel) tend to optimize IOTLB invalidations by
issuing a global invalidation periodically. We need to flush the
device IOTLB too in this case, so we can notice some jitter (because
of IOTLB misses).
- Method 2 (with CM but without ATS) seems to be the worst case. It has
not only all the problems above but also a new one: each transition needs
to notify the device explicitly. Even if dpdk uses static mappings, all
other devices in the VM use dynamic ones, which slows down the whole
system. According to the test, CM is about 10%-20% slower on real hardware.
- Method 3 (ATS) gives the best performance. All the problems above are
gone, since the guest can flush device IOTLB entries on demand. It is
defined by the spec, was designed to solve exactly the kind of issues we
meet here, and is supported by modern IOMMUs.

What's even better, implementing ATS turns out to take less than 100 lines
of code, and it is much easier to enable on other IOMMUs (the AMD
IOMMU only needs 20 lines of code). All the other ways (I started on, and
have code for, method 1 for the intel IOMMU) need lots of work specific to
each kind of IOMMU.

Considering so many advantages from adding so few lines of code, I
don't see why we don't need ATS (for the IOMMUs that support it).

Thanks

>>> Also, could you remind me pls - can guests just disable ATS?
>>>
>>> What happens then?
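
A toy illustration of the jitter argument for method 1 versus method 3 in the reply above. The sketch models a device IOTLB as a plain dictionary and counts misses after the guest unmaps one page. Everything in it (the DeviceIOTLB class, the run helper, the four-page table) is hypothetical and made up for illustration; it is not qemu or vhost code, and it assumes the worst case for method 1, where the only event the device side sees is a global invalidation.

```python
class DeviceIOTLB:
    """Toy device-side translation cache (hypothetical, not qemu/vhost code)."""

    def __init__(self, translate):
        self.translate = translate  # stands in for "ask qemu" on a miss (1.1)
        self.entries = {}           # IOVA page -> translated page
        self.misses = 0

    def lookup(self, iova):
        # A miss costs a round trip to the translate callback.
        if iova not in self.entries:
            self.misses += 1
            self.entries[iova] = self.translate(iova)
        return self.entries[iova]

    def invalidate(self, iova):
        # Targeted flush: what an explicit device IOTLB invalidation allows.
        self.entries.pop(iova, None)

    def flush_all(self):
        # Full flush: the fallback when only a global invalidation is seen.
        self.entries.clear()


# Made-up guest mappings: four IOVA pages.
page_table = {iova: 0x1000 + iova for iova in range(4)}


def run(on_guest_unmap):
    """Warm the cache, let the guest unmap page 3, touch the other pages again."""
    tlb = DeviceIOTLB(page_table.__getitem__)
    for iova in range(4):
        tlb.lookup(iova)          # four cold misses
    on_guest_unmap(tlb, 3)        # how the device side reacts to the unmap
    for iova in range(3):
        tlb.lookup(iova)          # pages 0..2 are still mapped in the guest
    return tlb.misses


# Method 1, worst case: a global invalidation forces a full device IOTLB flush.
misses_global = run(lambda tlb, iova: tlb.flush_all())
# Method 3: the guest flushes only the affected device IOTLB entry.
misses_ats = run(lambda tlb, iova: tlb.invalidate(iova))
print(misses_global, misses_ats)
```

In this toy run the full flush ends with 7 misses and the targeted flush with 4; the three extra misses are the pages that were still mapped but had to be translated again.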