From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50359) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c52qA-0005bY-3Q for qemu-devel@nongnu.org; Thu, 10 Nov 2016 22:49:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c52q5-0007uu-8b for qemu-devel@nongnu.org; Thu, 10 Nov 2016 22:49:22 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45100) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c52q5-0007uk-0M for qemu-devel@nongnu.org; Thu, 10 Nov 2016 22:49:17 -0500
Date: Fri, 11 Nov 2016 05:49:14 +0200
From: "Michael S. Tsirkin"
Message-ID: <20161111054348-mutt-send-email-mst@kernel.org>
References: <1478165243-4767-1-git-send-email-jasowang@redhat.com> <1478165243-4767-8-git-send-email-jasowang@redhat.com> <20161103214712-mutt-send-email-mst@kernel.org> <7157aba3-bd5d-c335-78a7-2f79ee576f8d@redhat.com> <20161110192959-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH V2 07/11] virtio-pci: address space translation service (ATS) support
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Jason Wang
Cc: peterx@redhat.com, wexu@redhat.com, qemu-devel@nongnu.org, vkaplans@redhat.com, pbonzini@redhat.com, cornelia.huck@de.ibm.com

On Fri, Nov 11, 2016 at 11:26:12AM +0800, Jason Wang wrote:
>
>
> On 2016年11月11日 01:32, Michael S. Tsirkin wrote:
> > On Fri, Nov 04, 2016 at 02:48:20PM +0800, Jason Wang wrote:
> > >
> > > On 2016年11月04日 03:49, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 03, 2016 at 05:27:19PM +0800, Jason Wang wrote:
> > > > > > This patches enable the Address Translation Service support for virtio
> > > > > > pci devices.
> > > > > > This is needed for a guest visible Device IOTLB
> > > > > > implementation and will be required by vhost device IOTLB API
> > > > > > implementation for intel IOMMU.
> > > > > >
> > > > > > Cc: Michael S. Tsirkin
> > > > > > Signed-off-by: Jason Wang
> > > > I'd like to understand why do you think this is strictly required.
> > > > Won't setting CM bit in the IOMMU do the trick.
> > > ATS was chosen for performance. Since there're many problems for CM:
> > >
> > > - CM was slow (10%-20% slower on real hardware for things like netperf)
> > > because of each transition between non-present and present mapping needs an
> > > explicit invalidation. It may slow down the whole VM.
> > > - Without ATS/Device IOTLB, IOMMU becomes a bottleneck because of contending
> > > of IOTLB entries. (What we can do in this case is in fact userspace IOTLB
> > > snooping, this could be done even without CM).
> > > It was natural to think of ATS when designing interface between IOMMU and
> > > device/remote IOTLBs. Do you see any drawbacks on ATS here?
> > >
> > > Thanks
> > In fact at this point I'm confused. Any mapping needs to be programmed
> > in the IOMMU. We need to implement this correctly.
> > Once we do why do we need ATS?
> > I think what you need is map/unmap notifiers that Aviv is working on.
> > No?
>
> Let me clarify, device IOTLB API can work without ATS or CM. So there're
> three ways to do:
>
> 1) without ATS or CM support, the function could be implemented through:
> 1.1: asking for qemu help if there's an IOTLB miss in vhost
> 1.2: snooping the userspace IOTLB invalidation (present to non-present
> mapping) and update device IOTLB
>
> 2) with CM enabled, the only thing we can add is snooping the non-present to
> present mapping and update the device IOTLB. This is not a requirement since
> we still can get this through asking qemu's(1.2) help.
>
> 3) with ATS enabled, guest knows the existence of device IOTLB, and device
> IOTLB entries need to be flushed explicitly by guest. In this case there's
> no need to snoop the ordinary IOTLB invalidation in 1.2. We just need to
> snoop the device IOTLB specific invalidation request from guest.
>
> All the above 3 methods work very well, but let's have a look at performance
> impact:
>
> - Method 1 (without CM or ATS), the performance is not the best since guest
> does not know about the existence of remote IOTLB, this means the flush of
> device IOTLB entry could not be done on demand. One example is some IOMMU
> driver (e.g. intel) tends to optimize the IOTLB invalidations by issuing a
> global invalidation periodically. We need to flush the device IOTLB too in
> this case. Thus we can notice some jitter (because of IOTLB miss).
>
> - Method 2 (with CM but without ATS) seems to be the worst case. It has not
> only all problems above but also a new one: each transition needs to
> notify the device explicitly. Even if dpdk use static mappings, all other
> devices in the VM use dynamic ones which slows down the whole system.
> According to the test, CM is about 10%-20% slower in real hardware.
>
> - Method 3 (ATS) can give the best performance, all the problems have gone
> since guest can flush the device IOTLB entry on demand. It was defined by
> spec and was designed to solve the issues just like what we meet here, and
> was supported by modern IOMMUs.
>
> And what's even better, implementing ATS turns out less than 100 lines of
> codes. And it was much easier to be enabled on other IOMMU (AMD IOMMU
> only needs 20 lines of codes). All other ways (I started and have codes for
> method 1 for intel IOMMU) need lots of work specific to each kind of IOMMU.

Method 1 is basically what Aviv implemented
except you don't need map notifiers, only unmap.

>
> Consider so much advantages by just adding so small lines of codes. I don't
> see why we don't need ATS (for the IOMMUs that support it).
>
> Thanks

I am concerned that not all IOMMUs and guests support ATS.

> >
> >
> > > > Also, could you remind me pls - can guests just disable ATS?
> > > >
> > > > What happens then?
> > > >
> > > >
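
The three methods compared in this thread can be sketched as a toy model. This is only an illustration in Python, not QEMU or vhost code: `ToyIommu`, `DeviceIotlb`, the mode names ('snoop', 'cm', 'ats') and both counters are hypothetical names made up for the sketch. It assumes that a device IOTLB miss is resolved by looking up the guest mapping, that a periodic global IOTLB invalidation has to flush a snooping device IOTLB, and that with ATS the guest sends a device IOTLB invalidation on unmap.

```python
class ToyIommu:
    """Hypothetical guest-programmed IOMMU page table: iova -> address."""

    def __init__(self):
        self.table = {}
        self.devices = []

    def map(self, iova, addr):
        # Non-present -> present transition. A device only learns about it
        # when caching mode (CM) makes the guest invalidate explicitly.
        self.table[iova] = addr
        for dev in self.devices:
            dev.on_map(iova, addr)

    def unmap(self, iova):
        # Present -> non-present transition: the guest invalidates the
        # IOTLB and, in this model, the device IOTLB as well (ATS case).
        del self.table[iova]
        for dev in self.devices:
            dev.on_iotlb_invalidate(iova)
            dev.on_device_iotlb_invalidate(iova)

    def global_invalidate(self):
        # Periodic global IOTLB invalidation issued by the guest driver.
        for dev in self.devices:
            dev.on_iotlb_invalidate(None)


class DeviceIotlb:
    """Toy device IOTLB; mode is 'snoop' (1), 'cm' (2) or 'ats' (3)."""

    def __init__(self, iommu, mode):
        self.iommu, self.mode = iommu, mode
        self.cache = {}
        self.misses = 0
        self.notifications = 0
        iommu.devices.append(self)

    def translate(self, iova):
        if iova not in self.cache:
            self.misses += 1  # miss: ask for the mapping (method 1.1)
            self.cache[iova] = self.iommu.table[iova]
        return self.cache[iova]

    def on_map(self, iova, addr):
        if self.mode == "cm":
            self.notifications += 1  # every new mapping is notified
            self.cache[iova] = addr

    def on_iotlb_invalidate(self, iova):
        if self.mode == "ats":
            return  # ordinary IOTLB invalidations are not snooped
        self.notifications += 1
        if iova is None:
            self.cache.clear()  # global invalidation flushes everything
        else:
            self.cache.pop(iova, None)

    def on_device_iotlb_invalidate(self, iova):
        if self.mode == "ats":
            self.notifications += 1
            self.cache.pop(iova, None)


def run(mode):
    """Map 3 pages, use them, global-invalidate, use them again, unmap one."""
    iommu = ToyIommu()
    dev = DeviceIotlb(iommu, mode)
    for iova in range(3):
        iommu.map(iova, 0x1000 + iova)
    for iova in range(3):
        dev.translate(iova)
    iommu.global_invalidate()
    for iova in range(3):
        dev.translate(iova)
    iommu.unmap(0)
    return dev.misses, dev.notifications, sorted(dev.cache)
```

In this toy workload `run("snoop")` returns 6 misses and 2 notifications, `run("cm")` 3 misses and 5 notifications, and `run("ats")` 3 misses and 1 notification. The numbers only show the shape of the trade-off described above (extra misses after a global invalidation, extra notifications with CM); they say nothing about real performance.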