From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Xu, Quan"
Subject: Re: [PATCH v3 0/2] VT-d flush issue
Date: Mon, 21 Dec 2015 13:35:40 +0000
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B7FC57A@SHSMSX101.ccr.corp.intel.com>
References: <945CA011AD5F084CBEA3E851C0AB28894B7FBC68@SHSMSX101.ccr.corp.intel.com>
 <5677F4B502000078000C1D51@prv-mh.provo.novell.com>
 <945CA011AD5F084CBEA3E851C0AB28894B7FC2D8@SHSMSX101.ccr.corp.intel.com>
 <5678039102000078000C1DEA@prv-mh.provo.novell.com>
 <945CA011AD5F084CBEA3E851C0AB28894B7FC546@SHSMSX101.ccr.corp.intel.com>
 <56780B3D02000078000C1E47@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <56780B3D02000078000C1E47@prv-mh.provo.novell.com>
Content-Language: en-US
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: "Tian, Kevin", "Wu, Feng", "'george.dunlap@eu.citrix.com'",
 "'andrew.cooper3@citrix.com'", "'tim@xen.org'", "'xen-devel@lists.xen.org'",
 "Nakajima, Jun", "'keir@xen.org'"
List-Id: xen-devel@lists.xenproject.org

> On 21.12.2015 at 9:23pm, wrote:
> >>> On 21.12.15 at 14:08, wrote:
> >> On 21.12.2015 at 8:50pm, wrote:
> >> >>> On 21.12.15 at 13:28, wrote:
> >> > On 21.12.2015 at 7:47pm, wrote:
> >> >> >>> On 20.12.15 at 14:57, wrote:
> >> >> > 2. If VT-d is bug, does the hardware_domain continue to work
> >> >> > with PCIe Devices / DRAM well with DMA remapping error?
> >> >> > I think it is no. furthermore, i think VMM can NOT run a
> >> >> > normal HVM domain without device-passthrough.
> >> >>
> >> >> In addition to what Andrew said - VT-d is effectively not in use
> >> >> for domains without PT device.
> >> >
> >> > IMO, When VT-d is enabled, but is not working correct. These PCI-e
> >> > devices
> >> > (Disks/NICs..) DMA/Interrupt behaviors are not predictable.
> >> > Assumed that, VT-d is effectively not in use for domains without PT
> >> > device, while at least the virtualization infrastructure is not trusted.
> >> > I think it is also not secure to run PV domains.
> >> >
> >> >> Impacting all such domains by crashing the hypervisor just because
> >> >> (in the extreme case) a single domain with PT devices exhibited a
> >> >> flush issue is a no-go imo.
> >> >>
> >> >
> >> > IMO, a VT-d (IEC/Context/Iotlb) flush issue is not a single domain
> >> > behavior, it is a Hypervisor and infrastructure issue.
> >> > ATS device's Device-TLB flush is a single domain issue.
> >> > Back to our original goal, my patch set is for ATS flush issue. right?
> >>
> >> You mean you don't like this entailing clean up of other code?
> >
> > Jan, for ARM/AMD, I really have no knowledge to fix it. and I have no
> > ARM/AMD hardware to verify it. if I need to fix these common part of
> > INTEL/ARM/AMD, I think I need to make Xen compile correct and not to
> > destroy the logic.
>
> You indeed aren't expected to fix AMD or ARM code, but it may be necessary to
> adjust that code to make error propagation work.
>
> >> I'm sorry, but I'm
> >> afraid you won't get away without - perhaps the VT-d maintainers
> >> could help here, but in the end you have to face that it was mainly
> >> Intel people who introduced the code which now needs fixing up, so I
> >> consider it not exactly unfair for you (as a
> >> company) to do this work.
> >>
> >
> > Furthermore, I found out that
> > if IEC/Iotlb/Context flush error, then panic.
> > Else if device-tlb flush error, we'll hide the target ATS device
> > and kill the domain owning this ATS device. If impacted domain is
> > hardware domain, just throw out a warning.
> >
> > Then, it is fine to _not_check all the way up the device-tlb
> > flush call trees( maybe it is our next topic of discussion).
>
> I don't follow - this sounds more or less like the model you've been following in
> past versions, yet it was that which prompted the request to properly propagate
> errors.
>

Jan,
Maybe we can first discuss the big picture of how to deal with IEC/IOTLB/Context
and Device-TLB flush errors, and then discuss the details.

We can ignore some points on the way up the Device-TLB flush call trees, such as:
iommu_hwdom_init()
   *|--hd->platform_ops->map_page(d, gfn, mfn, mapping);

Furthermore, if we are on the same page, I am glad to write patches for all of the
VT-d issues, including the IOMMU_WAIT_OP issue, etc.

-Quan
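
For reference, the flush-error policy proposed in the quoted text (panic on an
IEC/Context/IOTLB flush error; on a Device-TLB flush error, hide the ATS device
and kill the owning domain, or only warn if that domain is the hardware domain)
could be sketched in C roughly as below. This is an illustrative model only, not
Xen code: every identifier (flush_kind, flush_action, flush_error_action,
policy_selfcheck) is hypothetical, and it assumes a flush reports failure as a
non-zero return code. The thread does not say whether the device is also hidden
in the hardware-domain case, so the sketch models only the warning.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical and exist only for this sketch. */
enum flush_kind {
    FLUSH_IEC,          /* interrupt entry cache */
    FLUSH_CONTEXT,      /* context cache */
    FLUSH_IOTLB,        /* IOMMU TLB */
    FLUSH_DEVICE_TLB    /* ATS Device-TLB */
};

enum flush_action {
    ACT_NONE,           /* flush succeeded, nothing to do */
    ACT_PANIC,          /* infrastructure-level failure */
    ACT_HIDE_AND_KILL,  /* hide the ATS device, kill the owning domain */
    ACT_WARN            /* hardware domain: only log a warning */
};

/*
 * Map a flush result to the action proposed in the thread.
 * Assumption: rc == 0 means success, anything else is an error.
 */
enum flush_action flush_error_action(enum flush_kind kind, int rc,
                                     bool is_hardware_domain)
{
    if (rc == 0)
        return ACT_NONE;

    /* A Device-TLB flush failure is treated as a per-domain problem. */
    if (kind == FLUSH_DEVICE_TLB)
        return is_hardware_domain ? ACT_WARN : ACT_HIDE_AND_KILL;

    /* IEC/Context/IOTLB failures are treated as hypervisor-wide. */
    return ACT_PANIC;
}

/* Small self-check covering each branch of the policy. */
void policy_selfcheck(void)
{
    assert(flush_error_action(FLUSH_IEC, -1, false) == ACT_PANIC);
    assert(flush_error_action(FLUSH_CONTEXT, -1, true) == ACT_PANIC);
    assert(flush_error_action(FLUSH_IOTLB, -1, false) == ACT_PANIC);
    assert(flush_error_action(FLUSH_DEVICE_TLB, -1, false) == ACT_HIDE_AND_KILL);
    assert(flush_error_action(FLUSH_DEVICE_TLB, -1, true) == ACT_WARN);
    assert(flush_error_action(FLUSH_DEVICE_TLB, 0, false) == ACT_NONE);
}
```

Calling policy_selfcheck() from a test main() exercises every branch. Note that
a decision function like this only makes sense if the return code actually
reaches the caller, which is the error-propagation point raised in the quoted
reply.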