From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Xu, Quan" <quan.xu@intel.com>
Subject: Re: [PATCH v3 0/2] VT-d flush issue
Date: Mon, 21 Dec 2015 13:35:40 +0000
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B7FC57A@SHSMSX101.ccr.corp.intel.com>
References: <945CA011AD5F084CBEA3E851C0AB28894B7FBC68@SHSMSX101.ccr.corp.intel.com>
	<5677F4B502000078000C1D51@prv-mh.provo.novell.com>
	<945CA011AD5F084CBEA3E851C0AB28894B7FC2D8@SHSMSX101.ccr.corp.intel.com>
	<5678039102000078000C1DEA@prv-mh.provo.novell.com>
	<945CA011AD5F084CBEA3E851C0AB28894B7FC546@SHSMSX101.ccr.corp.intel.com>
	<56780B3D02000078000C1E47@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <56780B3D02000078000C1E47@prv-mh.provo.novell.com>
Content-Language: en-US
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>, "Wu, Feng" <feng.wu@intel.com>, "'george.dunlap@eu.citrix.com'" <george.dunlap@eu.citrix.com>, "'andrew.cooper3@citrix.com'" <andrew.cooper3@citrix.com>, "'tim@xen.org'" <tim@xen.org>, "'xen-devel@lists.xen.org'" <xen-devel@lists.xen.org>, "Nakajima,
	Jun" <jun.nakajima@intel.com>, "'keir@xen.org'" <keir@xen.org>
List-Id: xen-devel@lists.xenproject.org

> On 21.12.2015 at 9:23pm, <JBeulich@suse.com> wrote:
> >>> On 21.12.15 at 14:08, <quan.xu@intel.com> wrote:
> >>  On 21.12.2015 at 8:50pm, <JBeulich@suse.com> wrote:
> >> >>> On 21.12.15 at 13:28, <quan.xu@intel.com> wrote:
> >> > On 21.12.2015 at 7:47pm, <JBeulich@suse.com> wrote:
> >> >> >>> On 20.12.15 at 14:57, <quan.xu@intel.com> wrote:
> >> >> > 2. If VT-d is buggy, does the hardware_domain continue to work
> >> >> > well with PCIe devices / DRAM despite DMA remapping errors?
> >> >> >    I think not. Furthermore, I think the VMM can NOT run a
> >> >> > normal HVM domain without device passthrough.
> >> >>
> >> >> In addition to what Andrew said - VT-d is effectively not in use
> >> >> for domains without PT device.
> >> >
> >> > IMO, when VT-d is enabled but is not working correctly, the DMA/interrupt
> >> > behavior of these PCIe devices (disks, NICs, ...) is not predictable.
> >> > Even assuming that VT-d is effectively not in use for domains without a PT
> >> > device, the virtualization infrastructure is at least not trusted.
> >> > I think it is also not secure to run PV domains.
> >> >
> >> >> Impacting all such domains by crashing the hypervisor just because
> >> >> (in the extreme case) a single domain with PT devices exhibited a
> >> >> flush issue is a no-go imo.
> >> >>
> >> >
> >> > IMO, a VT-d (IEC/Context/IOTLB) flush issue is not a single-domain
> >> > issue; it is a hypervisor and infrastructure issue.
> >> > An ATS device's Device-TLB flush is a single-domain issue.
> >> > Back to our original goal: my patch set is for the ATS flush issue, right?
> >>
> >> You mean you don't like this entailing clean up of other code?
> >
> >  Jan, for ARM/AMD, I really don't have the knowledge to fix it, and I have no
> > ARM/AMD hardware to verify it. If I need to fix these common parts of
> > INTEL/ARM/AMD, I think I need to make sure Xen still compiles and that I do
> > not break the logic.
> 
> You indeed aren't expected to fix AMD or ARM code, but it may be necessary to
> adjust that code to make error propagation work.
> 
> >> I'm sorry, but I'm
> >> afraid you won't get away without - perhaps the VT-d maintainers
> >> could help here, but in the end you have to face that it was mainly
> >> Intel people who introduced the code which now needs fixing up, so I
> >> consider it not exactly unfair for you (as a
> >> company) to do this work.
> >>
> >
> > Furthermore, I found out that
> >      if an IEC/IOTLB/Context flush fails, then panic.
> >      Else if a Device-TLB flush fails, we'll hide the target ATS device
> > and kill the domain owning this ATS device. If the impacted domain is the
> > hardware domain, just print a warning.
> >
> >      Then it is fine to _not_ check all the way up the Device-TLB
> > flush call trees (maybe that is our next topic of discussion).
> 
> I don't follow - this sounds more or less like the model you've been following in
> past versions, yet it was that which prompted the request to properly propagate
> errors.
> 
Jan,
Maybe we can first discuss the big picture of how to deal with IEC/IOTLB/Context and Device-TLB flush errors,
and then discuss the details. We can ignore some points on the way up the Device-TLB flush call trees, such as:

   iommu_hwdom_init()
   *|--hd->platform_ops->map_page(d, gfn, mfn, mapping);
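
For reference, the error-handling policy quoted above could be sketched roughly as follows. This is only an illustration of the decision logic under discussion: every name in it (the enums and flush_error_action()) is hypothetical and is not an existing Xen identifier, and rc is assumed to be 0 on success and non-zero on failure.

```c
/* Hypothetical flush kinds; illustrative only, not Xen identifiers. */
enum flush_kind {
    FLUSH_IEC,
    FLUSH_CONTEXT,
    FLUSH_IOTLB,
    FLUSH_DEVICE_TLB
};

/* Hypothetical outcomes of the policy described in this thread. */
enum flush_action {
    ACT_NONE,                        /* flush succeeded, nothing to do */
    ACT_PANIC,                       /* IEC/Context/IOTLB flush failed */
    ACT_HIDE_DEVICE_AND_KILL_DOMAIN, /* Device-TLB flush failed, normal domain */
    ACT_WARN                         /* Device-TLB flush failed, hardware domain */
};

/*
 * Map a flush result to the proposed action.
 * Assumption: rc == 0 means success, anything else is an error.
 */
static enum flush_action flush_error_action(enum flush_kind kind, int rc,
                                            int is_hardware_domain)
{
    if (rc == 0)
        return ACT_NONE;

    /* A failed IEC/Context/IOTLB flush is treated as an infrastructure issue. */
    if (kind != FLUSH_DEVICE_TLB)
        return ACT_PANIC;

    /* A failed Device-TLB flush is treated as a single-domain issue. */
    return is_hardware_domain ? ACT_WARN : ACT_HIDE_DEVICE_AND_KILL_DOMAIN;
}
```

With such a helper, a caller that gets a non-zero rc from a Device-TLB flush for a normal domain would receive ACT_HIDE_DEVICE_AND_KILL_DOMAIN, while any failed IOTLB flush would yield ACT_PANIC. Whether this decision is made at the flush site or propagated up the call tree is exactly the open question.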


Moreover, if we are on the same page, I am glad to write patches for all of the VT-d issues, including the IOMMU_WAIT_OP issue, etc.

-Quan