From: "Zhang, Xiantao"
Subject: Re: iommu=dom0-passthrough behavior
Date: Tue, 13 Nov 2012 08:50:13 +0000
References: <5097DB9102000078000A65C7@nat28.tlf.novell.com> <50A20DE302000078000A7F6B@nat28.tlf.novell.com>
In-Reply-To: <50A20DE302000078000A7F6B@nat28.tlf.novell.com>
To: Jan Beulich, "Zhang, Yang Z"
Cc: "wei.huang2@amd.com", "weiwang.dd@gmail.com", "Zhang, Xiantao", xen-devel
List-Id: xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, November 13, 2012 4:08 PM
> To: Zhang, Xiantao; Zhang, Yang Z
> Cc: wei.huang2@amd.com; weiwang.dd@gmail.com; xen-devel
> Subject: RE: [Xen-devel] iommu=dom0-passthrough behavior
>
> >>> On 13.11.12 at 01:11, "Zhang, Yang Z" wrote:
> > Jan Beulich wrote on 2012-11-05:
> >> so far it was my understanding that this option is intended to get
> >> the DMA behavior that Dom0 observes as close as possible to how it
> >> would be without IOMMU.
> >
> > Correct. There is a bit in context entry which controlling the DMA
> > request(from this device) to walk or not walk the iommu page table.
> > As we known, walking page table introduced extra cost, so we use this
> > parameter to make sure the device which owned by dom0 not to walking
> > iommu page table when DMA request is arrived.
>
> Okay, so that would be slightly different from the meaning I give to the
> option (as described above).
>
> >> However, we're now dealing with a customer report where a single
> >> function device is observed to initiate DMA operations appearing to
> >> originate from function 1, which makes obvious that the option above
> >> is not making things as transparent as I would have expected them to
> >> be: Without IOMMU, such requests get processed fine, while with IOMMU
> >> (due to there not being a context entry for the bogus device) the
> >> device fails to initialize (causing DMA faults, the presence of which
> >> I had to convince myself of separately, as for whatever reason at
> >> least the VT-d code doesn't issue any log message in that case).
> >
> > Sorry, I cannot understand your problem. Is there any bug in current
> > VT-d code?
>
> We need to settle on the concept here first: What specifically is said option
> intended to do?

Basically, this option just means that transactions from Dom0's devices are
not subject to the VT-d engine. It is not targeted at fixing anything; it
just allows users to isolate VT-d issues from Dom0. As far as I know, in the
early days VT-d was not that stable: if Dom0's devices were controlled by
VT-d, strange issues could trigger during the system's boot stage, so this
option was used to disable VT-d for Dom0.

> Only then we can talk about bugs, and if there is one I suspect it's not only in
> VT-d code, but equally much in AMD IOMMU's.
>
> The thing here is that a device functioning properly without IOMMU (with
> "properly" not necessarily meaning it being implemented correctly as per
> specification, albeit I also didn't check whether the spec would allow for the
> observed behavior) doesn't once DMA translation is enabled (even if
> suppressed for Dom0 via above option).
>
> The problem being that while device enumeration only finds a single device
> at function zero of the respective (seg,bus,dev) tuple, DMA requests - as
> seen by the IOMMU - originate from non-zero functions under the same
> tuple.
> Since a non-discovered device doesn't get a context entry inserted,
> this result in an IOMMU fault, rendering the device non-functional.
>
> The data from the system I have so far doesn't tell me whether the device
> incorrectly claims itself as single function (with the functions other than func
> 0 simply not being discovered during device enumeration, as single function
> devices don't get their non-zero functions scanned) or whether the config
> space for functions 1-7 indeed is unpopulated, with the device issuing
> requests with non-zero function number for other, unexplained reasons.
> Bottom line - I'm seeking advice as to whether working around this problem
> in the IOMMU code is desirable/necessary, or whether this is a design flaw
> on the device's side that just cannot be tolerated with an IOMMU in the
> picture (which would need good reasoning, so that a customer expecting
> such a device to work regardless of IOMMU usage can understand that this
> cannot reasonably be made work).

The issue is why the non-zero functions don't report themselves during the
PCI bus scan. From a security point of view, VT-d shouldn't allow
transactions from unknown devices.

Xiantao
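
For illustration only, the failure mode discussed in this thread can be modelled in a few lines of Python. This is a toy sketch, not Xen or VT-d code: the function names and the set-based "context table" are hypothetical. It assumes only two standard PCI facts, that bit 7 of the Header Type register marks a multi-function device and that a requester ID packs bus, device and function as bus[15:8], dev[7:3], func[2:0].

```python
# Toy model of the scenario above. All names are hypothetical and the
# "context table" is a plain set; this is not how Xen stores context entries.

MULTI_FUNCTION_BIT = 0x80  # bit 7 of the PCI Header Type register


def requester_id(bus, dev, func):
    """Pack bus/dev/func into a 16-bit PCI requester ID."""
    return (bus << 8) | (dev << 3) | func


def enumerate_functions(header_types):
    """Return the functions a bus scan would discover for one (bus, dev) slot.

    header_types maps function number -> Header Type byte for every function
    that actually responds in config space. Functions 1-7 are only probed
    when function 0 advertises itself as multi-function.
    """
    if 0 not in header_types:
        return []
    found = [0]
    if header_types[0] & MULTI_FUNCTION_BIT:
        found += [f for f in range(1, 8) if f in header_types]
    return found


def build_context_table(bus, dev, functions):
    """Create one 'context entry' (here: a set member) per discovered function."""
    return {requester_id(bus, dev, f) for f in functions}


def dma_allowed(context_table, bus, dev, func):
    """A request is only accepted if its requester ID has a context entry."""
    return requester_id(bus, dev, func) in context_table


# A device that claims to be single-function but also issues DMA as function 1.
found = enumerate_functions({0: 0x00, 1: 0x00})
table = build_context_table(3, 0, found)
print(found)                          # [0]
print(dma_allowed(table, 3, 0, 0))    # True
print(dma_allowed(table, 3, 0, 1))    # False: no context entry, so a fault
```

In this model the request from function 1 is rejected simply because the scan never created an entry for it, which is the behaviour the IOMMU is expected to show for unknown devices.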