From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
Date: Fri, 12 Nov 2010 11:55:41 -0500
Message-ID: <20101112165541.GA10339@dumpdata.com>
References: <20101112155659.GA5529@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Lin, Ray"
Cc: Xen-devel, Dante Cinco
List-Id: xen-devel@lists.xenproject.org

> That does not sound right. You should be able to use the PCI passthrough without the IOMMU. Since it is an interrupt issue it sounds like that you are using x2APIC and that is enabled without the IOMMU.
> Had you tried disabling IOMMU and x2apic? (this is all on the hypervisor line?)
>
> Konrad,
> It's unlikely the interrupt issue but DMA issue. Here is the sequence how the tachyon device generates the DMA/interrupts,
> - the tachyon device does the DMA to update the memory which indicates the source of interrupt.
> - After the DMA is done, the tachyon device trigger an interrupt.
> - The interrupt service routine of software driver is invoked due to the interrupt
> - The interrupt service routine checks the source of interrupts by examining the memory which is supposed to be updated by previous DMA.
> - Even though the interrupt happens, the driver code can't find the source of interrupt since the DMA doesn't work properly.

That sounds like the tachyon device is updating the wrong memory location.
How are you programming the memory location that the tachyon device is
supposed to touch? Are you using the value from pci_map_page or are you
using virt_to_phys? The virt_to_phys value should be different from the
pci_map_page value,
unless you allocated a coherent DMA pool using pci_alloc_coherent, in which
case the virt_to_phys() values for that pool should be the right MFNs.

One way to check that you got the right MFN is to do something like this.
Add these two:

#include
#include

 	phys_addr_t phys = page_to_phys(mem->pages[i]);
+	if (xen_pv_domain()) {
+		/* Under Xen PV the machine frame (MFN) can differ from the PFN. */
+		phys_addr_t xen_phys = PFN_PHYS(pfn_to_mfn(
+					page_to_pfn(mem->pages[i])));
+		if (phys != xen_phys) {
+			/* Print the full 64-bit values so high bits are not hidden. */
+			printk(KERN_ERR "Fixing up: (0x%llx->0x%llx)."
+				" CODE UNTESTED!\n",
+				(unsigned long long)phys,
+				(unsigned long long)xen_phys);
+			WARN_ON_ONCE(phys != xen_phys);
+			phys = xen_phys;
+		}
+	}

and use the 'phys' value from then on.

If this sounds like black magic, here is a short writeup:
http://wiki.xensource.com/xenwiki/XenPVOPSDRM
Look at the "Why those patches" section.

Also, are you using unsigned long or the phys_addr_t typedef for these
addresses? The more I think about your problem, the more it sounds like a
truncation issue. You said that it works fine (albeit slowly) if you use
'swiotlb=force'. The slowness could be due to not using the pci_sync_* APIs
to sync the DMA buffers. But regardless, using bounce buffers will slow the
DMA operations down.

Using the bounce buffers limits the DMA operations to addresses that fit in
32 bits. So could it be that you are using some casting macro that casts a
PFN to unsigned long or vice versa, and we end up truncating it to 32 bits?
(I've actually seen this issue with InfiniBand drivers back in the RHEL5
days.)

Lastly, do you set the DMA mask on the device to 32-bit?