From: Konrad Rzeszutek Wilk
Subject: Re: swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
Date: Tue, 16 Nov 2010 15:15:43 -0500
Message-ID: <20101116201349.GA18315@dumpdata.com>
References: <20101112165541.GA10339@dumpdata.com> <20101112223333.GD26189@dumpdata.com> <20101116185748.GA11549@dumpdata.com>
To: Dante Cinco
Cc: Xen-devel
List-Id: xen-devel@lists.xenproject.org

> > Or is the issue that when you write to your HBA register the DMA
> > address, the HBA register can _only_ deal with 32-bit values (4bytes)?

> The HBA register which is using the address returned by pci_map_single
> is limited to a 32-bit value.

> > In which case the PCI device seems to be limited to addressing only up to 4GB, right?

> The HBA has some 32-bit registers and some that are 45-bit.

Ugh. So can you set up PCI coherent DMA pools at startup for the 32-bit
registers? Then set the PCI DMA mask to 45 bits (via pci_set_dma_mask)
and use pci_map_single for all the others.

> >
> >> returned 32 bits without explicitly setting the DMA mask. Once we set
> >> the mask to 32 bits using pci_set_dma_mask, the NMIs stopped. However
> >> with iommu=soft (and no more swiotlb=force), we're still stuck with
> >> the abysmal I/O performance (same as when we had swiotlb=force).
> >
> > Right, that is expected.

> So with iommu=soft, all I/Os have to go through Xen-SWIOTLB which
> explains why we're seeing the abysmal I/O performance, right?

You are oversimplifying it. You are seeing abysmal I/O performance because
you are doing bounce buffering. You can fix this by having the driver
allocate a 32-bit coherent pool at startup and use it just for the HBA
registers that can only handle 32-bit addresses, and then for everything
else set the DMA mask to 45 bits and use pci_map_single.

> Is it true then that with an HVM domU kernel and PCI passthrough, it
> does not use Xen-SWIOTLB and therefore results in better performance?

Yes and no. If you allocate more than 4GB to your HVM guests, you are going
to hit the same issues with the bounce buffer. If you give your guest
less than 4GB, there is no SWIOTLB running in the guest, and QEMU along
with the hypervisor end up using the hardware IOMMU (currently the Xen
hypervisor supports AMD-Vi and Intel VT-d). In your case it is VT-d, at
which point VT-d will remap your GMFNs to MFNs. And VT-d will be
responsible for translating the DMA address that the PCI card tries to
access into the real MFN.
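
Below is a minimal sketch of the split described above, written as a
hypothetical driver probe routine. It is only an illustration: the names
my_hba_probe, my_map_io and MY_RING_BYTES are made up, not from any real
driver; the kernel calls themselves (pci_set_dma_mask,
pci_set_consistent_dma_mask, pci_alloc_consistent, pci_map_single) are the
ones being discussed in this thread.

/*
 * Sketch only: a hypothetical HBA driver that keeps its coherent
 * (consistent) allocations below 4GB for the 32-bit HBA registers,
 * while raising the streaming mask to 45 bits so per-I/O buffers
 * mapped with pci_map_single() need no SWIOTLB bounce buffering.
 */
#include <linux/pci.h>
#include <linux/dma-mapping.h>

#define MY_RING_BYTES 4096	/* size of the 32-bit addressable pool; made-up value */

static int my_hba_probe(struct pci_dev *pdev)
{
	void *ring;
	dma_addr_t ring_dma;

	/* Streaming DMA (pci_map_single) may hand out 45-bit bus addresses. */
	if (pci_set_dma_mask(pdev, DMA_BIT_MASK(45)))
		return -EIO;

	/* Coherent allocations stay below 4GB for the 32-bit registers. */
	if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)))
		return -EIO;

	/* Allocate the 32-bit pool once at startup; ring_dma fits in 32 bits. */
	ring = pci_alloc_consistent(pdev, MY_RING_BYTES, &ring_dma);
	if (!ring)
		return -ENOMEM;

	/* writel((u32)ring_dma, <32-bit HBA register>); */

	return 0;
}

/* Per-I/O buffers then go through the 45-bit streaming mask, no bouncing: */
static dma_addr_t my_map_io(struct pci_dev *pdev, void *buf, size_t len)
{
	return pci_map_single(pdev, buf, len, PCI_DMA_TODEVICE);
}

With that split only the startup pool is constrained to the low 4GB; the
data path uses the full 45-bit mask, so iommu=soft no longer has to bounce
every I/O.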