From: Konrad Rzeszutek Wilk
Subject: Re: swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
Date: Thu, 11 Nov 2010 14:03:51 -0500
Message-ID: <20101111190351.GB15530@dumpdata.com>
References: <20101111160459.GB25654@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dante Cinco
Cc: Xen-devel
List-Id: xen-devel@lists.xenproject.org

On Thu, Nov 11, 2010 at 10:31:48AM -0800, Dante Cinco wrote:
> Konrad,
>
> Without swiotlb=force, I don't see "PCI-DMA: Using software bounce
> buffering for IO" in /var/log/kern.log.
>
> With iommu=soft and without swiotlb=force, I see the "software bounce
> buffering" in /var/log/kern.log and an NMI (see below) when I load the
> kernel module drivers. I made sure the NMI is reproducible and not a

What is the kernel module doing to cause this? DMA?

> one-time event.

So doing 64-bit DMA causes an NMI. Do you have the hypervisor's IOMMU
(VT-d) enabled or disabled? (iommu=off,verbose) If you turn it off, does
this work?

>
> /var/log/kern.log (iommu=soft):
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> Placing 64MB software IO TLB between ffff880005800000 - ffff880009800000
> software IO TLB at phys 0x5800000 - 0x9800000
>
> (XEN)
> (XEN)
> (XEN) NMI - I/O ERROR
> (XEN) ----[ Xen-4.1-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[] smp_send_event_check_mask+0x1/0x10
> (XEN) RFLAGS: 0000000000000012 CONTEXT: hypervisor
> (XEN) rax: 0000000000000080 rbx: ffff82c480287c48 rcx: 0000000000000000
> (XEN) rdx: 0000000000000080 rsi: 0000000000000080 rdi: ffff82c480287c48
> (XEN) rbp: ffff82c480287c78 rsp: ffff82c480287c38 r8: 0000000000000000
> (XEN) r9: 0000000000000037 r10: 0000ffff0000ffff r11: 00ff00ff00ff00ff
> (XEN) r12: ffff82c48029f080 r13: 0000000000000001 r14: 0000000000000008
> (XEN) r15: ffff82c4802b0c20 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 00000001250a9000 cr2: 00007f6165ae9428
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c480287c38:
> (XEN) ffff82c480287c78 ffff82c48012001f 0000000000000100 0000000000000000
> (XEN) ffff82c480287ca8 ffff83011dadd8b0 ffff83019fffa9d0 ffff82c4802c2300
> (XEN) ffff82c480287cc8 ffff82c480117d0d ffff82c48029f080 0000000000000001
> (XEN) 0000000000000100 0000000000000000 0000000000000002 ffff8300df606000
> (XEN) 000000411de66867 ffff82c4802c2300 ffff82c480287d28 ffff82c48011f299
> (XEN) 0000000000000100 0000000000000086 ffff83019e3fa000 ffff83011dadd8b0
> (XEN) ffff83019fffa9d0 ffff8300df606000 0000000000000000 0000000000000000
> (XEN) 000000000000007f ffff83019fe02200 ffff82c480287d38 ffff82c48011f6ea
> (XEN) ffff82c480287d58 ffff82c48014e4c1 ffff83011dae2000 0000000000000066
> (XEN) ffff82c480287d68 ffff82c48014e54d ffff82c480287d98 ffff82c480105d59
> (XEN) ffff82c480287da8 ffff8301616a6990 ffff83011dae2000 0000000000000000
> (XEN) ffff82c480287da8 ffff82c480105f81 ffff82c480287e28 ffff82c48015c043
> (XEN) 0000000000000043 0000000000000043 ffff83019fe02234 0000000000000000
> (XEN) 000000000000010c 0000000000000000 0000000000000000 0000000000000002
> (XEN) ffff82c480287e10 ffff82c480287f18 ffff82c48024f6c0 ffff82c480287f18
> (XEN) ffff82c4802c2300 0000000000000002 00007d3b7fd781a7 ffff82c480154ee6
> (XEN) 0000000000000002 ffff82c4802c2300 ffff82c480287f18 ffff82c48024f6c0
> (XEN) ffff82c480287ee0 ffff82c480287f18 00ff00ff00ff00ff 0000ffff0000ffff
> (XEN) 0000000000000000 0000000000000000 ffff82c4802c23a0 0000000000000000
> (XEN) 0000000000000000 ffff82c4802c2e80 0000000000000000 0000007a00000000
> (XEN) Xen call trace:
> (XEN) [] smp_send_event_check_mask+0x1/0x10
> (XEN) [] csched_vcpu_wake+0x2e1/0x302
> (XEN) [] vcpu_wake+0x243/0x43e
> (XEN) [] vcpu_unblock+0x4a/0x4c
> (XEN) [] vcpu_kick+0x21/0x7f
> (XEN) [] vcpu_mark_events_pending+0x2e/0x32
> (XEN) [] evtchn_set_pending+0xbf/0x190
> (XEN) [] send_guest_pirq+0x54/0x56
> (XEN) [] do_IRQ+0x3b2/0x59c
> (XEN) [] common_interrupt+0x26/0x30
> (XEN) [] default_idle+0x82/0x87
> (XEN) [] idle_loop+0x5a/0x68
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) FATAL TRAP: vector = 2 (nmi)
> (XEN) [error_code=0000] , IN INTERRUPT CONTEXT
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> Dante
>
>
> On Thu, Nov 11, 2010 at 8:04 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 10, 2010 at 05:16:14PM -0800, Dante Cinco wrote:
> >> We have Fibre Channel HBA devices that we PCI passthrough to our pvops
> >> domU kernel. Without swiotlb=force in the domU's kernel command line,
> >> both domU and dom0 lock up after loading the kernel module drivers for
> >> the HBA devices. With swiotlb=force, the domU and dom0 are stable
> >
> > Whoa. That is not good - what happens if you just pass in iommu=soft?
> > Does the "PCI-DMA: Using..." message show up if you don't pass in any of
> > those parameters? (I don't think it does, but just doing 'iommu=soft'
> > should enable it.)
> >
> >
> >> after loading the kernel module drivers, but the I/O performance is at
> >> least an order of magnitude worse than what we were seeing with the
> >> HVM kernel. I see the following in /var/log/kern.log in the pvops
> >> domU:
> >>
> >> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> >> Placing 64MB software IO TLB between ffff880005800000 - ffff880009800000
> >> software IO TLB at phys 0x5800000 - 0x9800000
> >>
> >> Is swiotlb=force responsible for the I/O performance degradation? I
> >> don't understand what swiotlb=force does, so I would appreciate an
> >> explanation or a pointer.
> >
> > So, you should only need to use 'iommu=soft'. It will enable the Linux
> > kernel's software IOMMU (SWIOTLB) to translate the pseudo-PFNs to the
> > real machine frame numbers (bus addresses).
> >
> > If your card is 64-bit, then that is all it would do. If, however, your
> > card is 32-bit and you are DMA-ing data from above the 32-bit limit, it
> > would copy the user-space page to memory below 4GB, DMA that, and when
> > done, copy it back to where the user-space page is. This is called
> > bounce-buffering, and this is why you would use a mix of pci_map_page
> > and pci_dma_sync_single_for_[cpu|device] calls around your driver.
> >
> > However, I think your cards are 64-bit, so you don't need this
> > bounce-buffering. But if you say 'swiotlb=force' it will force _all_
> > DMAs to go through the bounce-buffer.
> >
> > So, try just 'iommu=soft' and see what happens.
> >
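
For reference, the map/sync pattern mentioned above looks roughly like this
in a driver. This is a minimal sketch only; the function and variable names
(my_receive_into_page, pdev, page) are made up for illustration and are not
taken from the HBA driver in question:

#include <linux/pci.h>
#include <linux/dma-mapping.h>

static int my_receive_into_page(struct pci_dev *pdev, struct page *page,
                                size_t len)
{
        dma_addr_t bus_addr;

        /*
         * Map the page for device-to-memory DMA.  If the device cannot
         * address the page directly (e.g. a 32-bit card and a page above
         * 4GB), SWIOTLB hands back the bus address of a bounce buffer
         * below 4GB instead.
         */
        bus_addr = pci_map_page(pdev, page, 0, len, PCI_DMA_FROMDEVICE);
        if (pci_dma_mapping_error(pdev, bus_addr))
                return -ENOMEM;

        /* ... program the device with bus_addr and wait for the DMA ... */

        /*
         * Before the CPU reads the data: for a bounced mapping this is
         * where the bounce buffer gets copied back into the original page.
         */
        pci_dma_sync_single_for_cpu(pdev, bus_addr, len, PCI_DMA_FROMDEVICE);

        /* ... look at the received data ... */

        /* Hand ownership back to the device if more DMA will follow. */
        pci_dma_sync_single_for_device(pdev, bus_addr, len, PCI_DMA_FROMDEVICE);

        /* Final teardown; also copies back for a bounced FROMDEVICE mapping. */
        pci_unmap_page(pdev, bus_addr, len, PCI_DMA_FROMDEVICE);
        return 0;
}

With plain 'iommu=soft' a 64-bit capable card gets a direct translation and
no copy, but with 'swiotlb=force' every such mapping is bounced, so each
map/sync/unmap involves an extra memcpy - consistent with the performance
drop being reported.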