* [Qemu-devel] Multi GPU passthrough via VFIO
From: Maik Broemme @ 2014-02-05 18:59 UTC
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 6300 bytes --]

Hi,

VFIO passthrough of multiple GPUs is currently only partially working for
me, and I hope somebody has a hint about the problem. I'm passing through
an AMD Radeon R9 290X and an AMD Radeon 7870 GHz Edition to a single VM.

If the VM is running Linux, this works quite well with either the radeon
or the fglrx driver. Please see the attached 'dmesg' log, taken with the
radeon driver. If needed, I can also post one taken with the fglrx driver.

If I do the exact same passthrough to a Windows VM and use the latest AMD
Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013), I can only
get the first device (AMD R9 290X) working, with 'x-vga=on'. I don't
enable 'x-vga=on' on the second device, as this should never work. :) I
see the BIOS boot screen and everything works fine except for the second
GPU. The Windows Device Manager just shows "Code 12" for the second GPU
and its HD Audio device. Code 12 means: "This device cannot find enough
free resources that it can use".
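
Since Code 12 is about resources, one thing that may be worth comparing is
how much memory address space each passed-through function requests. The
following is only a minimal sketch, assuming kernel log lines in the format
of the attached Linux guest dmesg ("pci 0000:01:00.0: reg 0x10: [mem
0xc0000000-0xcfffffff 64bit pref]"). The function name sum_mem_bars is made
up for illustration, and the totals include the expansion ROM (reg 0x30).
It does not establish that address space is the actual cause.

```shell
# Sum the sizes of all memory BARs per PCI function from kernel log text.
# Reads the log on stdin and prints "<device> <bytes>" for each function.
# I/O port BARs ("[io ...]") are ignored.
sum_mem_bars() {
    # Reduce each matching line to "<device> <start> <end>".
    sed -n 's/.*pci \([0-9a-f:.]*\): reg 0x[0-9a-f]*: \[mem \(0x[0-9a-f]*\)-\(0x[0-9a-f]*\).*/\1 \2 \3/p' |
    sort |
    {
        prev=
        total=0
        while read -r dev start end; do
            # Emit the running total whenever the device changes.
            if [ -n "$prev" ] && [ "$dev" != "$prev" ]; then
                printf '%s %d\n' "$prev" "$total"
                total=0
            fi
            prev=$dev
            # BAR ranges are inclusive, hence the +1.
            total=$(( total + $end - $start + 1 ))
        done
        if [ -n "$prev" ]; then
            printf '%s %d\n' "$prev" "$total"
        fi
    }
}
```

In the Linux guest this could be run as "dmesg | sum_mem_bars" and the
totals compared with the "available for PCI devices" range that the guest
kernel reports.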

QEMU is called in both cases as follows; I just replace the '-drive'
argument accordingly.

/usr/bin/taskset -c 0,1,2,3 /usr/bin/qemu-system-x86_64 \
  -machine q35,accel=kvm \
  -enable-kvm \
  -nodefaults \
  -nographic \
  -vga none \
  -boot order=nc \
  -cpu host \
  -smp cores=4,threads=1,sockets=1 \
  -m 8192 \
  -rtc base=localtime \
  -k de \
  -drive file=/srv/kvm/linux-drive0.img,id=drive0,if=none,cache=none,aio=threads \
  -mon chardev=monitor0 \
  -chardev socket,id=monitor0,path=/tmp/linux.monitor,nowait,server \
  -netdev tap,id=net0,vhost=on,helper=/usr/lib/qemu/qemu-bridge-helper \
  -device virtio-net-pci,netdev=net0,mac=00:00:00:02:01:04 \
  -device virtio-blk-pci,drive=drive0,ioeventfd=on \
  -device ioh3420,bus=pcie.0,id=pcie0,port=1,chassis=1,multifunction=on \
  -device ioh3420,bus=pcie.0,id=pcie1,port=2,chassis=2,multifunction=on \
  -device vfio-pci,host=01:00.0,addr=00.0,bus=pcie0,multifunction=on,x-vga=on \
  -device vfio-pci,host=01:00.1,addr=00.1,bus=pcie0 \
  -device vfio-pci,host=02:00.0,addr=00.0,bus=pcie1,multifunction=on \
  -device vfio-pci,host=02:00.1,addr=00.1,bus=pcie1 \
  -no-reboot
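
Before starting QEMU with a command like the one above, it may help to
confirm that each host function handed to '-device vfio-pci' is actually
bound to the vfio-pci driver. The sketch below reads the standard sysfs
'driver' symlink. The function names pci_driver and check_passthrough are
hypothetical, the PCI domain 0000 is an assumption, and the second
argument exists only so the sysfs root can be replaced for testing.

```shell
# Print the kernel driver bound to a PCI function, or "none" if unbound.
# $1: PCI address including domain, e.g. 0000:01:00.0 (domain is assumed)
# $2: sysfs root, defaults to /sys
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink points at .../bus/pci/drivers/<name>.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# List the driver for the four functions used in the QEMU command.
check_passthrough() {
    for dev in 0000:01:00.0 0000:01:00.1 0000:02:00.0 0000:02:00.1; do
        printf '%s %s\n' "$dev" "$(pci_driver "$dev" "${1:-/sys}")"
    done
}
```

Calling check_passthrough on the host prints one line per function; any
line that does not end in "vfio-pci" would be worth looking at first.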

My setup is the following:

Kernel: linux-3.13.1
Seabios: seabios-git-rel.1.7.4.r51.g151d034 (5 Feb 2014)
QEMU: qemu-git-2.0.r30666.g31db5b3 (5 Feb 2014)

Below is the 'lspci' output. I'm using the AMD Radeon HD 5430 as the
device for my local X server:

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B)
00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D)
00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port H)
00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 2)
00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3)
00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Park [Mobility Radeon HD 5430]
04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
07:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)

Another minor issue is that the R9 290X is not reset when the VM shuts
down (with either Linux or Windows), but this can be worked around by
doing a "suspend-to-ram" between two starts. That's why I use the
'-no-reboot' option in QEMU. The 7870 resets properly.
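
The suspend-to-ram workaround between two VM starts could be scripted
roughly as below. This is only a sketch under assumptions: it takes the
suspend to be a host suspend, relies on 'rtcwake' from util-linux being
installed, on the host supporting suspend-to-RAM, and on being run as
root. The function name suspend_cycle and the DRY_RUN variable are made up
for illustration.

```shell
# Suspend the host to RAM and let the RTC wake it after a short delay.
# $1: seconds to stay suspended (default 5)
# Set DRY_RUN=1 to print the command instead of running it.
suspend_cycle() {
    secs=${1:-5}
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "rtcwake -m mem -s $secs"
    else
        rtcwake -m mem -s "$secs"
    fi
}
```

Running suspend_cycle after the VM has exited and before the next QEMU
start would mirror the manual procedure described above.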

--Maik

[-- Attachment #2: dmesg.log --]
[-- Type: text/plain, Size: 45608 bytes --]

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.13.1-2-ARCH (nobody@mnt-chroots-arch-extra-x86_64-flo-64) (gcc version 4.8.2 20131219 (prerelease) (GCC) ) #1 SMP PREEMPT Fri Jan 31 10:48:18 CET 2014
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=4d04973d-c353-4dbd-9931-1196e9d53b60 rw
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffcfff] usable
[    0.000000] BIOS-e820: [mem 0x000000007fffd000-0x000000007fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000017fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.4 present.
[    0.000000] DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
[    0.000000] Hypervisor detected: KVM
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] No AGP bridge found
[    0.000000] e820: last_pfn = 0x180000 max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: write-back
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 00C0000000 mask FFC0000000 uncachable
[    0.000000]   1 disabled
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] x86 PAT enabled: cpu 0, old 0x70406, new 0x7010600070106
[    0.000000] e820: last_pfn = 0x7fffd max_arch_pfn = 0x400000000
[    0.000000] found SMP MP-table at [mem 0x000f0c60-0x000f0c6f] mapped at [ffff8800000f0c60]
[    0.000000] Scanning 1 areas for low memory corruption
[    0.000000] Base memory trampoline at [ffff880000096000] 96000 size 24576
[    0.000000] Using GB pages for direct mapping
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] BRK [0x01b3e000, 0x01b3efff] PGTABLE
[    0.000000] BRK [0x01b3f000, 0x01b3ffff] PGTABLE
[    0.000000] BRK [0x01b40000, 0x01b40fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x17fe00000-0x17fffffff]
[    0.000000]  [mem 0x17fe00000-0x17fffffff] page 1G
[    0.000000] init_memory_mapping: [mem 0x17c000000-0x17fdfffff]
[    0.000000]  [mem 0x17c000000-0x17fdfffff] page 1G
[    0.000000] init_memory_mapping: [mem 0x100000000-0x17bffffff]
[    0.000000]  [mem 0x100000000-0x17bffffff] page 1G
[    0.000000] init_memory_mapping: [mem 0x00100000-0x7fffcfff]
[    0.000000]  [mem 0x00100000-0x001fffff] page 4k
[    0.000000]  [mem 0x00200000-0x7fdfffff] page 2M
[    0.000000]  [mem 0x7fe00000-0x7fffcfff] page 4k
[    0.000000] RAMDISK: [mem 0x37a24000-0x37d09fff]
[    0.000000] ACPI: RSDP 00000000000f0a30 000014 (v00 BOCHS )
[    0.000000] ACPI: RSDT 000000007ffff1c4 000038 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACP 000000007fffed4e 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 000000007fffd040 001D0E (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACS 000000007fffd000 000040
[    0.000000] ACPI: SSDT 000000007fffedc2 0002FE (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 000000007ffff0c0 000090 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: HPET 000000007ffff150 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.000000] ACPI: MCFG 000000007ffff188 00003C (v01 BOCHS  BXPCMCFG 00000001 BXPC 00000001)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000017fffffff]
[    0.000000] Initmem setup node 0 [mem 0x00000000-0x17fffffff]
[    0.000000]   NODE_DATA [mem 0x17fff9000-0x17fffdfff]
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 1:7fff7001, boot clock
[    0.000000]  [ffffea0000000000-ffffea0005ffffff] PMD -> [ffff88017b600000-ffff88017f5fffff] on node 0
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   [mem 0x100000000-0x17fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x0009efff]
[    0.000000]   node   0: [mem 0x00100000-0x7fffcfff]
[    0.000000]   node   0: [mem 0x100000000-0x17fffffff]
[    0.000000] On node 0 totalpages: 1048475
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 24 pages reserved
[    0.000000]   DMA zone: 3998 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 8128 pages used for memmap
[    0.000000]   DMA32 zone: 520189 pages, LIFO batch:31
[    0.000000]   Normal zone: 8192 pages used for memmap
[    0.000000]   Normal zone: 524288 pages, LIFO batch:31
[    0.000000] ACPI: PM-Timer IO Port: 0xb008
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[    0.000000] ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ2 used by override.
[    0.000000] ACPI: IRQ5 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] ACPI: IRQ10 used by override.
[    0.000000] ACPI: IRQ11 used by override.
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[    0.000000] nr_irqs_gsi: 40
[    0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
[    0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[    0.000000] PM: Registered nosave memory: [mem 0x7fffd000-0x7fffffff]
[    0.000000] PM: Registered nosave memory: [mem 0x80000000-0xafffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xb0000000-0xbfffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xc0000000-0xfeffbfff]
[    0.000000] PM: Registered nosave memory: [mem 0xfeffc000-0xfeffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xff000000-0xfffbffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfffc0000-0xffffffff]
[    0.000000] e820: [mem 0xc0000000-0xfeffbfff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on KVM
[    0.000000] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:4 nr_node_ids:1
[    0.000000] PERCPU: Embedded 29 pages/cpu @ffff88017fc00000 s86336 r8192 d24256 u524288
[    0.000000] pcpu-alloc: s86336 r8192 d24256 u524288 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 1 2 3 
[    0.000000] kvm-clock: cpu 0, msr 1:7fff7001, primary cpu clock
[    0.000000] KVM setup async PF for cpu 0
[    0.000000] kvm-stealtime: cpu 0, msr 17fc0df00
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1032067
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=4d04973d-c353-4dbd-9931-1196e9d53b60 rw
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Calgary: detecting Calgary via BIOS EBDA area
[    0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
[    0.000000] Memory: 4046964K/4193900K available (5269K kernel code, 836K rwdata, 1672K rodata, 1124K init, 1324K bss, 146936K reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] 	RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000] 	Dump stacks of tasks blocking RCU-preempt GP.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=4.
[    0.000000] NR_IRQS:8448 nr_irqs:712 16
[    0.000000] Console: colour VGA+ 80x25
[    0.000000] console [tty0] enabled
[    0.000000] allocated 16777216 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] hpet clockevent registered
[    0.000000] tsc: Detected 4026.788 MHz processor
[    0.000000] tsc: Marking TSC unstable due to TSCs unsynchronized
[    0.006666] Calibrating delay loop (skipped) preset value.. 8056.14 BogoMIPS (lpj=13422626)
[    0.006666] pid_max: default: 32768 minimum: 301
[    0.006666] Security Framework initialized
[    0.006666] AppArmor: AppArmor disabled by boot time parameter
[    0.006666] Yama: becoming mindful.
[    0.006666] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[    0.006666] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.006666] Mount-cache hash table entries: 256
[    0.006666] Initializing cgroup subsys memory
[    0.006666] Initializing cgroup subsys devices
[    0.006666] Initializing cgroup subsys freezer
[    0.006666] Initializing cgroup subsys net_cls
[    0.006666] Initializing cgroup subsys blkio
[    0.006666] tseg: 0000000000
[    0.006666] CPU: Physical Processor ID: 0
[    0.006666] CPU: Processor Core ID: 0
[    0.006666] mce: CPU supports 10 MCE banks
[    0.006666] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512
tlb_flushall_shift: 5
[    0.006737] Freeing SMP alternatives memory: 20K (ffffffff819ec000 - ffffffff819f1000)
[    0.009609] ACPI: Core revision 20131115
[    0.010339] ACPI: All ACPI Tables successfully acquired
[    0.010371] ftrace: allocating 21048 entries in 83 pages
[    0.017035] Enabling x2apic
[    0.017041] Enabled x2apic
[    0.017242] Switched APIC routing to physical x2apic.
[    0.018419] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.020001] smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (fam: 15, model: 02, stepping: 00)
[    0.023333] Performance Events: Broken PMU hardware detected, using software events only.
[    0.023333] Failed to access perfctr msr (MSR c0010004 is 0)
[    0.033411] NMI watchdog: disabled (cpu0): hardware events not enabled
[    0.040136] x86: Booting SMP configuration:
[    0.006666] kvm-clock: cpu 1, msr 1:7fff7041, secondary cpu clock
[    0.052624] KVM setup async PF for cpu 1
[    0.052624] kvm-stealtime: cpu 1, msr 17fc8df00
[    0.006666] kvm-clock: cpu 2, msr 1:7fff7081, secondary cpu clock
[    0.069159] KVM setup async PF for cpu 2
[    0.069159] kvm-stealtime: cpu 2, msr 17fd0df00
[    0.006666] kvm-clock: cpu 3, msr 1:7fff70c1, secondary cpu clock
[    0.040138] .... node  #0, CPUs:      #1 #2 #3
[    0.086008] x86: Booted up 1 node, 4 CPUs
[    0.086012] smpboot: Total of 4 processors activated (32227.56 BogoMIPS)
[    0.085999] KVM setup async PF for cpu 3
[    0.085999] kvm-stealtime: cpu 3, msr 17fd8df00
[    0.090146] devtmpfs: initialized
[    0.094717] RTC time: 19:51:47, date: 02/05/14
[    0.094717] NET: Registered protocol family 16
[    0.094717] cpuidle: using governor ladder
[    0.094717] cpuidle: using governor menu
[    0.095183] ACPI: bus type PCI registered
[    0.095185] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.096759] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xb0000000-0xbfffffff] (base 0xb0000000)
[    0.096761] PCI: MMCONFIG at [mem 0xb0000000-0xbfffffff] reserved in E820
[    0.105643] PCI: Using configuration type 1 for base access
[    0.106174] bio: create slab <bio-0> at 0
[    0.106174] ACPI: Added _OSI(Module Device)
[    0.106174] ACPI: Added _OSI(Processor Device)
[    0.106174] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.106174] ACPI: Added _OSI(Processor Aggregator Device)
[    0.109579] ACPI: Interpreter enabled
[    0.109583] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20131115/hwxface-580)
[    0.109586] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131115/hwxface-580)
[    0.109599] ACPI: (supports S0 S3 S4 S5)
[    0.109600] ACPI: Using IOAPIC for interrupt routing
[    0.109617] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.109661] ACPI: No dock devices found.
[    0.112164] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.112168] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[    0.112310] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
[    0.112458] PCI host bridge to bus 0000:00
[    0.112461] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.112463] pci_bus 0000:00: root bus resource [io  0x0000-0x0cd7]
[    0.112464] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff]
[    0.112466] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
[    0.112468] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff]
[    0.112504] pci 0000:00:00.0: [8086:29c0] type 00 class 0x060000
[    0.112874] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000
[    0.114321] pci 0000:00:01.0: reg 0x10: [io  0xe080-0xe09f]
[    0.116669] pci 0000:00:01.0: reg 0x14: [mem 0xfea40000-0xfea40fff]
[    0.126669] pci 0000:00:01.0: reg 0x30: [mem 0xfea00000-0xfea3ffff pref]
[    0.126967] pci 0000:00:02.0: [1af4:1001] type 00 class 0x010000
[    0.128961] pci 0000:00:02.0: reg 0x10: [io  0xe000-0xe03f]
[    0.130976] pci 0000:00:02.0: reg 0x14: [mem 0xfea41000-0xfea41fff]
[    0.140297] pci 0000:00:03.0: [8086:3420] type 01 class 0x060400
[    0.160664] pci 0000:00:04.0: [8086:3420] type 01 class 0x060400
[    0.180822] pci 0000:00:1f.0: [8086:2918] type 00 class 0x060100
[    0.181214] pci 0000:00:1f.0: quirk: [io  0xb000-0xb07f] claimed by ICH6 ACPI/GPIO/TCO
[    0.181411] pci 0000:00:1f.2: [8086:2922] type 00 class 0x010601
[    0.191937] pci 0000:00:1f.2: reg 0x20: [io  0xe0a0-0xe0bf]
[    0.193335] pci 0000:00:1f.2: reg 0x24: [mem 0xfea42000-0xfea42fff]
[    0.195657] pci 0000:00:1f.3: [8086:2930] type 00 class 0x0c0500
[    0.201013] pci 0000:00:1f.3: reg 0x20: [io  0xb100-0xb13f]
[    0.204292] pci 0000:01:00.0: [1002:67b0] type 00 class 0x030000
[    0.210018] pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff 64bit pref]
[    0.216692] pci 0000:01:00.0: reg 0x18: [mem 0xd0000000-0xd07fffff 64bit pref]
[    0.223352] pci 0000:01:00.0: reg 0x20: [io  0xd000-0xd0ff]
[    0.230018] pci 0000:01:00.0: reg 0x24: [mem 0xfe800000-0xfe83ffff]
[    0.236685] pci 0000:01:00.0: reg 0x30: [mem 0xfe840000-0xfe85ffff pref]
[    0.237096] pci 0000:01:00.0: supports D1 D2
[    0.237098] pci 0000:01:00.0: PME# supported from D1 D2 D3hot
[    0.237443] pci 0000:01:00.1: [1002:aac8] type 00 class 0x040300
[    0.238755] pci 0000:01:00.1: reg 0x10: [mem 0xfe860000-0xfe863fff 64bit]
[    0.246133] pci 0000:01:00.1: supports D1 D2
[    0.246529] pci 0000:00:03.0: PCI bridge to [bus 01]
[    0.246548] pci 0000:00:03.0:   bridge window [io  0xd000-0xdfff]
[    0.246565] pci 0000:00:03.0:   bridge window [mem 0xfe800000-0xfe9fffff]
[    0.246686] pci 0000:00:03.0:   bridge window [mem 0xc0000000-0xdfffffff 64bit pref]
[    0.248016] pci 0000:02:00.0: [1002:6818] type 00 class 0x030000
[    0.253350] pci 0000:02:00.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
[    0.260017] pci 0000:02:00.0: reg 0x18: [mem 0xfe600000-0xfe63ffff 64bit]
[    0.266684] pci 0000:02:00.0: reg 0x20: [io  0xc000-0xc0ff]
[    0.280017] pci 0000:02:00.0: reg 0x30: [mem 0xfe640000-0xfe65ffff pref]
[    0.280421] pci 0000:02:00.0: supports D1 D2
[    0.280423] pci 0000:02:00.0: PME# supported from D1 D2 D3hot
[    0.280820] pci 0000:02:00.1: [1002:aab0] type 00 class 0x040300
[    0.283955] pci 0000:02:00.1: reg 0x10: [mem 0xfe660000-0xfe663fff 64bit]
[    0.290409] pci 0000:02:00.1: supports D1 D2
[    0.290854] pci 0000:00:04.0: PCI bridge to [bus 02]
[    0.290877] pci 0000:00:04.0:   bridge window [io  0xc000-0xcfff]
[    0.290895] pci 0000:00:04.0:   bridge window [mem 0xfe600000-0xfe7fffff]
[    0.290928] pci 0000:00:04.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[    0.293738] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
[    0.293831] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[    0.293918] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[    0.294009] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
[    0.294094] ACPI: PCI Interrupt Link [LNKE] (IRQs 5 *10 11)
[    0.296910] ACPI: PCI Interrupt Link [LNKF] (IRQs 5 *10 11)
[    0.300232] ACPI: PCI Interrupt Link [LNKG] (IRQs 5 10 *11)
[    0.300514] ACPI: PCI Interrupt Link [LNKH] (IRQs 5 10 *11)
[    0.303364] ACPI: PCI Interrupt Link [GSIA] (IRQs *16)
[    0.303378] ACPI: PCI Interrupt Link [GSIB] (IRQs *17)
[    0.303387] ACPI: PCI Interrupt Link [GSIC] (IRQs *18)
[    0.303399] ACPI: PCI Interrupt Link [GSID] (IRQs *19)
[    0.303408] ACPI: PCI Interrupt Link [GSIE] (IRQs *20)
[    0.303420] ACPI: PCI Interrupt Link [GSIF] (IRQs *21)
[    0.303429] ACPI: PCI Interrupt Link [GSIG] (IRQs *22)
[    0.303440] ACPI: PCI Interrupt Link [GSIH] (IRQs *23)
[    0.303987] ACPI: Enabled 16 GPEs in block 00 to 3F
[    0.303992] ACPI: \_SB_.PCI0: notify handler is installed
[    0.304007] Found 1 acpi root devices
[    0.306755] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none
[    0.306755] vgaarb: device added: PCI:0000:02:00.0,decodes=io+mem,owns=none,locks=none
[    0.306755] vgaarb: loaded
[    0.306755] vgaarb: bridge control possible 0000:02:00.0
[    0.306755] vgaarb: bridge control possible 0000:01:00.0
[    0.306789] PCI: Using ACPI for IRQ routing
[    0.366721] PCI: pci_cache_line_size set to 64 bytes
[    0.366947] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
[    0.366953] e820: reserve RAM buffer [mem 0x7fffd000-0x7fffffff]
[    0.367080] NetLabel: Initializing
[    0.367081] NetLabel:  domain hash size = 128
[    0.367082] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.367115] NetLabel:  unlabeled traffic allowed by default
[    0.367153] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    0.367156] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
[    0.371846] Switched to clocksource kvm-clock
[    0.375617] pnp: PnP ACPI init
[    0.375627] ACPI: bus type PNP registered
[    0.375694] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.375735] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
[    0.375775] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
[    0.375825] pnp 00:03: [dma 2]
[    0.375844] pnp 00:03: Plug and Play ACPI device, IDs PNP0700 (active)
[    0.375899] pnp 00:04: Plug and Play ACPI device, IDs PNP0400 (active)
[    0.375960] pnp 00:05: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.376018] pnp 00:06: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.376098] pnp 00:07: Plug and Play ACPI device, IDs PNP0103 (active)
[    0.376298] pnp: PnP ACPI: found 8 devices
[    0.376299] ACPI: bus type PNP unregistered
[    0.599388] pci 0000:00:03.0: PCI bridge to [bus 01]
[    0.599401] pci 0000:00:03.0:   bridge window [io  0xd000-0xdfff]
[    0.600897] pci 0000:00:03.0:   bridge window [mem 0xfe800000-0xfe9fffff]
[    0.601881] pci 0000:00:03.0:   bridge window [mem 0xc0000000-0xdfffffff 64bit pref]
[    0.603850] pci 0000:00:04.0: PCI bridge to [bus 02]
[    0.603860] pci 0000:00:04.0:   bridge window [io  0xc000-0xcfff]
[    0.605324] pci 0000:00:04.0:   bridge window [mem 0xfe600000-0xfe7fffff]
[    0.606306] pci 0000:00:04.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[    0.608282] pci_bus 0000:00: resource 4 [io  0x0000-0x0cd7]
[    0.608284] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff]
[    0.608286] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff]
[    0.608288] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff]
[    0.608289] pci_bus 0000:01: resource 0 [io  0xd000-0xdfff]
[    0.608291] pci_bus 0000:01: resource 1 [mem 0xfe800000-0xfe9fffff]
[    0.608292] pci_bus 0000:01: resource 2 [mem 0xc0000000-0xdfffffff 64bit pref]
[    0.608294] pci_bus 0000:02: resource 0 [io  0xc000-0xcfff]
[    0.608295] pci_bus 0000:02: resource 1 [mem 0xfe600000-0xfe7fffff]
[    0.608297] pci_bus 0000:02: resource 2 [mem 0xe0000000-0xefffffff 64bit pref]
[    0.608393] NET: Registered protocol family 2
[    0.608634] TCP established hash table entries: 32768 (order: 6, 262144 bytes)
[    0.608810] TCP bind hash table entries: 32768 (order: 7, 524288 bytes)
[    0.609041] TCP: Hash tables configured (established 32768 bind 32768)
[    0.609084] TCP: reno registered
[    0.609100] UDP hash table entries: 2048 (order: 4, 65536 bytes)
[    0.609136] UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes)
[    0.609244] NET: Registered protocol family 1
[    0.609378] pci 0000:01:00.0: Boot video device
[    0.609436] PCI: CLS 64 bytes, default 64
[    0.609515] Unpacking initramfs...
[    0.649884] Freeing initrd memory: 2968K (ffff880037a24000 - ffff880037d0a000)
[    0.649902] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    0.649904] software IO TLB [mem 0x7bffd000-0x7fffd000] (64MB) mapped at [ffff88007bffd000-ffff88007fffcfff]
[    0.650415] Scanning for low memory corruption every 60 seconds
[    0.650828] audit: initializing netlink socket (disabled)
[    0.650839] type=2000 audit(1391626309.036:1): initialized
[    0.660409] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.661790] zbud: loaded
[    0.662009] VFS: Disk quotas dquot_6.5.2
[    0.662041] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.662231] msgmni has been set to 7910
[    0.662276] Key type big_key registered
[    0.663107] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    0.663274] io scheduler noop registered
[    0.663278] io scheduler deadline registered
[    0.663370] io scheduler cfq registered (default)
[    0.667905] pcieport 0000:00:03.0: irq 40 for MSI/MSI-X
[    0.674652] pcieport 0000:00:04.0: irq 41 for MSI/MSI-X
[    0.675830] aer 0000:00:03.0:pcie02: service driver aer loaded
[    0.675965] aer 0000:00:04.0:pcie02: service driver aer loaded
[    0.676003] pcieport 0000:00:03.0: Signaling PME through PCIe PME interrupt
[    0.676005] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    0.676007] pci 0000:01:00.1: Signaling PME through PCIe PME interrupt
[    0.676026] pcie_pme 0000:00:03.0:pcie01: service driver pcie_pme loaded
[    0.676060] pcieport 0000:00:04.0: Signaling PME through PCIe PME interrupt
[    0.676061] pci 0000:02:00.0: Signaling PME through PCIe PME interrupt
[    0.676063] pci 0000:02:00.1: Signaling PME through PCIe PME interrupt
[    0.676079] pcie_pme 0000:00:04.0:pcie01: service driver pcie_pme loaded
[    0.676089] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    0.676186] pciehp 0000:00:03.0:pcie04: HPC vendor_id 8086 device_id 3420 ss_vid 8086 ss_did 0
[    0.676320] pciehp 0000:00:03.0:pcie04: service driver pciehp loaded
[    0.676384] pciehp 0000:00:04.0:pcie04: HPC vendor_id 8086 device_id 3420 ss_vid 8086 ss_did 0
[    0.676495] pciehp 0000:00:04.0:pcie04: service driver pciehp loaded
[    0.676499] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    0.676586] GHES: HEST is not enabled!
[    0.676677] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.677154] Linux agpgart interface v0.103
[    0.677235] rtc_cmos 00:00: RTC can wake from S4
[    0.677522] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
[    0.677666] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram, hpet irqs
[    0.677739] drop_monitor: Initializing network drop monitor service
[    0.677778] TCP: cubic registered
[    0.677861] NET: Registered protocol family 10
[    0.677987] NET: Registered protocol family 17
[    0.677994] Key type dns_resolver registered
[    0.678280] registered taskstats version 1
[    0.679588]   Magic number: 2:704:899
[    0.679724] rtc_cmos 00:00: setting system clock to 2014-02-05 19:51:49 UTC (1391629909)
[    0.679744] PM: Hibernation image not present or could not be loaded.
[    0.680831] Freeing unused kernel memory: 1124K (ffffffff818d3000 - ffffffff819ec000)
[    0.680832] Write protecting the kernel read-only data: 8192k
[    0.683395] Freeing unused kernel memory: 864K (ffff880001528000 - ffff880001600000)
[    0.684480] Freeing unused kernel memory: 376K (ffff8800017a2000 - ffff880001800000)
[    0.711248] systemd-udevd[56]: starting version 208
[    0.735972] ACPI: PCI Interrupt Link [GSIF] enabled at IRQ 21
[    0.736515] ACPI: PCI Interrupt Link [GSIG] enabled at IRQ 22
[    0.741769] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[    0.742675] serio: i8042 KBD port at 0x60,0x64 irq 1
[    0.742691] serio: i8042 AUX port at 0x60,0x64 irq 12
[    0.745231] SCSI subsystem initialized
[    0.746859] libata version 3.00 loaded.
[    0.747783] ahci 0000:00:1f.2: version 3.0
[    0.748174] ACPI: PCI Interrupt Link [GSIA] enabled at IRQ 16
[    0.748329] ahci 0000:00:1f.2: irq 42 for MSI/MSI-X
[    0.748833] ahci 0000:00:1f.2: AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode
[    0.748839] ahci 0000:00:1f.2: flags: ncq only 
[    0.750548] scsi0 : ahci
[    0.750697] scsi1 : ahci
[    0.751206] scsi2 : ahci
[    0.751306] scsi3 : ahci
[    0.751506] scsi4 : ahci
[    0.751682] scsi5 : ahci
[    0.751760] ata1: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42100 irq 42
[    0.751769] ata2: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42180 irq 42
[    0.751777] ata3: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42200 irq 42
[    0.751787] ata4: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42280 irq 42
[    0.751802] ata5: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42300 irq 42
[    0.751811] ata6: SATA max UDMA/133 abar m4096@0xfea42000 port 0xfea42380 irq 42
[    0.753921] FDC 0 is a S82078B
[    0.758233] virtio-pci 0000:00:01.0: irq 43 for MSI/MSI-X
[    0.758253] virtio-pci 0000:00:01.0: irq 44 for MSI/MSI-X
[    0.758273] virtio-pci 0000:00:01.0: irq 45 for MSI/MSI-X
[    0.758544] virtio-pci 0000:00:02.0: irq 46 for MSI/MSI-X
[    0.758571] virtio-pci 0000:00:02.0: irq 47 for MSI/MSI-X
[    0.758819] blk-mq: CPU -> queue map
[    0.758821]   CPU 0 -> Queue 0
[    0.758823]   CPU 1 -> Queue 0
[    0.758825]   CPU 2 -> Queue 0
[    0.758826]   CPU 3 -> Queue 0
[    0.948870]  vda: vda1
[    0.949434] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.070335] ata1: SATA link down (SStatus 0 SControl 300)
[    1.070541] ata4: SATA link down (SStatus 0 SControl 300)
[    1.070683] ata3: SATA link down (SStatus 0 SControl 300)
[    1.260331] ata5: SATA link down (SStatus 0 SControl 300)
[    1.260531] ata2: SATA link down (SStatus 0 SControl 300)
[    1.260670] ata6: SATA link down (SStatus 0 SControl 300)
[    1.329378] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: (null)
[    1.401220] systemd[1]: systemd 208 running in system mode. (+PAM -LIBWRAP -AUDIT -SELINUX -IMA -SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ)
[    1.401264] systemd[1]: Detected virtualization 'kvm'.
[    1.402635] systemd[1]: Set hostname to <linux.theraso.int>.
[    1.405233] random: systemd urandom read with 7 bits of entropy available
[    1.455928] systemd[1]: Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory.
[    1.456115] systemd[1]: Starting Forward Password Requests to Wall Directory Watch.
[    1.456161] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    1.456173] systemd[1]: Starting Remote File Systems.
[    1.457178] systemd[1]: Reached target Remote File Systems.
[    1.457192] systemd[1]: Expecting device sys-subsystem-net-devices-net0.device...
[    1.458007] systemd[1]: Starting /dev/initctl Compatibility Named Pipe.
[    1.459048] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
[    1.459055] systemd[1]: Starting Device-mapper event daemon FIFOs.
[    1.460071] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[    1.460078] systemd[1]: Starting LVM2 metadata daemon socket.
[    1.461061] systemd[1]: Listening on LVM2 metadata daemon socket.
[    1.461067] systemd[1]: Starting Delayed Shutdown Socket.
[    1.462429] systemd[1]: Listening on Delayed Shutdown Socket.
[    1.462440] systemd[1]: Starting Dispatch Password Requests to Console Directory Watch.
[    1.462467] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    1.462474] systemd[1]: Starting Paths.
[    1.463746] systemd[1]: Reached target Paths.
[    1.463754] systemd[1]: Starting Encrypted Volumes.
[    1.465065] systemd[1]: Reached target Encrypted Volumes.
[    1.465072] systemd[1]: Starting Journal Socket.
[    1.466396] systemd[1]: Listening on Journal Socket.
[    1.466413] systemd[1]: Mounting POSIX Message Queue File System...
[    1.468491] systemd[1]: Starting Setup Virtual Console...
[    1.471692] systemd[1]: Started Load Kernel Modules.
[    1.471718] systemd[1]: Mounted FUSE Control File System.
[    1.471741] systemd[1]: Mounting Configuration File System...
[    1.473924] systemd[1]: Mounting Huge Pages File System...
[    1.485114] systemd[1]: Starting Create list of required static device nodes for the current kernel...
[    1.488779] systemd[1]: Starting Apply Kernel Variables...
[    1.491496] systemd[1]: Mounting Debug File System...
[    1.498795] systemd[1]: Starting Journal Service...
[    1.505494] systemd[1]: Started Journal Service.
[    1.525117] systemd-journald[129]: Vacuuming done, freed 0 bytes
[    1.592560] EXT4-fs (vda1): re-mounted. Opts: (null)
[    1.626644] systemd-udevd[152]: starting version 208
[    1.645017] systemd-journald[129]: Received request to flush runtime journal from PID 1
[    1.691695] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
[    1.691700] ACPI: Power Button [PWRF]
[    1.702647] lpc_ich 0000:00:1f.0: RCBA is disabled by hardware/BIOS, device disabled
[    1.702716] input: PC Speaker as /devices/platform/pcspkr/input/input3
[    1.702780] lpc_ich 0000:00:1f.0: I/O space for GPIO uninitialized
[    1.702782] lpc_ich 0000:00:1f.0: No MFD cells added
[    1.704944] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    1.707596] i801_smbus 0000:00:1f.3: SMBus using PCI Interrupt
[    1.710496] parport_pc 00:04: reported by Plug and Play ACPI
[    1.738260] [drm] Initialized drm 1.1.0 20060810
[    1.741797] systemd-udevd[159]: renamed network interface eth0 to net0
[    1.747692] microcode: CPU0: patch_level=0x01000065
[    1.748376] microcode: CPU0: update failed for patch_level=0x06000822
[    1.749489] microcode: CPU1: patch_level=0x01000065
[    1.749500] microcode: CPU1: update failed for patch_level=0x06000822
[    1.750567] microcode: CPU2: patch_level=0x01000065
[    1.750580] microcode: CPU2: update failed for patch_level=0x06000822
[    1.751472] microcode: CPU3: patch_level=0x01000065
[    1.751483] microcode: CPU3: update failed for patch_level=0x06000822
[    1.752596] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    1.768935] ACPI: PCI Interrupt Link [GSIE] enabled at IRQ 20
[    1.768998] hda-intel 0000:01:00.1: Handle VGA-switcheroo audio client
[    1.769195] snd_hda_intel 0000:01:00.1: irq 48 for MSI/MSI-X
[    1.845697] ppdev: user-space parallel port driver
[    1.869335] [drm] radeon kernel modesetting enabled.
[    1.872556] input: HD-Audio Generic HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input10
[    1.872684] input: HD-Audio Generic HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input9
[    1.872773] input: HD-Audio Generic HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input8
[    1.872851] input: HD-Audio Generic HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input7
[    1.872947] input: HD-Audio Generic HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input6
[    1.873023] input: HD-Audio Generic HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/0000:01:00.1/sound/card0/input5
[    1.874287] hda-intel 0000:02:00.1: Handle VGA-switcheroo audio client
[    1.874290] hda-intel 0000:02:00.1: Force to non-snoop mode
[    1.874389] snd_hda_intel 0000:02:00.1: irq 49 for MSI/MSI-X
[    1.874482] ACPI: PCI Interrupt Link [GSIH] enabled at IRQ 23
[    1.946147] [drm] initializing kernel modesetting (HAWAII 0x1002:0x67B0 0x1002:0x0B00).
[    1.946193] [drm] register mmio base: 0xFE800000
[    1.946195] [drm] register mmio size: 262144
[    1.946411] [drm] doorbell mmio base: 0xD0000000
[    1.946413] [drm] doorbell mmio size: 8388608
[    1.946564] ATOM BIOS: C67101
[    1.946816] radeon 0000:01:00.0: VRAM: 4096M 0x0000000000000000 - 0x00000000FFFFFFFF (4096M used)
[    1.946819] radeon 0000:01:00.0: GTT: 1024M 0x0000000100000000 - 0x000000013FFFFFFF
[    1.946821] [drm] Detected VRAM RAM=4096M, BAR=256M
[    1.946823] [drm] RAM width 512bits DDR
[    1.946954] [TTM] Zone  kernel: Available graphics memory: 2026158 kiB
[    1.946956] [TTM] Initializing pool allocator
[    1.946961] [TTM] Initializing DMA pool allocator
[    1.946992] [drm] radeon: 4096M of VRAM memory ready
[    1.946993] [drm] radeon: 1024M of GTT memory ready.
[    1.951240] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    1.953720] [drm] probing gen 2 caps for device 8086:3420 = 1000411/0
[    1.956526] input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input16
[    1.956646] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input15
[    1.956827] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input14
[    1.956920] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input13
[    1.957014] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input12
[    1.957114] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:04.0/0000:02:00.1/sound/card1/input11
[    1.960082] [drm] Loading HAWAII Microcode
[    1.976104] [drm] PCIE GART of 1024M enabled (table at 0x0000000000277000).
[    1.976294] radeon 0000:01:00.0: WB enabled
[    1.976347] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000c00 and cpu addr 0xffff880176de7c00
[    1.976352] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000c04 and cpu addr 0xffff880176de7c04
[    1.976355] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000c08 and cpu addr 0xffff880176de7c08
[    1.976359] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000c0c and cpu addr 0xffff880176de7c0c
[    1.976361] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000c10 and cpu addr 0xffff880176de7c10
[    1.982775] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000076c98 and cpu addr 0xffffc90012c36c98
[    1.982780] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    1.982782] [drm] Driver supports precise vblank timestamp query.
[    1.982926] radeon 0000:01:00.0: irq 50 for MSI/MSI-X
[    2.056937] radeon 0000:01:00.0: radeon: using MSI.
[    2.057379] [drm] radeon: irq initialized.
[    2.062339] [drm] ring test on 0 succeeded in 3 usecs
[    2.062448] [drm] ring test on 1 succeeded in 3 usecs
[    2.062469] [drm] ring test on 2 succeeded in 2 usecs
[    2.062621] [drm] ring test on 3 succeeded in 2 usecs
[    2.062634] [drm] ring test on 4 succeeded in 2 usecs
[    2.121989] [drm] ring test on 5 succeeded in 2 usecs
[    2.142078] [drm] UVD initialized successfully.
[    2.143791] [drm] Enabling audio 0 support
[    2.143793] [drm] Enabling audio 1 support
[    2.143794] [drm] Enabling audio 2 support
[    2.143795] [drm] Enabling audio 3 support
[    2.143796] [drm] Enabling audio 4 support
[    2.143797] [drm] Enabling audio 5 support
[    2.143979] [drm] ib test on ring 0 succeeded in 0 usecs
[    2.144138] [drm] ib test on ring 1 succeeded in 0 usecs
[    2.144298] [drm] ib test on ring 2 succeeded in 0 usecs
[    2.144462] [drm] ib test on ring 3 succeeded in 0 usecs
[    2.144624] [drm] ib test on ring 4 succeeded in 1 usecs
[    2.165787] [drm] ib test on ring 5 succeeded
[    2.186676] [drm] Radeon Display Connectors
[    2.186694] [drm] Connector 0:
[    2.186702] [drm]   DP-1
[    2.186706] [drm]   HPD2
[    2.186708] [drm]   DDC: 0x6530 0x6530 0x6534 0x6534 0x6538 0x6538 0x653c 0x653c
[    2.186712] [drm]   Encoders:
[    2.186713] [drm]     DFP1: INTERNAL_UNIPHY2
[    2.186757] [drm] Connector 1:
[    2.186758] [drm]   HDMI-A-1
[    2.186763] [drm]   HPD3
[    2.186767] [drm]   DDC: 0x6550 0x6550 0x6554 0x6554 0x6558 0x6558 0x655c 0x655c
[    2.186768] [drm]   Encoders:
[    2.186769] [drm]     DFP2: INTERNAL_UNIPHY2
[    2.186770] [drm] Connector 2:
[    2.186772] [drm]   DVI-D-1
[    2.186773] [drm]   HPD1
[    2.186775] [drm]   DDC: 0x6560 0x6560 0x6564 0x6564 0x6568 0x6568 0x656c 0x656c
[    2.186776] [drm]   Encoders:
[    2.186776] [drm]     DFP3: INTERNAL_UNIPHY1
[    2.186777] [drm] Connector 3:
[    2.186778] [drm]   DVI-D-2
[    2.186779] [drm]   HPD6
[    2.186780] [drm]   DDC: 0x6580 0x6580 0x6584 0x6584 0x6588 0x6588 0x658c 0x658c
[    2.186781] [drm]   Encoders:
[    2.186784] [drm]     DFP4: INTERNAL_UNIPHY
[    2.186941] [drm] Internal thermal controller with fan control
[    2.186980] [drm] radeon: power management initialized
[    2.276125] [drm] fb mappable at 0xC1488000
[    2.276128] [drm] vram apper at 0xC0000000
[    2.276129] [drm] size 8294400
[    2.276130] [drm] fb depth is 24
[    2.276133] [drm]    pitch is 7680
[    2.276314] fbcon: radeondrmfb (fb0) is primary device
[    2.318864] Console: switching to colour frame buffer device 240x67
[    2.335718] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[    2.335719] radeon 0000:01:00.0: registered panic notifier
[    2.335724] [drm] Initialized radeon 2.36.0 20080528 for 0000:01:00.0 on minor 0
[    2.336550] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6818 0x1682:0x3251).
[    2.336575] [drm] register mmio base: 0xFE600000
[    2.336576] [drm] register mmio size: 262144
[    2.488121] ATOM BIOS: C40102
[    2.488166] [drm] GPU not posted. posting now...
[    2.499300] radeon 0000:02:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
[    2.499302] radeon 0000:02:00.0: GTT: 1024M 0x0000000080000000 - 0x00000000BFFFFFFF
[    2.499304] [drm] Detected VRAM RAM=2048M, BAR=256M
[    2.499305] [drm] RAM width 256bits DDR
[    2.499314] [drm] radeon: 2048M of VRAM memory ready
[    2.499316] [drm] radeon: 1024M of GTT memory ready.
[    2.502070] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    2.504543] [drm] probing gen 2 caps for device 8086:3420 = 2000411/0
[    2.505802] [drm] Loading PITCAIRN Microcode
[    2.515475] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[    2.515640] radeon 0000:02:00.0: WB enabled
[    2.515642] radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff880176f7fc00
[    2.515644] radeon 0000:02:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff880176f7fc04
[    2.515646] radeon 0000:02:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff880176f7fc08
[    2.515648] radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff880176f7fc0c
[    2.515649] radeon 0000:02:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff880176f7fc10
[    2.520952] radeon 0000:02:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc90016335a18
[    2.520954] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    2.520955] [drm] Driver supports precise vblank timestamp query.
[    2.521042] radeon 0000:02:00.0: irq 51 for MSI/MSI-X
[    2.610398] radeon 0000:02:00.0: radeon: using MSI.
[    2.610523] [drm] radeon: irq initialized.
[    2.802338] [drm] ring test on 0 succeeded in 1 usecs
[    2.802348] [drm] ring test on 1 succeeded in 1 usecs
[    2.802353] [drm] ring test on 2 succeeded in 1 usecs
[    2.802423] [drm] ring test on 3 succeeded in 2 usecs
[    2.802439] [drm] ring test on 4 succeeded in 2 usecs
[    2.892747] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input4
[    2.899350] mousedev: PS/2 mouse device common for all mice
[    2.989972] [drm] ring test on 5 succeeded in 2 usecs
[    2.989978] [drm] UVD initialized successfully.
[    2.990960] [drm] Enabling audio 0 support
[    2.990962] [drm] Enabling audio 1 support
[    2.990963] [drm] Enabling audio 2 support
[    2.990964] [drm] Enabling audio 3 support
[    2.990964] [drm] Enabling audio 4 support
[    2.990965] [drm] Enabling audio 5 support
[    2.991290] [drm] ib test on ring 0 succeeded in 0 usecs
[    2.991363] [drm] ib test on ring 1 succeeded in 0 usecs
[    2.991443] [drm] ib test on ring 2 succeeded in 0 usecs
[    2.991478] [drm] ib test on ring 3 succeeded in 0 usecs
[    2.991510] [drm] ib test on ring 4 succeeded in 1 usecs
[    3.150373] [drm] ib test on ring 5 succeeded
[    3.151245] [drm] Radeon Display Connectors
[    3.151246] [drm] Connector 0:
[    3.151247] [drm]   DP-2
[    3.151248] [drm]   HPD4
[    3.151250] [drm]   DDC: 0x6530 0x6530 0x6534 0x6534 0x6538 0x6538 0x653c 0x653c
[    3.151251] [drm]   Encoders:
[    3.151252] [drm]     DFP1: INTERNAL_UNIPHY2
[    3.151252] [drm] Connector 1:
[    3.151253] [drm]   DP-3
[    3.151254] [drm]   HPD5
[    3.151255] [drm]   DDC: 0x6540 0x6540 0x6544 0x6544 0x6548 0x6548 0x654c 0x654c
[    3.151256] [drm]   Encoders:
[    3.151257] [drm]     DFP2: INTERNAL_UNIPHY2
[    3.151258] [drm] Connector 2:
[    3.151259] [drm]   HDMI-A-2
[    3.151260] [drm]   HPD1
[    3.151261] [drm]   DDC: 0x6550 0x6550 0x6554 0x6554 0x6558 0x6558 0x655c 0x655c
[    3.151261] [drm]   Encoders:
[    3.151262] [drm]     DFP3: INTERNAL_UNIPHY1
[    3.151263] [drm] Connector 3:
[    3.151264] [drm]   DVI-I-1
[    3.151265] [drm]   HPD6
[    3.151266] [drm]   DDC: 0x6580 0x6580 0x6584 0x6584 0x6588 0x6588 0x658c 0x658c
[    3.151266] [drm]   Encoders:
[    3.151267] [drm]     DFP4: INTERNAL_UNIPHY
[    3.151268] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[    3.151269] [drm] Connector 4:
[    3.151270] [drm]   DVI-D-3
[    3.151271] [drm]   HPD2
[    3.151272] [drm]   DDC: 0x6570 0x6570 0x6574 0x6574 0x6578 0x6578 0x657c 0x657c
[    3.151272] [drm]   Encoders:
[    3.151273] [drm]     DFP5: INTERNAL_UNIPHY1
[    3.157104] [drm] Internal thermal controller with fan control
[    3.157318] [drm] probing gen 2 caps for device 8086:3420 = 2000411/0
[    4.615397] [drm] radeon: dpm initialized
[    4.638371] radeon 0000:02:00.0: No connectors reported connected with modes
[    4.638375] [drm] Cannot find any crtc or sizes - going 1024x768
[    4.645332] [drm] fb mappable at 0xE1480000
[    4.645334] [drm] vram apper at 0xE0000000
[    4.645335] [drm] size 3145728
[    4.645336] [drm] fb depth is 24
[    4.645337] [drm]    pitch is 4096
[    4.645525] radeon 0000:02:00.0: fb1: radeondrmfb frame buffer device
[    4.645530] [drm] Initialized radeon 2.36.0 20080528 for 0000:02:00.0 on minor 1
[   10.102815] type=1006 audit(1391629918.919:2): pid=330 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=1 res=1
[   10.116917] type=1006 audit(1391629918.936:3): pid=332 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=2 res=1
[   10.684350] [drm] Disabling audio 0 support
[   10.684353] [drm] Disabling audio 1 support
[   10.684354] [drm] Disabling audio 2 support
[   10.684355] [drm] Disabling audio 3 support
[   10.684356] [drm] Disabling audio 4 support
[   10.684357] [drm] Disabling audio 5 support

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-05 18:59 [Qemu-devel] Multi GPU passthrough via VFIO Maik Broemme
@ 2014-02-05 20:26 ` Alex Williamson
  2014-02-05 21:10   ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2014-02-05 20:26 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Wed, 2014-02-05 at 19:59 +0100, Maik Broemme wrote:
> Hi,
> 
> currently VFIO with multi GPU passthrough is working partially and
> hopefully somebody has a hint about the problem. I'm doing passthrough
> of an AMD Radeon R9 290X and AMD Radeon 7870 GHz Edition to a single VM.
> 
> If the VM is running Linux this works quite well with radeon or fglrx
> driver. Please see 'dmesg' log attached, when using the radeon driver.
> If needed I can also post one with fglrx driver.
> 
> If I do the exact same passthrough to a Windows VM and use latest AMD
> Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013) I can get
> only the first device working (AMD R9 290X) with 'x-vga=on'. I don't
> enable 'x-vga=on' on second device as this should never work. :)

Why not?  The guest can change the VGA enable bit in the emulated bridge
registers and access the VGA space of each device, just as happens on
bare metal.  Only one device will be initialized by SeaBIOS, but the
same would happen on bare metal.
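
A rough illustration of the VGA routing mentioned here, assuming the
standard PCI-to-PCI bridge header in which the Bridge Control register
sits at offset 0x3E and bit 3 is VGA Enable.  The port names and the
starting register values are hypothetical, and a real guest does this
through config space writes rather than through a helper like this one.

```python
# Offset and bit value from the PCI-to-PCI bridge (type 1) config header.
PCI_BRIDGE_CONTROL = 0x3E      # 16-bit Bridge Control register
PCI_BRIDGE_CTL_VGA = 0x0008    # VGA Enable: forward legacy VGA ranges downstream


def vga_enabled(bridge_control: int) -> bool:
    """Return True if the bridge forwards legacy VGA accesses downstream."""
    return bool(bridge_control & PCI_BRIDGE_CTL_VGA)


def route_vga(bridge_controls: dict, target: str) -> dict:
    """Return new Bridge Control values with VGA Enable set only on `target`.

    `bridge_controls` maps a bridge name to its current 16-bit register value.
    All other bits are left untouched.
    """
    if target not in bridge_controls:
        raise KeyError(target)
    routed = {}
    for name, value in bridge_controls.items():
        if name == target:
            routed[name] = value | PCI_BRIDGE_CTL_VGA
        else:
            routed[name] = value & ~PCI_BRIDGE_CTL_VGA & 0xFFFF
    return routed


if __name__ == "__main__":
    # Hypothetical starting state: VGA currently routed to the first root port.
    ports = {"pcie0": 0x0008, "pcie1": 0x0000}
    ports = route_vga(ports, "pcie1")
    for name, value in ports.items():
        print(name, hex(value), vga_enabled(value))
```

Only one bridge at a time has the bit set in this sketch, which mirrors
the idea that legacy VGA accesses go to one device at a time.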

> I see
> BIOS boot screen and everything works fine except for the second GPU.
> The windows device manager just show me "Code 12" for the second GPU
> and its HD Audio device. Code 12 means: "This device cannot find enough
> free resources that it can use".

I've seen the same using Nvidia GRID GPUs (w/o x-vga=on), but only with
the Q35 chipset model: Linux works, Windows reports Code 12.  I have no
idea why, as all the PCI resources appear to be properly sized and
mapped.  FWIW, 2 GRID GPUs assigned to a guest do work with the 440FX
chipset model.  Beyond 2 we run out of MMIO resources below 4G and
something bad happens.
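
To make the "MMIO below 4G" limit concrete, here is a minimal sketch
that packs BARs into a 32-bit window using natural alignment.  The BAR
sizes are the ones the radeon driver prints in the dmesg log in this
thread (256M aperture, 8388608-byte doorbell, 262144-byte register
block); the GRID BAR sizes are not given, so they are not modeled.  The
window bounds are an assumption chosen for illustration, not values read
from QEMU or SeaBIOS, and real firmware may allocate differently.

```python
MiB = 1024 * 1024
KiB = 1024


def place_bars(bar_sizes, window_start, window_end):
    """Assign naturally aligned addresses to BARs, largest first.

    Returns a list of (size, address) pairs, or None if the BARs do not
    fit between window_start (inclusive) and window_end (exclusive).
    """
    placed = []
    cursor = window_start
    for size in sorted(bar_sizes, reverse=True):
        # A memory BAR must be aligned to its own power-of-two size.
        address = (cursor + size - 1) & ~(size - 1)
        if address + size > window_end:
            return None
        placed.append((size, address))
        cursor = address + size
    return placed


# Sizes taken from the dmesg log; audio-function BARs are omitted.
HAWAII = [256 * MiB, 8 * MiB, 256 * KiB]
PITCAIRN = [256 * MiB, 256 * KiB]

# ASSUMPTION: a hypothetical 32-bit MMIO window for illustration only.
WINDOW_START = 0xC0000000
WINDOW_END = 0xFEC00000

if __name__ == "__main__":
    layout = place_bars(HAWAII + PITCAIRN, WINDOW_START, WINDOW_END)
    if layout is None:
        print("BARs do not fit in the assumed window")
    else:
        for size, address in layout:
            print(f"{size // KiB:>8} KiB at {address:#010x}")
```

With these assumed bounds, each additional GPU with a 256M BAR consumes
a quarter of a gigabyte, so the window fills after only a few devices.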

> QEMU is called in both cases via the following. I just replace the
> '-drive' accordingly.
> 
> /usr/bin/taskset -c 0,1,2,3 /usr/bin/qemu-system-x86_64 \
>   -machine q35,accel=kvm \
>   -enable-kvm \
>   -nodefaults \
>   -nographic \
>   -vga none \
>   -boot order=nc \
>   -cpu host \
>   -smp cores=4,threads=1,sockets=1 \
>   -m 8192 \
>   -rtc base=localtime \
>   -k de \
>   -drive file=/srv/kvm/linux-drive0.img,id=drive0,if=none,cache=none,aio=threads \
>   -mon chardev=monitor0 \
>   -chardev socket,id=monitor0,path=/tmp/linux.monitor,nowait,server \
>   -netdev tap,id=net0,vhost=on,helper=/usr/lib/qemu/qemu-bridge-helper \
>   -device virtio-net-pci,netdev=net0,mac=00:00:00:02:01:04 \
>   -device virtio-blk-pci,drive=drive0,ioeventfd=on \
>   -device ioh3420,bus=pcie.0,id=pcie0,port=1,chassis=1,multifunction=on \
>   -device ioh3420,bus=pcie.0,id=pcie1,port=2,chassis=2,multifunction=on \
>   -device vfio-pci,host=01:00.0,addr=00.0,bus=pcie0,multifunction=on,x-vga=on \
>   -device vfio-pci,host=01:00.1,addr=00.1,bus=pcie0 \
>   -device vfio-pci,host=02:00.0,addr=00.0,bus=pcie1,multifunction=on \
>   -device vfio-pci,host=02:00.1,addr=00.1,bus=pcie1 \
>   -no-reboot
> 
> My setup is the following:
> 
> Kernel: linux-3.13.1
> Seabios: seabios-git-rel.1.7.4.r51.g151d034 (5/2/2014)
> QEMU: qemu-git-2.0.r30666.g31db5b3 (5/2/2014)
> 
> Below is the 'lspci' output and I'm using the AMD Radeon HD 5430 as device
> for my local X server:
> 
> 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
> 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B)
> 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D)
> 00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port H)
> 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B)
> 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
> 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
> 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
> 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
> 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> 00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
> 00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
> 00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 2)
> 00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3)
> 00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
> 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
> 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
> 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
> 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
> 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
> 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
> 02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> 03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> 04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Park [Mobility Radeon HD 5430]
> 04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
> 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> 07:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> 
> Another minor issue is that the R9 290X is not reset during shutdown of
> VM (neither Linux nor Windows) but it can be tricked with doing
> "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> in QEMU. The 7870 is doing the reset properly.


Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
chance?  Thanks,

Alex 
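
For anyone who wants to check the NoSoftRst flag asked about above, a
small sketch that pulls it out of `lspci -vvv` text.  It assumes the
usual lspci layout, where each device starts on an unindented line and
the flag appears on the Power Management "Status:" line.  The sample
text and its bus addresses are made up for the example and are not
output from the machine in this thread.

```python
import re

# Unindented device line, with or without a PCI domain prefix.
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
FLAG_RE = re.compile(r"NoSoftRst([+-])")


def no_soft_reset_flags(lspci_text):
    """Map each device address to True for NoSoftRst+ and False for NoSoftRst-.

    Devices whose output does not show the flag are left out of the result.
    """
    flags = {}
    current = None
    for line in lspci_text.splitlines():
        device = DEVICE_RE.match(line)
        if device:
            current = device.group(1)
            continue
        flag = FLAG_RE.search(line)
        if flag and current is not None:
            flags[current] = flag.group(1) == "+"
    return flags


# Made-up sample in lspci -vvv style; NOT taken from the reporter's host.
SAMPLE = (
    "0a:00.0 VGA compatible controller: Example GPU A\n"
    "\tCapabilities: [50] Power Management version 3\n"
    "\t\tStatus: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-\n"
    "\n"
    "0b:00.0 VGA compatible controller: Example GPU B\n"
    "\tCapabilities: [50] Power Management version 3\n"
    "\t\tStatus: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-\n"
)

if __name__ == "__main__":
    for address, no_soft_reset in no_soft_reset_flags(SAMPLE).items():
        print(address, "NoSoftRst+" if no_soft_reset else "NoSoftRst-")
```

In the PCI power management capability, NoSoftRst+ means the device
does not reset itself on a D3hot to D0 transition, while NoSoftRst-
means it does.  Reading capabilities with lspci generally requires root.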

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-05 20:26 ` Alex Williamson
@ 2014-02-05 21:10   ` Maik Broemme
  2014-02-05 21:27     ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-05 21:10 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Wed, 2014-02-05 at 19:59 +0100, Maik Broemme wrote:
> > Hi,
> > 
> > currently VFIO with multi GPU passthrough is working partially and
> > hopefully somebody has a hint about the problem. I'm doing passthrough
> > of an AMD Radeon R9 290X and AMD Radeon 7870 GHz Edition to a single VM.
> > 
> > If the VM is running Linux this works quite well with radeon or fglrx
> > driver. Please see 'dmesg' log attached, when using the radeon driver.
> > If needed I can also post one with fglrx driver.
> > 
> > If I do the exact same passthrough to a Windows VM and use latest AMD
> > Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013) I can get
> > only the first device working (AMD R9 290X) with 'x-vga=on'. I don't
> > enable 'x-vga=on' on second device as this should never work. :)
> 
> Why not?  The guest can change the VGA enable bit in the emulated bridge
> registers and access the VGA space of each device, just as happens on
> bare metal.  Only one device will be initialized by SeaBIOS, but the
> same would happen on bare metal.
> 

Well, it was just a guess on my part; I assumed it would behave like
most physical boxes in this case. :)

> > I see
> > BIOS boot screen and everything works fine except for the second GPU.
> > The windows device manager just show me "Code 12" for the second GPU
> > and its HD Audio device. Code 12 means: "This device cannot find enough
> > free resources that it can use".
> 
> I've seen the same using Nvidia GRID GPUs (w/o x-vga=on), but only with
> the Q35 chipset model: Linux works, Windows reports Code 12.  I have no
> idea why, as all the PCI resources appear to be properly sized and
> mapped.  FWIW, 2 GRID GPUs assigned to a guest do work with the 440FX
> chipset model.  Beyond 2 we run out of MMIO resources below 4G and
> something bad happens.
> 

Interesting. I will try 440FX a bit later and see if this works. I can
also post the system resource conflicts from Windows; the AMD Catalyst
Center has that integrated. Maybe this will help?

> > QEMU is called in both cases via the following. I just replace the
> > '-drive' accordingly.
> > 
> > /usr/bin/taskset -c 0,1,2,3 /usr/bin/qemu-system-x86_64 \
> >   -machine q35,accel=kvm \
> >   -enable-kvm \
> >   -nodefaults \
> >   -nographic \
> >   -vga none \
> >   -boot order=nc \
> >   -cpu host \
> >   -smp cores=4,threads=1,sockets=1 \
> >   -m 8192 \
> >   -rtc base=localtime \
> >   -k de \
> >   -drive file=/srv/kvm/linux-drive0.img,id=drive0,if=none,cache=none,aio=threads \
> >   -mon chardev=monitor0 \
> >   -chardev socket,id=monitor0,path=/tmp/linux.monitor,nowait,server \
> >   -netdev tap,id=net0,vhost=on,helper=/usr/lib/qemu/qemu-bridge-helper \
> >   -device virtio-net-pci,netdev=net0,mac=00:00:00:02:01:04 \
> >   -device virtio-blk-pci,drive=drive0,ioeventfd=on \
> >   -device ioh3420,bus=pcie.0,id=pcie0,port=1,chassis=1,multifunction=on \
> >   -device ioh3420,bus=pcie.0,id=pcie1,port=2,chassis=2,multifunction=on \
> >   -device vfio-pci,host=01:00.0,addr=00.0,bus=pcie0,multifunction=on,x-vga=on \
> >   -device vfio-pci,host=01:00.1,addr=00.1,bus=pcie0 \
> >   -device vfio-pci,host=02:00.0,addr=00.0,bus=pcie1,multifunction=on \
> >   -device vfio-pci,host=02:00.1,addr=00.1,bus=pcie1 \
> >   -no-reboot
> > 
> > My setup is the following:
> > 
> > Kernel: linux-3.13.1
> > Seabios: seabios-git-rel.1.7.4.r51.g151d034 (5/2/2014)
> > QEMU: qemu-git-2.0.r30666.g31db5b3 (5/2/2014)
> > 
> > Below is the 'lspci' output and I'm using the AMD Radeon HD 5430 as device
> > for my local X server:
> > 
> > 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> > 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
> > 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B)
> > 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D)
> > 00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port H)
> > 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B)
> > 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> > 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
> > 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
> > 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
> > 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
> > 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> > 00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
> > 00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
> > 00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 2)
> > 00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3)
> > 00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > 00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
> > 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
> > 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
> > 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
> > 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
> > 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
> > 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
> > 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> > 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
> > 02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > 03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> > 04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Park [Mobility Radeon HD 5430]
> > 04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
> > 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> > 07:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> > 
> > Another minor issue is that the R9 290X is not reset during shutdown of
> > VM (neither Linux nor Windows) but it can be tricked with doing
> > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > in QEMU. The 7870 is doing the reset properly.
> 
> 
> Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> chance?  Thanks,
> 

Here are both. Funnily, it is the opposite of what you described. :)

root@homer:~# lspci -vvv -s 01:00.0 | grep NoSoftRst
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

root@homer:~# lspci -vvv -s 02:00.0 | grep NoSoftRst
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

root@homer:~# lspci -vvv -s 01:00.0
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 49
	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
	Region 4: I/O ports at be00 [size=256]
	Region 5: Memory at fdd80000 (32-bit, non-prefetchable) [size=256K]
	[virtual] Expansion ROM at d0000000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee00000  Data: 0000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [270 v1] #19
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] #13
	Capabilities: [2d0 v1] #1b
	Kernel driver in use: vfio-pci
	Kernel modules: radeon

root@homer:~# lspci -vvv -s 02:00.0
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition] (prog-if 00 [VGA controller])
	Subsystem: XFX Pine Group Inc. Device 3251
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 48
	Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at fda80000 (64-bit, non-prefetchable) [size=256K]
	Region 4: I/O ports at ee00 [size=256]
	[virtual] Expansion ROM at fda00000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee00000  Data: 0000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [270 v1] #19
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] #13
	Capabilities: [2d0 v1] #1b
	Kernel driver in use: vfio-pci
	Kernel modules: radeon

> Alex 
> 

--Maik


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-05 21:10   ` Maik Broemme
@ 2014-02-05 21:27     ` Alex Williamson
  2014-02-05 23:47       ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2014-02-05 21:27 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Wed, 2014-02-05 at 22:10 +0100, Maik Broemme wrote:
> Hi Alex,
> 
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Wed, 2014-02-05 at 19:59 +0100, Maik Broemme wrote:
> > > Hi,
> > > 
> > > currently VFIO with multi GPU passthrough is working partially and
> > > hopefully somebody has a hint about the problem. I'm doing passthrough
> > > of an AMD Radeon R9 290X and AMD Radeon 7870 GHz Edition to a single VM.
> > > 
> > > If the VM is running Linux this works quite well with radeon or fglrx
> > > driver. Please see 'dmesg' log attached, when using the radeon driver.
> > > If needed I can also post one with fglrx driver.
> > > 
> > > If I do the exact same passthrough to a Windows VM and use latest AMD
> > > Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013) I can get
> > > only the first device working (AMD R9 290X) with 'x-vga=on'. I don't
> > > enable 'x-vga=on' on second device as this should never work. :)
> > 
> > Why not?  The guest is able to change the VGA enable bit in the emulated
> > bridge registers and access VGA space of each device, just like happens
> > on bare metal.  You'll only get one device initialized from seabios, but
> > that's the same as would happen on bare metal as well.
> > 
> 
> Well it was just my guess as it would behave like most physical boxes
> in this case. :)
> 
> > > I see
> > > BIOS boot screen and everything works fine except for the second GPU.
> > > The windows device manager just show me "Code 12" for the second GPU
> > > and its HD Audio device. Code 12 means: "This device cannot find enough
> > > free resources that it can use".
> > 
> > I've seen the same using Nvidia GRID GPUs (w/o x-vga=on), but only with
> > the Q35 chipset model, Linux works, Windows reports Code 12.  I have no
> > idea why as all the PCI resources appear to be properly sized and
> > mapped.  FWIW, 2 GRID GPUs assigned to a guest do work with the 440FX
> > chipset model.  Beyond 2 we run out of MMIO resources below 4G and
> > something bad happens.
> > 
> 
> Interesting. I will try 440FX a bit later and see if this works. What I
> can also do is to post system resource conflicts from Windows, the AMD
> Catalyst Center has it integrated. Maybe this will help?

If you actually see conflicts, then yes.  In the Code 12 cases I've seen,
I was never able to identify a conflict.  The trouble with 440FX is that
you'll need to use pci-bridges to isolate the VGA space of each GPU.
Otherwise one card would need to be disabled to ensure the VGA accesses
go to the other.

> > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > in QEMU. The 7870 is doing the reset properly.
> > 
> > 
> > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > chance?  Thanks,
> > 
> 
> Here are both. It is funny it is opposite as you described. :)


Oops, yes.  Does this help?

--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
 
     QLIST_FOREACH(group, &group_list, next) {
         QLIST_FOREACH(vdev, &group->device_list, next) {
-            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
+            if (!vdev->reset_works || !vdev->has_flr) {
                 vdev->needs_reset = true;
             }
         }

I can't figure out why I coded it the way that I did.  Probably overly
targeting a specific device.  Thanks,

Alex



* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-05 21:27     ` Alex Williamson
@ 2014-02-05 23:47       ` Maik Broemme
  2014-02-06  0:25         ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-05 23:47 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Wed, 2014-02-05 at 22:10 +0100, Maik Broemme wrote:
> > Hi Alex,
> > 
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > On Wed, 2014-02-05 at 19:59 +0100, Maik Broemme wrote:
> > > > Hi,
> > > > 
> > > > currently VFIO with multi GPU passthrough is working partially and
> > > > hopefully somebody has a hint about the problem. I'm doing passthrough
> > > > of an AMD Radeon R9 290X and AMD Radeon 7870 GHz Edition to a single VM.
> > > > 
> > > > If the VM is running Linux this works quite well with radeon or fglrx
> > > > driver. Please see 'dmesg' log attached, when using the radeon driver.
> > > > If needed I can also post one with fglrx driver.
> > > > 
> > > > If I do the exact same passthrough to a Windows VM and use latest AMD
> > > > Catalyst 14.1 (2/1/2014) or AMD Catalyst 13.12 (12/18/2013) I can get
> > > > only the first device working (AMD R9 290X) with 'x-vga=on'. I don't
> > > > enable 'x-vga=on' on second device as this should never work. :)
> > > 
> > > Why not?  The guest is able to change the VGA enable bit in the emulated
> > > bridge registers and access VGA space of each device, just like happens
> > > on bare metal.  You'll only get one device initialized from seabios, but
> > > that's the same as would happen on bare metal as well.
> > > 
> > 
> > Well it was just my guess as it would behave like most physical boxes
> > in this case. :)
> > 
> > > > I see
> > > > BIOS boot screen and everything works fine except for the second GPU.
> > > > The windows device manager just show me "Code 12" for the second GPU
> > > > and its HD Audio device. Code 12 means: "This device cannot find enough
> > > > free resources that it can use".
> > > 
> > > I've seen the same using Nvidia GRID GPUs (w/o x-vga=on), but only with
> > > the Q35 chipset model, Linux works, Windows reports Code 12.  I have no
> > > idea why as all the PCI resources appear to be properly sized and
> > > mapped.  FWIW, 2 GRID GPUs assigned to a guest do work with the 440FX
> > > chipset model.  Beyond 2 we run out of MMIO resources below 4G and
> > > something bad happens.
> > > 
> > 
> > Interesting. I will try 440FX a bit later and see if this works. What I
> > can also do is to post system resource conflicts from Windows, the AMD
> > Catalyst Center has it integrated. Maybe this will help?
> 
> If you actually see conflicts, then yes.  The Code 12 I've seen I was
> never able to identify a conflict.  The trouble with 440FX is that
> you'll need to use pci-bridges to isolate VGA space of each GPU.
> Otherwise one card would need to be disabled to ensure the VGA accesses
> go to the other.
> 

Okay I've collected all necessary information (hopefully). Some are in
German but if needed I can translate it. Please find it below:

- Conflicts:

I/O Port 0x000003C0-0x000003DF	AMD Radeon R9 200 Series
I/O Port 0x000003C0-0x000003DF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
IRQ 10	AMD Radeon HD 7800 Series
IRQ 10	Intel(R) ICH9 Family SMBus Controller - 2930
	
Memory Address 0xFE800000-0xFE83FFFF	AMD Radeon R9 200 Series
Memory Address 0xFE800000-0xFE83FFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
Memory Address 0xE0000000-0xEFFFFFFF	AMD Radeon HD 7800 Series
Memory Address 0xE0000000-0xEFFFFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
Memory Address 0xA0000-0xBFFFF	AMD Radeon R9 200 Series
Memory Address 0xA0000-0xBFFFF	PCI bus
Memory Address 0xA0000-0xBFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
Memory Address 0xC0000000-0xCFFFFFFF	AMD Radeon R9 200 Series
Memory Address 0xC0000000-0xCFFFFFFF	PCI bus
Memory Address 0xC0000000-0xCFFFFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
I/O Port 0x000003B0-0x000003BB	AMD Radeon R9 200 Series
I/O Port 0x000003B0-0x000003BB	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
Memory Address 0xFE600000-0xFE63FFFF	AMD Radeon HD 7800 Series
Memory Address 0xFE600000-0xFE63FFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
I/O Port 0x0000C000-0x0000C0FF	AMD Radeon HD 7800 Series
I/O Port 0x0000C000-0x0000C0FF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
	
I/O Port 0x0000D000-0x0000D0FF	AMD Radeon R9 200 Series
I/O Port 0x0000D000-0x0000D0FF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420
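
Most entries in the conflict list above pair a GPU resource with a root
port or the PCI bus. A bridge forwards the ranges of the devices behind
it, so that kind of sharing is expected and is not a real clash. A
minimal Python sketch that filters those entries out follows. The short
labels ("R9", "HD7800", "root port", "SMBus") and the function name are
made up for illustration, and treating every root port and PCI bus entry
as an upstream bridge of the listed device is an assumption.

```python
# Each entry: (resource, owners), transcribed from the conflict list above.
SHARED = [
    ("io 0x03C0-0x03DF", ["R9", "root port"]),
    ("irq 10", ["HD7800", "SMBus"]),
    ("mem 0xFE800000-0xFE83FFFF", ["R9", "root port"]),
    ("mem 0xE0000000-0xEFFFFFFF", ["HD7800", "root port"]),
    ("mem 0xA0000-0xBFFFF", ["R9", "PCI bus", "root port"]),
    ("mem 0xC0000000-0xCFFFFFFF", ["R9", "PCI bus", "root port"]),
    ("io 0x03B0-0x03BB", ["R9", "root port"]),
    ("mem 0xFE600000-0xFE63FFFF", ["HD7800", "root port"]),
    ("io 0xC000-0xC0FF", ["HD7800", "root port"]),
    ("io 0xD000-0xD0FF", ["R9", "root port"]),
]

# Assumption: these owners are upstream bridges that merely forward the range.
UPSTREAM = {"root port", "PCI bus"}


def endpoint_sharing(entries):
    """Return the entries where two or more non-bridge devices share a resource."""
    result = []
    for resource, owners in entries:
        endpoints = [owner for owner in owners if owner not in UPSTREAM]
        if len(endpoints) > 1:
            result.append((resource, endpoints))
    return result


print(endpoint_sharing(SHARED))
```

With this filter only the IRQ 10 entry remains, shared between the
HD 7800 and the SMBus controller.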

- Display devices:

Name			AMD Radeon R9 200 Series
PNP Device ID		PCI\VEN_1002&DEV_67B0&SUBSYS_0B001002&REV_00\4&2122111D&0&0018
Adapter Type		AMD Radeon Graphics Processor (0x67B0), Advanced Micro Devices, Inc. compatible
Adapter Description	AMD Radeon R9 200 Series
Adapter RAM		(1,048,576) bytes
Installed Drivers	aticfx64.dll,aticfx64.dll,aticfx64.dll,aticfx32,aticfx32,aticfx32,atiumd64.dll,atidxx64.dll,atidxx64.dll,atiumdag,atidxx32,atidxx32,atiumdva,atiumd6a.cap,atitmm64.dll
Driver Version		13.350.1005.0
INF File		oem7.inf (ati2mtag_Hawaii section)
Color Planes		Not available
Color Table Entries	4294967296
Resolution		1920 x 1080 x 60 Hz
Bits/Pixel		32
Memory Address		0xC0000000-0xCFFFFFFF
Memory Address		0xD0000000-0xD07FFFFF
I/O Port		0x0000D000-0x0000D0FF
Memory Address		0xFE800000-0xFE83FFFF
IRQ Channel		IRQ 4294967287
I/O Port		0x000003B0-0x000003BB
I/O Port		0x000003C0-0x000003DF
Memory Address		0xA0000-0xBFFFF
Driver			c:\windows\system32\drivers\atikmpag.sys (8.14.1.6367, 622.00 KB (636,928 bytes), 31.01.2014 20:28)

Name			AMD Radeon HD 7800 Series
PNP Device ID		PCI\VEN_1002&DEV_6818&SUBSYS_32511682&REV_00\4&49049C7&0&0820
Adapter Type		Not available, Advanced Micro Devices, Inc. compatible
Adapter Description	AMD Radeon HD 7800 Series
Adapter RAM		Not available
Installed Drivers	aticfx64.dll,aticfx64.dll,aticfx64.dll,aticfx32,aticfx32,aticfx32,atiumd64.dll,atidxx64.dll,atidxx64.dll,atiumdag,atidxx32,atidxx32,atiumdva,atiumd6a.cap,atitmm64.dll
Driver Version		13.350.1005.0
INF File		oem7.inf (ati2mtag_R575B section)
Color Planes		Not available
Color Table Entries	Not available
Resolution		Not available
Bits/Pixel		Not available
Memory Address		0xE0000000-0xEFFFFFFF
Memory Address		0xFE600000-0xFE63FFFF
I/O Port		0x0000C000-0x0000C0FF
IRQ Channel		IRQ 10
Driver			c:\windows\system32\drivers\atikmpag.sys (8.14.1.6367, 622.00 KB (636,928 bytes), 31.01.2014 20:28)

- I/O:

0x00000000-0x00000CD7	PCI bus	OK
0x00000060-0x00000060	Standard PS/2 Keyboard	OK
0x00000064-0x00000064	Standard PS/2 Keyboard	OK
0x00000070-0x00000071	System CMOS/real time clock	OK
0x00000072-0x00000077	System CMOS/real time clock	OK
0x000002F8-0x000002FF	Communications Port (COM2)	OK
0x00000378-0x0000037F	Printer Port (LPT1)	OK
0x000003B0-0x000003BB	AMD Radeon R9 200 Series	OK
0x000003B0-0x000003BB	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0x000003C0-0x000003DF	AMD Radeon R9 200 Series	OK
0x000003C0-0x000003DF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0x000003F2-0x000003F5	Standard floppy disk controller	OK
0x000003F7-0x000003F7	Standard floppy disk controller	OK
0x000003F8-0x000003FF	Communications Port (COM1)	OK
0x00000CD8-0x00000CF7	ACPI module device	OK
0x00000D00-0x0000FFFF	PCI bus	OK
0x0000B100-0x0000B13F	Intel(R) ICH9 Family SMBus Controller - 2930	OK
0x0000C000-0x0000C0FF	AMD Radeon HD 7800 Series	OK
0x0000C000-0x0000C0FF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0x0000D000-0x0000D0FF	AMD Radeon R9 200 Series	OK
0x0000D000-0x0000D0FF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0x0000E000-0x0000E03F	Red Hat VirtIO SCSI controller	OK
0x0000E080-0x0000E09F	Red Hat VirtIO Ethernet Adapter	OK
0x0000E0A0-0x0000E0BF	Standard AHCI 1.0 Serial ATA Controller	OK

- Memory:

0xC0000000-0xCFFFFFFF	AMD Radeon R9 200 Series	OK
0xC0000000-0xCFFFFFFF	PCI bus	OK
0xC0000000-0xCFFFFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0xD0000000-0xD07FFFFF	AMD Radeon R9 200 Series	OK
0xFE800000-0xFE83FFFF	AMD Radeon R9 200 Series	OK
0xFE800000-0xFE83FFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0xFEA40000-0xFEA40FFF	Red Hat VirtIO Ethernet Adapter	OK
0xFED00000-0xFED003FF	High precision event timer	OK
0xFEA41000-0xFEA41FFF	Red Hat VirtIO SCSI controller	OK
0xE0000000-0xEFFFFFFF	AMD Radeon HD 7800 Series	OK
0xE0000000-0xEFFFFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0xFE600000-0xFE63FFFF	AMD Radeon HD 7800 Series	OK
0xFE600000-0xFE63FFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
0xFEA42000-0xFEA42FFF	Standard AHCI 1.0 Serial ATA Controller	OK
0xFE860000-0xFE863FFF	High Definition Audio Controller	OK
0xA0000-0xBFFFF	AMD Radeon R9 200 Series	OK
0xA0000-0xBFFFF	PCI bus	OK
0xA0000-0xBFFFF	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
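
Since MMIO space below 4G came up earlier in the thread, it may help to
see how much of it the two GPUs occupy in this guest. The following is a
rough Python sketch that only does arithmetic on the ranges listed
above. The short device labels and the helper names are made up for
illustration, and the legacy VGA window at 0xA0000 is left out on
purpose.

```python
# Guest memory ranges per GPU, copied from the memory listing above.
RANGES = {
    "R9": [
        (0xC0000000, 0xCFFFFFFF),
        (0xD0000000, 0xD07FFFFF),
        (0xFE800000, 0xFE83FFFF),
    ],
    "HD7800": [
        (0xE0000000, 0xEFFFFFFF),
        (0xFE600000, 0xFE63FFFF),
    ],
}

FOUR_GIB = 1 << 32
MIB = 1 << 20


def range_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def total_mib(ranges):
    """Sum of the sizes of all ranges, in MiB."""
    return sum(range_size(start, end) for start, end in ranges) / MIB


for name, ranges in RANGES.items():
    below_4g = all(end < FOUR_GIB for _, end in ranges)
    print(name, total_mib(ranges), "MiB, all below 4G:", below_4g)
```

The individual sizes (256 MiB, 8 MiB and 256 KiB for the R9; 256 MiB and
256 KiB for the HD 7800) match the region sizes in the host's lspci
output quoted in this message.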

- IRQ:

IRQ 1	Standard PS/2 Keyboard	OK
IRQ 3	Communications Port (COM2)	OK
IRQ 4	Communications Port (COM1)	OK
IRQ 6	Standard floppy disk controller	OK
IRQ 8	System CMOS/real time clock	OK
IRQ 10	AMD Radeon HD 7800 Series	OK
IRQ 10	Intel(R) ICH9 Family SMBus Controller - 2930	OK
IRQ 12	PS/2 Compatible Mouse	OK
IRQ 16	Standard AHCI 1.0 Serial ATA Controller	OK
IRQ 20	High Definition Audio Controller	OK
IRQ 81	Microsoft ACPI-Compliant System	OK
IRQ 82	Microsoft ACPI-Compliant System	OK
IRQ 83	Microsoft ACPI-Compliant System	OK
IRQ 84	Microsoft ACPI-Compliant System	OK
IRQ 85	Microsoft ACPI-Compliant System	OK
IRQ 86	Microsoft ACPI-Compliant System	OK
IRQ 87	Microsoft ACPI-Compliant System	OK
IRQ 88	Microsoft ACPI-Compliant System	OK
IRQ 89	Microsoft ACPI-Compliant System	OK
IRQ 90	Microsoft ACPI-Compliant System	OK
IRQ 91	Microsoft ACPI-Compliant System	OK
IRQ 92	Microsoft ACPI-Compliant System	OK
IRQ 93	Microsoft ACPI-Compliant System	OK
IRQ 94	Microsoft ACPI-Compliant System	OK
IRQ 95	Microsoft ACPI-Compliant System	OK
IRQ 96	Microsoft ACPI-Compliant System	OK
IRQ 97	Microsoft ACPI-Compliant System	OK
IRQ 98	Microsoft ACPI-Compliant System	OK
IRQ 99	Microsoft ACPI-Compliant System	OK
IRQ 100	Microsoft ACPI-Compliant System	OK
IRQ 101	Microsoft ACPI-Compliant System	OK
IRQ 102	Microsoft ACPI-Compliant System	OK
IRQ 103	Microsoft ACPI-Compliant System	OK
IRQ 104	Microsoft ACPI-Compliant System	OK
IRQ 105	Microsoft ACPI-Compliant System	OK
IRQ 106	Microsoft ACPI-Compliant System	OK
IRQ 107	Microsoft ACPI-Compliant System	OK
IRQ 108	Microsoft ACPI-Compliant System	OK
IRQ 109	Microsoft ACPI-Compliant System	OK
IRQ 110	Microsoft ACPI-Compliant System	OK
IRQ 111	Microsoft ACPI-Compliant System	OK
IRQ 112	Microsoft ACPI-Compliant System	OK
IRQ 113	Microsoft ACPI-Compliant System	OK
IRQ 114	Microsoft ACPI-Compliant System	OK
IRQ 115	Microsoft ACPI-Compliant System	OK
IRQ 116	Microsoft ACPI-Compliant System	OK
IRQ 117	Microsoft ACPI-Compliant System	OK
IRQ 118	Microsoft ACPI-Compliant System	OK
IRQ 119	Microsoft ACPI-Compliant System	OK
IRQ 120	Microsoft ACPI-Compliant System	OK
IRQ 121	Microsoft ACPI-Compliant System	OK
IRQ 122	Microsoft ACPI-Compliant System	OK
IRQ 123	Microsoft ACPI-Compliant System	OK
IRQ 124	Microsoft ACPI-Compliant System	OK
IRQ 125	Microsoft ACPI-Compliant System	OK
IRQ 126	Microsoft ACPI-Compliant System	OK
IRQ 127	Microsoft ACPI-Compliant System	OK
IRQ 128	Microsoft ACPI-Compliant System	OK
IRQ 129	Microsoft ACPI-Compliant System	OK
IRQ 130	Microsoft ACPI-Compliant System	OK
IRQ 131	Microsoft ACPI-Compliant System	OK
IRQ 132	Microsoft ACPI-Compliant System	OK
IRQ 133	Microsoft ACPI-Compliant System	OK
IRQ 134	Microsoft ACPI-Compliant System	OK
IRQ 135	Microsoft ACPI-Compliant System	OK
IRQ 136	Microsoft ACPI-Compliant System	OK
IRQ 137	Microsoft ACPI-Compliant System	OK
IRQ 138	Microsoft ACPI-Compliant System	OK
IRQ 139	Microsoft ACPI-Compliant System	OK
IRQ 140	Microsoft ACPI-Compliant System	OK
IRQ 141	Microsoft ACPI-Compliant System	OK
IRQ 142	Microsoft ACPI-Compliant System	OK
IRQ 143	Microsoft ACPI-Compliant System	OK
IRQ 144	Microsoft ACPI-Compliant System	OK
IRQ 145	Microsoft ACPI-Compliant System	OK
IRQ 146	Microsoft ACPI-Compliant System	OK
IRQ 147	Microsoft ACPI-Compliant System	OK
IRQ 148	Microsoft ACPI-Compliant System	OK
IRQ 149	Microsoft ACPI-Compliant System	OK
IRQ 150	Microsoft ACPI-Compliant System	OK
IRQ 151	Microsoft ACPI-Compliant System	OK
IRQ 152	Microsoft ACPI-Compliant System	OK
IRQ 153	Microsoft ACPI-Compliant System	OK
IRQ 154	Microsoft ACPI-Compliant System	OK
IRQ 155	Microsoft ACPI-Compliant System	OK
IRQ 156	Microsoft ACPI-Compliant System	OK
IRQ 157	Microsoft ACPI-Compliant System	OK
IRQ 158	Microsoft ACPI-Compliant System	OK
IRQ 159	Microsoft ACPI-Compliant System	OK
IRQ 160	Microsoft ACPI-Compliant System	OK
IRQ 161	Microsoft ACPI-Compliant System	OK
IRQ 162	Microsoft ACPI-Compliant System	OK
IRQ 163	Microsoft ACPI-Compliant System	OK
IRQ 164	Microsoft ACPI-Compliant System	OK
IRQ 165	Microsoft ACPI-Compliant System	OK
IRQ 166	Microsoft ACPI-Compliant System	OK
IRQ 167	Microsoft ACPI-Compliant System	OK
IRQ 168	Microsoft ACPI-Compliant System	OK
IRQ 169	Microsoft ACPI-Compliant System	OK
IRQ 170	Microsoft ACPI-Compliant System	OK
IRQ 171	Microsoft ACPI-Compliant System	OK
IRQ 172	Microsoft ACPI-Compliant System	OK
IRQ 173	Microsoft ACPI-Compliant System	OK
IRQ 174	Microsoft ACPI-Compliant System	OK
IRQ 175	Microsoft ACPI-Compliant System	OK
IRQ 176	Microsoft ACPI-Compliant System	OK
IRQ 177	Microsoft ACPI-Compliant System	OK
IRQ 178	Microsoft ACPI-Compliant System	OK
IRQ 179	Microsoft ACPI-Compliant System	OK
IRQ 180	Microsoft ACPI-Compliant System	OK
IRQ 181	Microsoft ACPI-Compliant System	OK
IRQ 182	Microsoft ACPI-Compliant System	OK
IRQ 183	Microsoft ACPI-Compliant System	OK
IRQ 184	Microsoft ACPI-Compliant System	OK
IRQ 185	Microsoft ACPI-Compliant System	OK
IRQ 186	Microsoft ACPI-Compliant System	OK
IRQ 187	Microsoft ACPI-Compliant System	OK
IRQ 188	Microsoft ACPI-Compliant System	OK
IRQ 189	Microsoft ACPI-Compliant System	OK
IRQ 190	Microsoft ACPI-Compliant System	OK
IRQ 4294967287	AMD Radeon R9 200 Series	OK
IRQ 4294967288	Red Hat VirtIO Ethernet Adapter	OK
IRQ 4294967289	Red Hat VirtIO Ethernet Adapter	OK
IRQ 4294967290	Red Hat VirtIO Ethernet Adapter	OK
IRQ 4294967291	Red Hat VirtIO SCSI controller	OK
IRQ 4294967292	Red Hat VirtIO SCSI controller	OK
IRQ 4294967293	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK
IRQ 4294967294	Intel(R) 5520/5500/X58 I/O Hub PCI Express Root Port 0 - 3420	OK

I'm no expert, but it looks like Windows never enabled MSI for the second
card: 'info pci' in the QEMU monitor shows both with IRQ 10. I hope this
helps.
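
The very large IRQ numbers in the list are consistent with that reading.
Windows reports message-signaled interrupts as negative IRQ numbers, and
this tool appears to print them as unsigned 32-bit values (that
interpretation is an assumption; the listing itself does not say so). A
small Python sketch, with made-up helper names, that reinterprets the
values:

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def looks_like_msi(irq):
    """Assumption: a negative IRQ number marks a message-signaled interrupt."""
    return as_signed32(irq) < 0


# IRQ values copied from the listing above.
for name, irq in [
    ("AMD Radeon R9 200 Series", 4294967287),
    ("AMD Radeon HD 7800 Series", 10),
]:
    kind = "MSI" if looks_like_msi(irq) else "line-based"
    print(name, as_signed32(irq), kind)
```

Read this way, the R9 is on IRQ -9 (MSI), while the HD 7800 sits on the
line-based IRQ 10 that it shares with the SMBus controller.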

> > > > QEMU is called in both cases via the following. I just replace the
> > > > '-drive' accordingly.
> > > > 
> > > > /usr/bin/taskset -c 0,1,2,3 /usr/bin/qemu-system-x86_64 \
> > > >   -machine q35,accel=kvm \
> > > >   -enable-kvm \
> > > >   -nodefaults \
> > > >   -nographic \
> > > >   -vga none \
> > > >   -boot order=nc \
> > > >   -cpu host \
> > > >   -smp cores=4,threads=1,sockets=1 \
> > > >   -m 8192 \
> > > >   -rtc base=localtime \
> > > >   -k de \
> > > >   -drive file=/srv/kvm/linux-drive0.img,id=drive0,if=none,cache=none,aio=threads \
> > > >   -mon chardev=monitor0 \
> > > >   -chardev socket,id=monitor0,path=/tmp/linux.monitor,nowait,server \
> > > >   -netdev tap,id=net0,vhost=on,helper=/usr/lib/qemu/qemu-bridge-helper \
> > > >   -device virtio-net-pci,netdev=net0,mac=00:00:00:02:01:04 \
> > > >   -device virtio-blk-pci,drive=drive0,ioeventfd=on \
> > > >   -device ioh3420,bus=pcie.0,id=pcie0,port=1,chassis=1,multifunction=on \
> > > >   -device ioh3420,bus=pcie.0,id=pcie1,port=2,chassis=2,multifunction=on \
> > > >   -device vfio-pci,host=01:00.0,addr=00.0,bus=pcie0,multifunction=on,x-vga=on \
> > > >   -device vfio-pci,host=01:00.1,addr=00.1,bus=pcie0 \
> > > >   -device vfio-pci,host=02:00.0,addr=00.0,bus=pcie1,multifunction=on \
> > > >   -device vfio-pci,host=02:00.1,addr=00.1,bus=pcie1 \
> > > >   -no-reboot
> > > > 
> > > > My setup is the following:
> > > > 
> > > > Kernel: linux-3.13.1
> > > > Seabios: seabios-git-rel.1.7.4.r51.g151d034 (5/2/2014)
> > > > QEMU: qemu-git-2.0.r30666.g31db5b3 (5/2/2014)
> > > > 
> > > > Below is the 'lspci' output and I'm using the AMD Radeon HD 5430 as device
> > > > for my local X server:
> > > > 
> > > > 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> > > > 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
> > > > 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B)
> > > > 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D)
> > > > 00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port H)
> > > > 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B)
> > > > 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> > > > 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
> > > > 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
> > > > 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
> > > > 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
> > > > 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> > > > 00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
> > > > 00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1)
> > > > 00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 2)
> > > > 00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3)
> > > > 00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
> > > > 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
> > > > 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
> > > > 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
> > > > 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
> > > > 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
> > > > 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
> > > > 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> > > > 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
> > > > 02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > > > 03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> > > > 04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Park [Mobility Radeon HD 5430]
> > > > 04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
> > > > 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
> > > > 07:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
> > > > 
> > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > in QEMU. The 7870 is doing the reset properly.
> > > 
> > > 
> > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > chance?  Thanks,
> > > 
> > 
> > Here are both. It is funny it is opposite as you described. :)
> 
> 
> Oops, yes.  Does this help?
> 
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
>  
>      QLIST_FOREACH(group, &group_list, next) {
>          QLIST_FOREACH(vdev, &group->device_list, next) {
> -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> +            if (!vdev->reset_works || !vdev->has_flr) {
>                  vdev->needs_reset = true;
>              }
>          }
> 
> I can't figure out why I coded it the way that I did.  Probably overly
> targeting a specific device.  Thanks,
> 

This patch works absolutely fine. After applying it to my 'qemu-git', the
device reset works flawlessly. It looks good, so it would be great to
push it upstream.
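
To see which devices the one-line change affects, the old and the new
condition can be compared over every flag combination. This is only a
Python model of the two C expressions in the diff, not QEMU code. The
mapping from lspci fields to the flags (FLReset- meaning has_flr is
false, NoSoftRst+ meaning has_pm_reset is false) is an assumption.

```python
from itertools import product


def old_needs_reset(reset_works, has_flr, has_pm_reset):
    """Condition before the patch."""
    return (not reset_works) or ((not has_flr) and has_pm_reset)


def new_needs_reset(reset_works, has_flr, has_pm_reset):
    """Condition after the patch; has_pm_reset no longer matters."""
    return (not reset_works) or (not has_flr)


# Flag order in each tuple: (reset_works, has_flr, has_pm_reset).
changed = [
    flags
    for flags in product([False, True], repeat=3)
    if old_needs_reset(*flags) != new_needs_reset(*flags)
]
print(changed)
```

The only combination that changes is a device whose reset nominally
works but that has neither FLR nor PM reset. Under the assumption above
that is the 290X (FLReset-, NoSoftRst+), while the 7870 (FLReset-,
NoSoftRst-) was already covered by the old condition.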

> Alex
> 
> > root@homer:~# lspci -vvv -s 01:00.0 | grep NoSoftRst
> > 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > 
> > root@homer:~# lspci -vvv -s 02:00.0 | grep NoSoftRst
> > 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > 
> > root@homer:~# lspci -vvv -s 01:00.0
> > 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
> > 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > 	Latency: 0, Cache Line Size: 64 bytes
> > 	Interrupt: pin A routed to IRQ 49
> > 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
> > 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
> > 	Region 4: I/O ports at be00 [size=256]
> > 	Region 5: Memory at fdd80000 (32-bit, non-prefetchable) [size=256K]
> > 	[virtual] Expansion ROM at d0000000 [disabled] [size=128K]
> > 	Capabilities: [48] Vendor Specific Information: Len=08 <?>
> > 	Capabilities: [50] Power Management version 3
> > 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
> > 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > 	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> > 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
> > 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> > 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> > 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> > 			MaxPayload 128 bytes, MaxReadReq 512 bytes
> > 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> > 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> > 			ClockPM- Surprise- LLActRep- BwNot-
> > 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > 			 Compliance De-emphasis: -6dB
> > 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > 	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> > 		Address: 00000000fee00000  Data: 0000
> > 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > 	Capabilities: [150 v2] Advanced Error Reporting
> > 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > 	Capabilities: [270 v1] #19
> > 	Capabilities: [2b0 v1] Address Translation Service (ATS)
> > 		ATSCap:	Invalidate Queue Depth: 00
> > 		ATSCtl:	Enable+, Smallest Translation Unit: 00
> > 	Capabilities: [2c0 v1] #13
> > 	Capabilities: [2d0 v1] #1b
> > 	Kernel driver in use: vfio-pci
> > 	Kernel modules: radeon
> > 
> > root@homer:~# lspci -vvv -s 02:00.0
> > 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition] (prog-if 00 [VGA controller])
> > 	Subsystem: XFX Pine Group Inc. Device 3251
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > 	Latency: 0, Cache Line Size: 64 bytes
> > 	Interrupt: pin A routed to IRQ 48
> > 	Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
> > 	Region 2: Memory at fda80000 (64-bit, non-prefetchable) [size=256K]
> > 	Region 4: I/O ports at ee00 [size=256]
> > 	[virtual] Expansion ROM at fda00000 [disabled] [size=128K]
> > 	Capabilities: [48] Vendor Specific Information: Len=08 <?>
> > 	Capabilities: [50] Power Management version 3
> > 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
> > 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > 	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> > 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
> > 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> > 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> > 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> > 			MaxPayload 128 bytes, MaxReadReq 512 bytes
> > 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> > 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> > 			ClockPM- Surprise- LLActRep- BwNot-
> > 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > 		LnkSta:	Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > 			 Compliance De-emphasis: -6dB
> > 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > 	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> > 		Address: 00000000fee00000  Data: 0000
> > 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > 	Capabilities: [150 v2] Advanced Error Reporting
> > 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > 	Capabilities: [270 v1] #19
> > 	Capabilities: [2b0 v1] Address Translation Service (ATS)
> > 		ATSCap:	Invalidate Queue Depth: 00
> > 		ATSCtl:	Enable+, Smallest Translation Unit: 00
> > 	Capabilities: [2c0 v1] #13
> > 	Capabilities: [2d0 v1] #1b
> > 	Kernel driver in use: vfio-pci
> > 	Kernel modules: radeon
> > 
> > > Alex 
> > > 
> > 
> > --Maik
> 
> 
> 

--Maik

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-05 23:47       ` Maik Broemme
@ 2014-02-06  0:25         ` Maik Broemme
  2014-02-06  3:36           ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-06  0:25 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > > in QEMU. The 7870 is doing the reset properly.
> > > > 
> > > > 
> > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > chance?  Thanks,
> > > > 
> > > 
> > > Here are both. It is funny it is opposite as you described. :)
> > 
> > 
> > Oops, yes.  Does this help?
> > 
> > --- a/hw/misc/vfio.c
> > +++ b/hw/misc/vfio.c
> > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> >  
> >      QLIST_FOREACH(group, &group_list, next) {
> >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > +            if (!vdev->reset_works || !vdev->has_flr) {
> >                  vdev->needs_reset = true;
> >              }
> >          }
> > 
> > I can't figure out why I coded it the way that I did.  Probably overly
> > targeting a specific device.  Thanks,
> > 
> 
> This patch works absolutely fine. After applying it to my 'qemu-git', the
> device reset works flawlessly. It looks good, so it would be great to
> push it upstream.
> 

Okay, sorry, I spoke too soon. It worked the first time, but now, even
after a clean reboot, it no longer works as expected, and the behavior
is very strange.

Windows:

  1st boot works fine - boot VGA and the Windows ATI driver load; I issue
      a reboot and qemu stops due to '-no-reboot'.

  2nd boot works partially - boot VGA and the Windows ATI driver load, but
      the screen stays black and my system becomes terribly slow and
      mostly unresponsive. Immediately after the ATI driver enables the
      device, my dmesg shows the following:

[  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
[  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
[  172.977677] kvm: zapping shadow pages for mmio generation wraparound
[  173.160174] br0: port 2(tap0) entered forwarding state
[  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
[  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
[  188.340511] Switched to clocksource hpet
[  191.088693] hpet1: lost 12 rtc interrupts
[  191.926555] hpet1: lost 25 rtc interrupts

  So your patch did indeed fix the reset issue of the boot VGA, but
  something else is wrong now. :)

Linux (fglrx):

  1st boot works fine - boot VGA, fglrx loads fine and X can be
      started; I issue a reboot via SSH and qemu stops due to
      '-no-reboot'.

  2nd boot works partially - boot VGA, fglrx loads fine, but X cannot
      be started and fails with:

[   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
[   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
[   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
[   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
[   34.344589] <6>[fglrx] IRQ 50 Enabled
[   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
[   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
[   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
[   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
[   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
[   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
[   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
[   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
[   34.491166] <6>[fglrx] IRQ 51 Enabled
[   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
[   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
[   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
[   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
[   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
[   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
[   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
[   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
[   34.526277] PGD 1b3e067 PUD 0 
[   34.526279] Oops: 0002 [#1] PREEMPT SMP 
[   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
[   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
[   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
[   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
[   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
[   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
[   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
[   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
[   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
[   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
[   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
[   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
[   34.526372] Stack:
[   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
[   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
[   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
[   34.526380] Call Trace:
[   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
[   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
[   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
[   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
[   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
[   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
[   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
[   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
[   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
[   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
[   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
[   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
[   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
[   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
[   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
[   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
[   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
[   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
[   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
[   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
[   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
[   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
[   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
[   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
[   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
[   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
[   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
[   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
[   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
[   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
[   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
[   34.526668]  RSP <ffff880037a29810>
[   34.526668] CR2: ffff880c724e8008
[   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
[   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1

I know this is the binary driver. I can also retry with the radeon driver,
but I expect a similar crash. In my first attempt I just rebooted the
Linux VM several times without starting X.

I got it working once without hitting 'Clocksource tsc unstable', but
now I'm unable to reproduce that. So I believe something more is needed.

> > Alex
> > 
> 
> --Maik
> 

--Maik

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-06  0:25         ` Maik Broemme
@ 2014-02-06  3:36           ` Alex Williamson
  2014-02-07  0:22             ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2014-02-06  3:36 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Thu, 2014-02-06 at 01:25 +0100, Maik Broemme wrote:
> Hi Alex,
> 
> Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > > > in QEMU. The 7870 is doing the reset properly.
> > > > > 
> > > > > 
> > > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > > chance?  Thanks,
> > > > > 
> > > > 
> > > > Here are both. It is funny it is opposite as you described. :)
> > > 
> > > 
> > > Oops, yes.  Does this help?
> > > 
> > > --- a/hw/misc/vfio.c
> > > +++ b/hw/misc/vfio.c
> > > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> > >  
> > >      QLIST_FOREACH(group, &group_list, next) {
> > >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > > +            if (!vdev->reset_works || !vdev->has_flr) {
> > >                  vdev->needs_reset = true;
> > >              }
> > >          }
> > > 
> > > I can't figure out why I coded it the way that I did.  Probably overly
> > > targeting a specific device.  Thanks,
> > > 
> > 
> > This patch works absolutely fine. After applying it to my 'qemu-git', the
> > device reset works flawlessly. So it would be great to push it upstream
> > as it looks good.
> > 
> 
> Okay, sorry. I spoke too soon here. It only worked the first time; now,
> even after a clean reboot, it no longer works as expected and the
> behavior is very strange.
> 
> Windows:
> 
>   1st boot works fine - boot VGA and Windows ATI driver loaded, issue
>       reboot and qemu stopped due to '-no-reboot'.
> 
>   2nd boot works partially - boot VGA and Windows ATI driver loaded but
>       black screen, and my system becomes terribly slow and mostly
>       unresponsive. Immediately after the ATI driver enables the device,
>       my dmesg shows the following:
> 
> [  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
> [  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
> [  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
> [  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
> [  172.977677] kvm: zapping shadow pages for mmio generation wraparound
> [  173.160174] br0: port 2(tap0) entered forwarding state
> [  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
> [  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
> [  188.340511] Switched to clocksource hpet
> [  191.088693] hpet1: lost 12 rtc interrupts
> [  191.926555] hpet1: lost 25 rtc interrupts
> 
>   So your patch did indeed fix the reset issue of the boot VGA, but
>   something else is wrong now. :)

Can you try the cards separately?  If you run lspci on the device in the
host, does it report as normal?  Often when the host gets slow and we
get these sorts of clock issues it means the bus is fatal and we get
timeouts trying to read from it.
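
A quick way to check for that from the host, short of reading the full
lspci output, is to look at the vendor ID in the device's config space:
a device that has dropped off the bus answers config reads with all
ones, so the ID comes back as 0xffff. A minimal sketch in C, assuming
the usual sysfs config file for the device; the function names are made
up for illustration and are not part of QEMU or VFIO:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read the 16-bit vendor ID (config offset 0x00,
 * little-endian) from a file holding PCI config space, for example
 * /sys/bus/pci/devices/0000:01:00.0/config on the host.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
int read_vendor_id(const char *path, uint16_t *vendor)
{
    unsigned char buf[2];
    FILE *f = fopen(path, "rb");

    if (!f) {
        return -1;
    }
    if (fread(buf, 1, sizeof(buf), f) != sizeof(buf)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    *vendor = (uint16_t)(buf[0] | (buf[1] << 8));
    return 0;
}

/* A vendor ID of all ones means the device did not answer the read. */
int device_looks_dead(uint16_t vendor)
{
    return vendor == 0xffff;
}
```

For a healthy AMD card the vendor ID read this way should be 0x1002.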

> Linux (fglrx):
> 
>   1st boot works fine - boot VGA, fglrx loads fine and X could be
>       started, issue reboot via SSH and qemu stopped due to
>       '-no-reboot'.
> 
>   2nd boot works partially - boot VGA, fglrx loads fine but X couldn't
>       be started and fails with:
> 
> [   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
> [   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
> [   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
> [   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
> [   34.344589] <6>[fglrx] IRQ 50 Enabled
> [   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> [   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
> [   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
> [   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
> [   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
> [   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
> [   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
> [   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
> [   34.491166] <6>[fglrx] IRQ 51 Enabled
> [   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> [   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
> [   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
> [   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
> [   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
> [   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
> [   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
> [   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> [   34.526277] PGD 1b3e067 PUD 0 
> [   34.526279] Oops: 0002 [#1] PREEMPT SMP 
> [   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
> [   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
> [   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> [   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
> [   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> [   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
> [   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
> [   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
> [   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
> [   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
> [   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
> [   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
> [   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
> [   34.526372] Stack:
> [   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
> [   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
> [   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
> [   34.526380] Call Trace:
> [   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> [   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
> [   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
> [   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
> [   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
> [   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> [   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
> [   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
> [   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
> [   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
> [   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
> [   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
> [   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
> [   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
> [   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
> [   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
> [   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
> [   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
> [   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
> [   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
> [   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
> [   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
> [   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
> [   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
> [   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
> [   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
> [   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
> [   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
> [   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
> [   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
> [   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> [   34.526668]  RSP <ffff880037a29810>
> [   34.526668] CR2: ffff880c724e8008
> [   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
> [   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1
> 
> I know this is the binary driver. I can also retry with the radeon driver,
> but I expect a similar crash. In my first attempt I just rebooted the
> Linux VM several times without starting X.
> 
> I got it working once without hitting 'Clocksource tsc unstable', but
> now I'm unable to reproduce that. So I believe something more is needed.

A bus reset is a mixed blessing: it returns the card to a relatively
known state, but it's a fairly unusual event from a platform perspective
and we have no idea what kind of quirks the host system BIOS might have
in place to work around hardware.  If the bus is not fatal, you might try
running lspci -vvv in the host at various points to see what changed.
For instance, boot a Linux guest to text mode and see if the card is in
the same state between first boot and second boot before starting X.
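
When comparing the snapshots, the interesting part is usually which +/-
flags flipped on a given line (Control, DevSta, CESta and so on). A rough
C sketch of that comparison for one pair of lines; it assumes both lines
list the same flags in the same order, and count_flipped_flags() is a
name invented for this example:

```c
#include <stdio.h>
#include <string.h>

/* True for the state suffix lspci appends to a flag name. */
static int is_state(char c)
{
    return c == '+' || c == '-';
}

/* Walk two lspci -vvv lines token by token and print every flag whose
 * trailing '+'/'-' differs between them. Returns the number of flags
 * that flipped. Tokens that are not flags (labels, sizes) are skipped. */
int count_flipped_flags(const char *before, const char *after)
{
    const char *delim = " \t\n";
    int flipped = 0;

    for (;;) {
        size_t la, lb;

        before += strspn(before, delim);
        after += strspn(after, delim);
        la = strcspn(before, delim);
        lb = strcspn(after, delim);
        if (la == 0 || lb == 0) {
            break;
        }
        /* Same flag name, different final state character. */
        if (la == lb && la > 1 &&
            memcmp(before, after, la - 1) == 0 &&
            is_state(before[la - 1]) && is_state(after[lb - 1]) &&
            before[la - 1] != after[lb - 1]) {
            printf("%.*s -> %.*s\n", (int)la, before, (int)lb, after);
            flipped++;
        }
        before += la;
        after += lb;
    }
    return flipped;
}
```

Running plain diff -u on two saved outputs gives the same information;
the helper only makes the flipped flags easier to spot.
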
Thanks,

Alex

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-06  3:36           ` Alex Williamson
@ 2014-02-07  0:22             ` Maik Broemme
  2014-02-07 18:07               ` Maik Broemme
  2014-02-07 19:10               ` Alex Williamson
  0 siblings, 2 replies; 16+ messages in thread
From: Maik Broemme @ 2014-02-07  0:22 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Thu, 2014-02-06 at 01:25 +0100, Maik Broemme wrote:
> > Hi Alex,
> > 
> > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > > Another minor issue is that the R9 290X is not reset during shutdown of
> > > > > > > VM (neither Linux nor Windows) but it can be tricked with doing
> > > > > > > "suspend-to-ram" between two starts. That's why I use '-no-reboot' option
> > > > > > > in QEMU. The 7870 is doing the reset properly.
> > > > > > 
> > > > > > 
> > > > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > > > chance?  Thanks,
> > > > > > 
> > > > > 
> > > > > Here are both. It is funny it is opposite as you described. :)
> > > > 
> > > > 
> > > > Oops, yes.  Does this help?
> > > > 
> > > > --- a/hw/misc/vfio.c
> > > > +++ b/hw/misc/vfio.c
> > > > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> > > >  
> > > >      QLIST_FOREACH(group, &group_list, next) {
> > > >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > > > +            if (!vdev->reset_works || !vdev->has_flr) {
> > > >                  vdev->needs_reset = true;
> > > >              }
> > > >          }
> > > > 
> > > > I can't figure out why I coded it the way that I did.  Probably overly
> > > > targeting a specific device.  Thanks,
> > > > 
> > > 
> > > This patch works absolutely fine. After applying it to my 'qemu-git', the
> > > > device reset works flawlessly. So it would be great to push it upstream
> > > as it looks good.
> > > 
> > 
> > Okay, sorry. I spoke too soon here. It only worked the first time; now,
> > even after a clean reboot, it no longer works as expected and the
> > behavior is very strange.
> > 
> > Windows:
> > 
> >   1st boot works fine - boot VGA and Windows ATI driver loaded, issue
> >       reboot and qemu stopped due to '-no-reboot'.
> > 
> >   2nd boot works partially - boot VGA and Windows ATI driver loaded but
> >       black screen, and my system becomes terribly slow and mostly
> >       unresponsive. Immediately after the ATI driver enables the device,
> >       my dmesg shows the following:
> > 
> > [  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
> > [  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
> > [  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
> > [  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
> > [  172.977677] kvm: zapping shadow pages for mmio generation wraparound
> > [  173.160174] br0: port 2(tap0) entered forwarding state
> > [  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
> > [  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
> > [  188.340511] Switched to clocksource hpet
> > [  191.088693] hpet1: lost 12 rtc interrupts
> > [  191.926555] hpet1: lost 25 rtc interrupts
> > 
> >   So your patch did indeed fix the reset issue of the boot VGA, but
> >   something else is wrong now. :)
> 
> Can you try the cards separately?  If you run lspci on the device in the
> host, does it report as normal?  Often when the host gets slow and we
> get these sorts of clock issues it means the bus is fatal and we get
> timeouts trying to read from it.
> 

Okay, with only one card I no longer have the clock issues, so we
should look into this a bit later as working reset seems more important
for now.
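
As a reference for the reset discussion, the one-line change in the
patch quoted above can be written out as two predicates. This is only a
sketch: the function names are invented here, and the reading that
has_pm_reset is false for a device showing NoSoftRst+ in lspci is an
assumption based on the exchange above, not checked against the QEMU
source:

```c
#include <stdbool.h>

/* Condition before the patch: without FLR, a bus reset is only
 * requested if the device also advertises PM reset. */
bool needs_reset_old(bool reset_works, bool has_flr, bool has_pm_reset)
{
    return !reset_works || (!has_flr && has_pm_reset);
}

/* Condition after the patch: any device without FLR gets needs_reset,
 * regardless of has_pm_reset. */
bool needs_reset_new(bool reset_works, bool has_flr)
{
    return !reset_works || !has_flr;
}
```

The two agree except for a device whose reset is reported as working but
which has neither FLR nor PM reset; only the patched condition sets
needs_reset there. Under the assumption above, that would be the 290X
case (FLReset- in its lspci output).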

> > Linux (fglrx):
> > 
> >   1st boot works fine - boot VGA, fglrx loads fine and X could be
> >       started, issue reboot via SSH and qemu stopped due to
> >       '-no-reboot'.
> > 
> >   2nd boot works partially - boot VGA, fglrx loads fine but X couldn't
> >       be started and fails with:
> > 
> > [   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
> > [   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
> > [   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
> > [   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
> > [   34.344589] <6>[fglrx] IRQ 50 Enabled
> > [   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > [   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
> > [   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
> > [   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
> > [   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
> > [   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
> > [   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
> > [   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
> > [   34.491166] <6>[fglrx] IRQ 51 Enabled
> > [   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > [   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
> > [   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
> > [   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
> > [   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
> > [   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
> > [   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
> > [   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526277] PGD 1b3e067 PUD 0 
> > [   34.526279] Oops: 0002 [#1] PREEMPT SMP 
> > [   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
> > [   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
> > [   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> > [   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
> > [   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
> > [   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
> > [   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
> > [   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
> > [   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
> > [   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
> > [   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
> > [   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
> > [   34.526372] Stack:
> > [   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
> > [   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
> > [   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
> > [   34.526380] Call Trace:
> > [   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > [   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
> > [   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
> > [   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
> > [   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
> > [   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > [   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
> > [   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
> > [   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
> > [   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
> > [   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
> > [   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
> > [   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
> > [   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
> > [   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
> > [   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
> > [   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
> > [   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
> > [   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
> > [   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
> > [   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
> > [   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
> > [   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
> > [   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
> > [   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
> > [   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
> > [   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
> > [   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
> > [   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
> > [   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
> > [   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > [   34.526668]  RSP <ffff880037a29810>
> > [   34.526668] CR2: ffff880c724e8008
> > [   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
> > [   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1
> > 
> > I know this is the binary driver. I can also retry with the radeon driver,
> > but I expect a similar crash. In my first attempt I just rebooted the
> > Linux VM several times without starting X.
> > 
> > I got it working once without hitting 'Clocksource tsc unstable', but
> > now I'm unable to reproduce that. So I believe something more is needed.
> 
> A bus reset is a mixed blessing: it returns the card to a relatively
> known state, but it's a fairly unusual event from a platform perspective
> and we have no idea what kind of quirks the host system BIOS might have
> in place to work around hardware.  If the bus is not fatal, you might try
> running lspci -vvv in the host at various points to see what changed.
> For instance, boot a Linux guest to text mode and see if the card is in
> the same state between first boot and second boot before starting X.
> Thanks,
> 

I have now tried the R9 290X on its own. You're right, the lspci -vvv
output changes between the 1st and 2nd boot, and the changes are reset
if I do "suspend-to-ram" and resume before the 3rd boot of the VM. Below
is the lspci output from the 1st boot and the diffs between the outputs:

--- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
+++ 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
@@ -1,6 +1,6 @@
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
 	Interrupt: pin A routed to IRQ 18
@@ -19,7 +19,7 @@
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
+		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
@@ -39,13 +39,13 @@
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
+		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable-, Smallest Translation Unit: 00
+		ATSCtl:	Enable+, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
+++ 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
@@ -1,9 +1,9 @@
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 18
+	Interrupt: pin A routed to IRQ 47
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,14 +17,14 @@
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
+		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -32,8 +32,8 @@
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
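
For reading the LnkSta change in the hunk above: lspci decodes that line
from the PCIe Link Status register, where bits 3:0 hold the current link
speed and bits 9:4 the negotiated link width. A small C sketch of the
decoding; the function names are invented for this example:

```c
#include <stdint.h>

/* Current Link Speed, Link Status bits 3:0. */
const char *lnksta_speed(uint16_t lnksta)
{
    switch (lnksta & 0xf) {
    case 1:
        return "2.5GT/s";
    case 2:
        return "5GT/s";
    case 3:
        return "8GT/s";
    default:
        return "unknown";
    }
}

/* Negotiated Link Width, Link Status bits 9:4 (number of lanes). */
unsigned int lnksta_width(uint16_t lnksta)
{
    return (lnksta >> 4) & 0x3f;
}
```

Reconstructed from the printed fields rather than read from the card, a
register value of 0x1102 would decode as 5GT/s x16 and 0x1011 as 2.5GT/s
x1 (bit 12 being SlotClk+ in both).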

I then stopped X, powered down the VM and started the 2nd cycle:

--- 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
+++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
@@ -1,9 +1,9 @@
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 47
+	Interrupt: pin A routed to IRQ 18
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,7 +17,7 @@
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
@@ -32,7 +32,7 @@
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
 		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
@@ -45,7 +45,7 @@
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable+, Smallest Translation Unit: 00
+		ATSCtl:	Enable-, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
+++ 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
@@ -1,6 +1,6 @@
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
 	Interrupt: pin A routed to IRQ 18
@@ -19,12 +19,12 @@
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
+		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -33,7 +33,7 @@
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 00000000fee00000  Data: 0000
+		Address: 0000000000000000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
@@ -45,7 +45,7 @@
 	Capabilities: [270 v1] #19
 	Capabilities: [2b0 v1] Address Translation Service (ATS)
 		ATSCap:	Invalidate Queue Depth: 00
-		ATSCtl:	Enable-, Smallest Translation Unit: 00
+		ATSCtl:	Enable+, Smallest Translation Unit: 00
 	Capabilities: [2c0 v1] #13
 	Capabilities: [2d0 v1] #1b
 	Kernel driver in use: vfio-pci

--- 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
+++ 006-lspci.290x.during.2nd.after.X.crash.log	2014-02-07 01:18:16.996855362 +0100
@@ -1,9 +1,9 @@
 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
 	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
-	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
+	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 	Latency: 0, Cache Line Size: 64 bytes
-	Interrupt: pin A routed to IRQ 18
+	Interrupt: pin A routed to IRQ 47
 	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
 	Region 4: I/O ports at be00 [size=256]
@@ -17,9 +17,9 @@
 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
+			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
+		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
 		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
@@ -32,8 +32,8 @@
 			 Compliance De-emphasis: -6dB
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

What is interesting is the diff between the 1st and 2nd boot when I run
lspci before booting the VM. The only differences between the 1st start
and the 2nd start are:

--- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
+++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
@@ -24,7 +24,7 @@
 			ClockPM- Surprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
@@ -33,13 +33,13 @@
 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
-		Address: 0000000000000000  Data: 0000
+		Address: 00000000fee00000  Data: 0000
 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
 	Capabilities: [150 v2] Advanced Error Reporting
 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
+		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
 	Capabilities: [270 v1] #19

After that, if I do the suspend-to-ram / resume trick, I again get the
lspci output from before the 1st boot.
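
For anyone who wants to repeat this comparison without reading the diffs
by eye, here is a rough Python sketch that extracts the +/- flags from two
saved lspci -vvv snapshots and lists the ones that changed. The helper
names (line_label, parse_flags, diff_flags) and the "Cap@a0" labelling are
made up for illustration, and the regular expression is only an assumption
about the flag syntax seen in the logs above, not something lspci
guarantees.

```python
import re

# Flag tokens such as "SERR+", "Non-Fatal-", ">TAbort-" or "64bit+".
# Assumption: a flag is a word followed by +/- and then whitespace, a comma
# or end of line, as in the lspci -vvv output quoted in this thread.
FLAG_RE = re.compile(r"(?<![\w<>/-])([<>]?[\w/]+(?:-[\w/]+)*)([+-])(?=[\s,]|$)")


def line_label(line):
    """Return a label for the register a line starts, or None for continuation lines."""
    cap = re.match(r"\s*Capabilities: \[(\w+)", line)
    if cap:
        return "Cap@" + cap.group(1)  # hypothetical label, e.g. "Cap@a0"
    reg = re.match(r"\s*(\w+):", line)
    return reg.group(1) if reg else None


def parse_flags(text):
    """Map (label, flag name) -> '+' or '-' for every flag found in text."""
    flags = {}
    label = None
    for line in text.splitlines():
        label = line_label(line) or label  # continuation lines keep the old label
        for name, sign in FLAG_RE.findall(line):
            flags[(label, name)] = sign
    return flags


def diff_flags(before, after):
    """Return {(label, flag): (old, new)} for flags whose sign differs."""
    old, new = parse_flags(before), parse_flags(after)
    keys = list(old) + [k for k in new if k not in old]
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Sample lines modelled on the 001 -> 002 diff in this thread.
BEFORE = (
    "\tControl: I/O+ Mem+ BusMaster+ SERR- FastB2B- DisINTx-\n"
    "\t\tDevCtl:\tReport errors: Correctable- Non-Fatal- Fatal- Unsupported-\n"
    "\t\t\tRlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+\n"
    "\tCapabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+\n"
    "\t\tATSCtl:\tEnable-, Smallest Translation Unit: 00\n"
)
AFTER = BEFORE.replace("SERR-", "SERR+").replace("ATSCtl:\tEnable-", "ATSCtl:\tEnable+")

if __name__ == "__main__":
    for (label, name), (old, new) in diff_flags(BEFORE, AFTER).items():
        print(f"{label}: {name}{old} -> {name}{new}")
```

With real data, BEFORE and AFTER would simply be the contents of two of
the saved files, for example 001-lspci.290x.before.1st.log and
004-lspci.290x.before.2nd.log. Non-flag fields such as the LnkSta speed
and width or the MSI address are not covered by this sketch.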

> Alex
> 

--Maik

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-07  0:22             ` Maik Broemme
@ 2014-02-07 18:07               ` Maik Broemme
  2014-02-07 19:10               ` Alex Williamson
  1 sibling, 0 replies; 16+ messages in thread
From: Maik Broemme @ 2014-02-07 18:07 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Maik Broemme <mbroemme@parallels.com> wrote:
> Hi Alex,
> 
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Thu, 2014-02-06 at 01:25 +0100, Maik Broemme wrote:
> > > Hi Alex,
> > > 
> > > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > > > Another minor issue is that the R9 290X is not reset when the VM shuts
> > > > > > > > down (neither Linux nor Windows), but this can be worked around by doing
> > > > > > > > a "suspend-to-ram" between two starts. That's why I use the '-no-reboot'
> > > > > > > > option in QEMU. The 7870 resets properly.
> > > > > > > 
> > > > > > > 
> > > > > > > Is the NoSoftRst "-" on the 290X vs "+" on the 7870 in lspci -vvv by
> > > > > > > chance?  Thanks,
> > > > > > > 
> > > > > > 
> > > > > > Here are both. Funnily, it is the opposite of what you described. :)
> > > > > 
> > > > > 
> > > > > Oops, yes.  Does this help?
> > > > > 
> > > > > --- a/hw/misc/vfio.c
> > > > > +++ b/hw/misc/vfio.c
> > > > > @@ -3136,7 +3136,7 @@ static void vfio_pci_reset_handler(void *opaque)
> > > > >  
> > > > >      QLIST_FOREACH(group, &group_list, next) {
> > > > >          QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > > -            if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
> > > > > +            if (!vdev->reset_works || !vdev->has_flr) {
> > > > >                  vdev->needs_reset = true;
> > > > >              }
> > > > >          }
> > > > > 
> > > > > I can't figure out why I coded it the way that I did.  Probably overly
> > > > > targeting a specific device.  Thanks,
> > > > > 
> > > > 
> > > > This patch works absolutely fine. After applying it to my 'qemu-git', the
> > > > device reset works flawlessly. So it would be great to push it upstream
> > > > as it looks good.
> > > > 
> > > 
> > > Okay, sorry. I was too fast here. It worked only the first time; now,
> > > even after a clean reboot, it no longer works as expected, and the
> > > behavior is very strange.
> > > 
> > > Windows:
> > > 
> > >   1st boot works fine - boot VGA and Windows ATI driver loaded, issue
> > >       reboot and qemu stopped due to '-no-reboot'.
> > > 
> > >   2nd boot works partially - boot VGA and Windows ATI driver loaded but
> > >       black screen, and my system becomes terribly slow and mostly
> > >       unresponsive. Immediately after the ATI driver enables the
> > >       device, my dmesg shows the following:
> > > 
> > > [  159.984324] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
> > > [  159.984340] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
> > > [  160.129036] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x19@0x270
> > > [  160.129049] vfio_ecap_init: 0000:02:00.0 hiding ecap 0x1b@0x2d0
> > > [  172.977677] kvm: zapping shadow pages for mmio generation wraparound
> > > [  173.160174] br0: port 2(tap0) entered forwarding state
> > > [  175.902967] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
> > > [  188.340430] Clocksource tsc unstable (delta = -119654611 ns)
> > > [  188.340511] Switched to clocksource hpet
> > > [  191.088693] hpet1: lost 12 rtc interrupts
> > > [  191.926555] hpet1: lost 25 rtc interrupts
> > > 
> > >   So your patch did indeed fix the reset issue of the boot VGA, but
> > >   something else is wrong now. :)
> > 
> > Can you try the cards separately?  If you run lspci on the device in the
> > host, does it report as normal?  Often when the host gets slow and we
> > get these sorts of clock issues it means the bus is fatal and we get
> > timeouts trying to read from it.
> > 
> 
> Okay with only one card I don't have the clock issues anymore, so we
> should look into this a bit later as working reset seems more important
> for now.
> 
> > > Linux (fglrx):
> > > 
> > >   1st boot works fine - boot VGA, fglrx loads fine and X could be
> > >       started, issue reboot via SSH and qemu stopped due to
> > >       '-no-reboot'.
> > > 
> > >   2nd boot works partially - boot VGA, fglrx loads fine but X couldn't
> > >       be started and fails with:
> > > 
> > > [   34.265111] fglrx_pci 0000:02:00.0: irq 50 for MSI/MSI-X
> > > [   34.344313] <6>[fglrx] Firegl kernel thread PID: 318
> > > [   34.344400] <6>[fglrx] Firegl kernel thread PID: 319
> > > [   34.344478] <6>[fglrx] Firegl kernel thread PID: 320
> > > [   34.344589] <6>[fglrx] IRQ 50 Enabled
> > > [   34.356105] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > > [   34.356107] <6>[fglrx] Reserved FB block: Unshared offset:fac3000, size:3000 
> > > [   34.356109] <6>[fglrx] Reserved FB block: Unshared offset:fac6000, size:23a000 
> > > [   34.356110] <6>[fglrx] Reserved FB block: Unshared offset:7fff4000, size:c000 
> > > [   34.386436] fglrx_pci 0000:01:00.0: irq 51 for MSI/MSI-X
> > > [   34.490902] <6>[fglrx] Firegl kernel thread PID: 321
> > > [   34.490994] <6>[fglrx] Firegl kernel thread PID: 322
> > > [   34.491069] <6>[fglrx] Firegl kernel thread PID: 323
> > > [   34.491166] <6>[fglrx] IRQ 51 Enabled
> > > [   34.505271] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000 
> > > [   34.505273] <6>[fglrx] Reserved FB block: Unshared offset:f9c3000, size:3000 
> > > [   34.505274] <6>[fglrx] Reserved FB block: Unshared offset:f9c6000, size:23a000 
> > > [   34.505276] <6>[fglrx] Reserved FB block: Unshared offset:fc00000, size:100000 
> > > [   34.505277] <6>[fglrx] Reserved FB block: Unshared offset:fff8000, size:8000 
> > > [   34.505278] <6>[fglrx] Reserved FB block: Unshared offset:ffff4000, size:c000 
> > > [   34.526198] BUG: unable to handle kernel paging request at ffff880c724e8008
> > > [   34.526203] IP: [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526277] PGD 1b3e067 PUD 0 
> > > [   34.526279] Oops: 0002 [#1] PREEMPT SMP 
> > > [   34.526282] Modules linked in: mousedev crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec serio_raw psmouse parport_pc snd_hwdep snd_pcm parport snd_page_alloc processor snd_timer snd soundcore i2c_i801 intel_agp lpc_ich pcspkr intel_gtt i2c_core shpchp evdev fglrx(PO) amd_iommu_v2 button ext4 crc16 mbcache jbd2 atkbd libps2 virtio_blk virtio_net ahci libahci libata scsi_mod i8042 floppy serio virtio_pci virtio_ring virtio
> > > [   34.526307] CPU: 1 PID: 316 Comm: X Tainted: P           O 3.13.1-2-ARCH #1
> > > [   34.526309] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> > > [   34.526311] task: ffff8800776e2d00 ti: ffff880037a28000 task.ti: ffff880037a28000
> > > [   34.526312] RIP: 0010:[<ffffffffa0399af6>]  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526353] RSP: 0018:ffff880037a29810  EFLAGS: 00010296
> > > [   34.526354] RAX: 0000000000000001 RBX: ffff8800724e800c RCX: 0000000000000006
> > > [   34.526356] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffff8800724e8264
> > > [   34.526357] RBP: ffff88007b19a00c R08: 00000000000186a0 R09: 000000000001e848
> > > [   34.526358] R10: 00000002fffffffd R11: 00000000ffffffff R12: 0000000000000001
> > > [   34.526359] R13: ffff88007b19a00c R14: 0000000000000000 R15: ffff880037a298b0
> > > [   34.526363] FS:  00007f0ba649b880(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
> > > [   34.526365] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [   34.526366] CR2: ffff880c724e8008 CR3: 0000000037998000 CR4: 00000000000406e0
> > > [   34.526372] Stack:
> > > [   34.526373]  ffff88007b19a2f4 ffff88007bffcd1c 0000000000000001 ffffffffa0322cf0
> > > [   34.526375]  0000000000000000 0000000000000000 0000000000000000 ffff880077ed2c08
> > > [   34.526378]  0000000000000000 ffff880077ed2c08 ffff880037a298a0 ffffffffa0327f14
> > > [   34.526380] Call Trace:
> > > [   34.526435]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > > [   34.526490]  [<ffffffffa0327f14>] ? PECI_NotifyDALPreAdapterClockChange+0x144/0x160 [fglrx]
> > > [   34.526546]  [<ffffffffa031e321>] ? PHM_SetPowerState+0x31/0xc0 [fglrx]
> > > [   34.526597]  [<ffffffffa0340a5b>] ? PSM_ApplyHardwareAttributes_Dynamic+0x9b/0xf0 [fglrx]
> > > [   34.526651]  [<ffffffffa033fde9>] ? PSM_AdjustPowerState_Dynamic+0x169/0x540 [fglrx]
> > > [   34.526668]  [<ffffffffa0322cf0>] ? PHM_DispatchTable+0xf0/0x220 [fglrx]
> > > [   34.526668]  [<ffffffffa0342ee4>] ? PEM_ExcuteEventChain+0x64/0xe0 [fglrx]
> > > [   34.526668]  [<ffffffffa0341302>] ? PEM_HandleEvent+0x92/0xd0 [fglrx]
> > > [   34.526668]  [<ffffffffa03357c0>] ? PEM_CWDDEPM_NotifyEvent+0xe0/0x4d0 [fglrx]
> > > [   34.526668]  [<ffffffffa0333869>] ? PP_Cwdde+0x109/0x180 [fglrx]
> > > [   34.526668]  [<ffffffffa02091dc>] ? firegl_pplib_cwddepm+0xbc/0x130 [fglrx]
> > > [   34.526668]  [<ffffffffa02092d9>] ? firegl_pplib_notify_event+0x89/0xd0 [fglrx]
> > > [   34.526668]  [<ffffffffa020292f>] ? hal_init_gpu+0x2bf/0x480 [fglrx]
> > > [   34.526668]  [<ffffffffa01dcc7b>] ? firegl_open+0x2db/0x310 [fglrx]
> > > [   34.526668]  [<ffffffffa01cb287>] ? ip_firegl_open+0x17/0x20 [fglrx]
> > > [   34.526668]  [<ffffffffa01ccac8>] ? firegl_stub_open+0x98/0x100 [fglrx]
> > > [   34.526668]  [<ffffffff811a82bf>] ? chrdev_open+0x9f/0x1d0
> > > [   34.526668]  [<ffffffff811a1967>] ? do_dentry_open+0x1b7/0x2c0
> > > [   34.526668]  [<ffffffff811aed41>] ? __inode_permission+0x41/0xb0
> > > [   34.526668]  [<ffffffff811a8220>] ? cdev_put+0x30/0x30
> > > [   34.526668]  [<ffffffff811a1d91>] ? finish_open+0x31/0x40
> > > [   34.526668]  [<ffffffff811b1b72>] ? do_last+0x572/0xe90
> > > [   34.526668]  [<ffffffff811af036>] ? link_path_walk+0x236/0x8d0
> > > [   34.526668]  [<ffffffff811b254b>] ? path_openat+0xbb/0x6b0
> > > [   34.526668]  [<ffffffff811b3c6a>] ? do_filp_open+0x3a/0x90
> > > [   34.526668]  [<ffffffff811c0567>] ? __alloc_fd+0xa7/0x130
> > > [   34.526668]  [<ffffffff811a2f49>] ? do_sys_open+0x129/0x220
> > > [   34.526668]  [<ffffffff811a305e>] ? SyS_open+0x1e/0x20
> > > [   34.526668]  [<ffffffff8152136d>] ? system_call_fastpath+0x1a/0x1f
> > > [   34.526668] Code: 8b 4a 1c 8b 93 e0 18 00 00 48 8d bb 58 02 00 00 85 d2 0f 84 63 02 00 00 f6 c2 01 0f 84 20 01 00 00 44 8b 1b 41 ff cb 4f 8d 14 5b <46> 89 44 93 08 8b 95 3c 02 00 00 48 89 d0 48 c1 e8 07 a8 01 75 
> > > [   34.526668] RIP  [<ffffffffa0399af6>] TF_PhwCIslands_PopulateAndUploadSclkMclkDPMLevels+0x96/0x3d0 [fglrx]
> > > [   34.526668]  RSP <ffff880037a29810>
> > > [   34.526668] CR2: ffff880c724e8008
> > > [   34.526668] ---[ end trace 5431e6dcf1c31dea ]---
> > > [   69.317528] type=1006 audit(1391649552.046:4): pid=324 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=3 res=1
> > > 
> > > I know it is the binary driver. I could also retry with the radeon one,
> > > but I believe there would be a similar crash. In my first try I just
> > > rebooted the Linux VM several times without starting X.
> > > 
> > > I got it working one time without getting 'Clocksource tsc unstable', but
> > > now I'm unable to reproduce it. So I believe something more is needed.
> > 
> > Bus resets are a mixed blessing: they return the card to a relatively
> > known state, but they are a fairly unusual event from a platform perspective
> > and we have no idea what kind of quirks the host system BIOS might have
> > in place to work around hardware.  If the bus is not fatal you might try
> > running lspci -vvv in the host at various points to see what changed.
> > For instance, boot a Linux guest to text mode and see if the card is in
> > the same state between first boot and second boot before starting X.
> > Thanks,
> > 
> 
> I tried the R9 290X separately now. You're right, there are some changes
> in the lspci -vvv output between the 1st and 2nd boot, and they are reset
> if I do "suspend-to-ram" and resume before the 3rd boot of the VM. Below are
> the lspci output from the 1st boot and the diffs of the lspci outputs:
> 
> --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> +++ 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
> @@ -1,6 +1,6 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
>  	Interrupt: pin A routed to IRQ 18
> @@ -19,7 +19,7 @@
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>  			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> +		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> @@ -39,13 +39,13 @@
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable-, Smallest Translation Unit: 00
> +		ATSCtl:	Enable+, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 002-lspci.290x.during.1st.before.X.log	2014-02-07 01:14:47.984612423 +0100
> +++ 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 18
> +	Interrupt: pin A routed to IRQ 47
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,14 +17,14 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> +		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -32,8 +32,8 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 
> Now I stopped X, powered down the VM, and started the 2nd cycle:
> 
> --- 003-lspci.290x.during.1st.after.X.log	2014-02-07 01:16:29.644846503 +0100
> +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 47
> +	Interrupt: pin A routed to IRQ 18
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,7 +17,7 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
>  		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> @@ -32,7 +32,7 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
>  		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
> @@ -45,7 +45,7 @@
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable+, Smallest Translation Unit: 00
> +		ATSCtl:	Enable-, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> +++ 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
> @@ -1,6 +1,6 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
>  	Interrupt: pin A routed to IRQ 18
> @@ -19,12 +19,12 @@
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>  			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> +		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -33,7 +33,7 @@
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 00000000fee00000  Data: 0000
> +		Address: 0000000000000000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> @@ -45,7 +45,7 @@
>  	Capabilities: [270 v1] #19
>  	Capabilities: [2b0 v1] Address Translation Service (ATS)
>  		ATSCap:	Invalidate Queue Depth: 00
> -		ATSCtl:	Enable-, Smallest Translation Unit: 00
> +		ATSCtl:	Enable+, Smallest Translation Unit: 00
>  	Capabilities: [2c0 v1] #13
>  	Capabilities: [2d0 v1] #1b
>  	Kernel driver in use: vfio-pci
> 
> --- 005-lspci.290x.during.2nd.before.X.log	2014-02-07 01:17:55.571676376 +0100
> +++ 006-lspci.290x.during.2nd.after.X.crash.log	2014-02-07 01:18:16.996855362 +0100
> @@ -1,9 +1,9 @@
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970] (prog-if 00 [VGA controller])
>  	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b00
> -	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> +	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
>  	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>  	Latency: 0, Cache Line Size: 64 bytes
> -	Interrupt: pin A routed to IRQ 18
> +	Interrupt: pin A routed to IRQ 47
>  	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
>  	Region 2: Memory at df800000 (64-bit, prefetchable) [size=8M]
>  	Region 4: I/O ports at be00 [size=256]
> @@ -17,9 +17,9 @@
>  		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
>  			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>  		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> +			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>  			MaxPayload 128 bytes, MaxReadReq 512 bytes
> -		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> +		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>  		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> @@ -32,8 +32,8 @@
>  			 Compliance De-emphasis: -6dB
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> -	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 
> Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> prior to the booting. The only difference between 1st start and 2nd
> start are:
> 
> --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> @@ -24,7 +24,7 @@
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -33,13 +33,13 @@
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>  	Capabilities: [270 v1] #19
> 
> After that if I do suspend-to-ram / resume trick I have again lspci
> output from before 1st boot.
> 

Another workaround where your patch works fine is to do the following:

  #1 Start VM
  #2 Start X
  #3 Stop X
  #4 rmmod fglrx
  #5 poweroff

After this I can restart the VM as many times as I want with boot VGA,
fglrx and X, but if the VM crashes I still need the suspend-to-ram /
resume workaround. It looks like fglrx properly disables the device
when it is unloaded:

[   36.081197] <6>[fglrx] IRQ 48 Disabled
[   36.096488] <6>[fglrx] module unloaded - fglrx 13.35.5 [Jan 29 2014]
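
To check whether a workaround like this really leaves the device in its
pre-boot state, the before/after lspci comparison used in this thread
could be scripted roughly as below. This is only a sketch: the function
names, the log file naming and the -vvv verbosity are assumptions, and
01:00.0 is simply the GPU address that appears in the logs above.

```shell
#!/bin/sh
# Hypothetical helpers; adjust DEV to the passed-through device.
DEV="${DEV:-01:00.0}"

# Save a verbose lspci dump of the device under a label.
# Run as root on the host, otherwise capabilities are not fully decoded.
snapshot() {
    lspci -vvv -s "$DEV" > "lspci.$1.log"
}

# Print only the LnkSta lines that differ between two saved snapshots.
# The diff headers and LnkSta2 lines do not match the pattern.
link_changes() {
    diff -u "$1" "$2" | grep -E '^[+-][[:space:]]*LnkSta:'
}
```

Typical use would be "snapshot before.1st" ahead of the first VM start,
"snapshot before.2nd" ahead of the second, and then
"link_changes lspci.before.1st.log lspci.before.2nd.log". No output
means the link speed and width did not change between the two.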

Should I retry it with the radeon driver or with VFIO debug enabled?

> > Alex
> > 
> 
> --Maik
> 

--Maik


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-07  0:22             ` Maik Broemme
  2014-02-07 18:07               ` Maik Broemme
@ 2014-02-07 19:10               ` Alex Williamson
  2014-02-07 20:17                 ` Maik Broemme
  1 sibling, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2014-02-07 19:10 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> prior to the booting. The only difference between 1st start and 2nd
> start are:
> 
> --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> @@ -24,7 +24,7 @@
>  			ClockPM- Surprise- LLActRep- BwNot-
>  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
>  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
>  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> @@ -33,13 +33,13 @@
>  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
>  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> -		Address: 0000000000000000  Data: 0000
> +		Address: 00000000fee00000  Data: 0000
>  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
>  	Capabilities: [150 v2] Advanced Error Reporting
>  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>  	Capabilities: [270 v1] #19
> 
> After that if I do suspend-to-ram / resume trick I have again lspci
> output from before 1st boot.

The Link Status change after X is stopped seems the most interesting to
me.  The MSI change is probably explained by the MSI save/restore of the
device, but should be harmless since MSI is disabled.  I'm a bit
surprised the Correctable Error Status in the AER capability didn't get
cleared.  I would have thought that a bus reset would have caused the
link to retrain back to the original speed/width as well.  Let's check
that we're actually getting a bus reset, try this in addition to the
previous qemu patch.  This just enables debug logging for the bus reset
function.  Thanks,

Alex

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 8db182f..7fec259 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
             host1->slot == host2->slot && host1->function == host2->function);
 }
 
+#undef DPRINTF
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
+
 static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
 {
     VFIOGroup *group;
@@ -3104,6 +3108,15 @@ out_single:
     return ret;
 }
 
+#undef DPRINTF
+#ifdef DEBUG_VFIO
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
 /*
  * We want to differentiate hot reset of mulitple in-use devices vs hot reset
  * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-07 19:10               ` Alex Williamson
@ 2014-02-07 20:17                 ` Maik Broemme
  2014-02-14  0:01                   ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-07 20:17 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > prior to the booting. The only difference between 1st start and 2nd
> > start are:
> > 
> > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > @@ -24,7 +24,7 @@
> >  			ClockPM- Surprise- LLActRep- BwNot-
> >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > @@ -33,13 +33,13 @@
> >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > -		Address: 0000000000000000  Data: 0000
> > +		Address: 00000000fee00000  Data: 0000
> >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> >  	Capabilities: [150 v2] Advanced Error Reporting
> >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> >  	Capabilities: [270 v1] #19
> > 
> > After that if I do suspend-to-ram / resume trick I have again lspci
> > output from before 1st boot.
> 
> The Link Status change after X is stopped seems the most interesting to
> me.  The MSI change is probably explained by the MSI save/restore of the
> device, but should be harmless since MSI is disabled.  I'm a bit
> surprised the Correctable Error Status in the AER capability didn't get
> cleared.  I would have thought that a bus reset would have caused the
> link to retrain back to the original speed/width as well.  Let's check
> that we're actually getting a bus reset, try this in addition to the
> previous qemu patch.  This just enables debug logging for the bus reset
> function.  Thanks,
> 

Below is the output from two boots, each with boot VGA, loading fglrx
and starting X. (The second time, X gets killed and an oops occurs.)

- 1st boot:

vfio: vfio_pci_hot_reset(0000:01:00.1) multi
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: 0000:01:00.1 hot reset: Success
vfio: vfio_pci_hot_reset(0000:01:00.1) one
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: vfio: found another in-use device 0000:01:00.0
vfio: vfio_pci_hot_reset(0000:01:00.0) one
vfio: 0000:01:00.0: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: vfio: found another in-use device 0000:01:00.1

- 2nd boot:

vfio: vfio_pci_hot_reset(0000:01:00.1) multi
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: 0000:01:00.1 hot reset: Success
vfio: vfio_pci_hot_reset(0000:01:00.1) one
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: vfio: found another in-use device 0000:01:00.0
vfio: vfio_pci_hot_reset(0000:01:00.0) one
vfio: 0000:01:00.0: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: vfio: found another in-use device 0000:01:00.1

> Alex
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 8db182f..7fec259 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
>              host1->slot == host2->slot && host1->function == host2->function);
>  }
>  
> +#undef DPRINTF
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> +
>  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
>  {
>      VFIOGroup *group;
> @@ -3104,6 +3108,15 @@ out_single:
>      return ret;
>  }
>  
> +#undef DPRINTF
> +#ifdef DEBUG_VFIO
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
>  /*
>   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
>   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> 
> 

--Maik


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-07 20:17                 ` Maik Broemme
@ 2014-02-14  0:01                   ` Maik Broemme
  2014-02-14  0:33                     ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-14  0:01 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Maik Broemme <mbroemme@parallels.com> wrote:
> Hi Alex,
> 
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > > prior to the booting. The only difference between 1st start and 2nd
> > > start are:
> > > 
> > > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > > @@ -24,7 +24,7 @@
> > >  			ClockPM- Surprise- LLActRep- BwNot-
> > >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > > @@ -33,13 +33,13 @@
> > >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > > -		Address: 0000000000000000  Data: 0000
> > > +		Address: 00000000fee00000  Data: 0000
> > >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > >  	Capabilities: [150 v2] Advanced Error Reporting
> > >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > >  	Capabilities: [270 v1] #19
> > > 
> > > After that if I do suspend-to-ram / resume trick I have again lspci
> > > output from before 1st boot.
> > 
> > The Link Status change after X is stopped seems the most interesting to
> > me.  The MSI change is probably explained by the MSI save/restore of the
> > device, but should be harmless since MSI is disabled.  I'm a bit
> > surprised the Correctable Error Status in the AER capability didn't get
> > cleared.  I would have thought that a bus reset would have caused the
> > link to retrain back to the original speed/width as well.  Let's check
> > that we're actually getting a bus reset, try this in addition to the
> > previous qemu patch.  This just enables debug logging for the bus reset
> > function.  Thanks,
> > 
> 
> Below are the outputs from 2 boots, VGA, load fglrx and start X. (2nd
> time X gets killed and oops happened)
> 
> - 1st boot:
> 
> vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> vfio: 0000:01:00.1: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: 	0000:01:00.1 group 1
> vfio: 0000:01:00.1 hot reset: Success
> vfio: vfio_pci_hot_reset(0000:01:00.1) one
> vfio: 0000:01:00.1: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: vfio: found another in-use device 0000:01:00.0
> vfio: vfio_pci_hot_reset(0000:01:00.0) one
> vfio: 0000:01:00.0: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: 	0000:01:00.1 group 1
> vfio: vfio: found another in-use device 0000:01:00.1
> 
> - 2nd boot:
> 
> vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> vfio: 0000:01:00.1: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: 	0000:01:00.1 group 1
> vfio: 0000:01:00.1 hot reset: Success
> vfio: vfio_pci_hot_reset(0000:01:00.1) one
> vfio: 0000:01:00.1: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: vfio: found another in-use device 0000:01:00.0
> vfio: vfio_pci_hot_reset(0000:01:00.0) one
> vfio: 0000:01:00.0: hot reset dependent devices:
> vfio: 	0000:01:00.0 group 1
> vfio: 	0000:01:00.1 group 1
> vfio: vfio: found another in-use device 0000:01:00.1
> 

Have you had a chance to look into it yet, or is there anything else I
can help with?

> > Alex
> > 
> > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > index 8db182f..7fec259 100644
> > --- a/hw/misc/vfio.c
> > +++ b/hw/misc/vfio.c
> > @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
> >              host1->slot == host2->slot && host1->function == host2->function);
> >  }
> >  
> > +#undef DPRINTF
> > +#define DPRINTF(fmt, ...) \
> > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > +
> >  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
> >  {
> >      VFIOGroup *group;
> > @@ -3104,6 +3108,15 @@ out_single:
> >      return ret;
> >  }
> >  
> > +#undef DPRINTF
> > +#ifdef DEBUG_VFIO
> > +#define DPRINTF(fmt, ...) \
> > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > +#else
> > +#define DPRINTF(fmt, ...) \
> > +    do { } while (0)
> > +#endif
> > +
> >  /*
> >   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
> >   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> > 
> > 
> 
> --Maik
> 

--Maik


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-14  0:01                   ` Maik Broemme
@ 2014-02-14  0:33                     ` Alex Williamson
  2014-02-14 14:51                       ` Maik Broemme
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2014-02-14  0:33 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Fri, 2014-02-14 at 01:01 +0100, Maik Broemme wrote:
> Hi Alex,
> 
> Maik Broemme <mbroemme@parallels.com> wrote:
> > Hi Alex,
> > 
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > > > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > > > prior to the booting. The only difference between 1st start and 2nd
> > > > start are:
> > > > 
> > > > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > > > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > > > @@ -24,7 +24,7 @@
> > > >  			ClockPM- Surprise- LLActRep- BwNot-
> > > >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > > >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > > >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > > >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > > > @@ -33,13 +33,13 @@
> > > >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > > >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > > > -		Address: 0000000000000000  Data: 0000
> > > > +		Address: 00000000fee00000  Data: 0000
> > > >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > > >  	Capabilities: [150 v2] Advanced Error Reporting
> > > >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > > >  	Capabilities: [270 v1] #19
> > > > 
> > > > After that if I do suspend-to-ram / resume trick I have again lspci
> > > > output from before 1st boot.
> > > 
> > > The Link Status change after X is stopped seems the most interesting to
> > > me.  The MSI change is probably explained by the MSI save/restore of the
> > > device, but should be harmless since MSI is disabled.  I'm a bit
> > > surprised the Correctable Error Status in the AER capability didn't get
> > > cleared.  I would have thought that a bus reset would have caused the
> > > link to retrain back to the original speed/width as well.  Let's check
> > > that we're actually getting a bus reset, try this in addition to the
> > > previous qemu patch.  This just enables debug logging for the bus reset
> > > function.  Thanks,
> > > 
> > 
> > Below are the outputs from 2 boots, VGA, load fglrx and start X. (2nd
> > time X gets killed and oops happened)
> > 
> > - 1st boot:
> > 
> > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > vfio: 0000:01:00.1: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: 	0000:01:00.1 group 1
> > vfio: 0000:01:00.1 hot reset: Success
> > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > vfio: 0000:01:00.1: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: vfio: found another in-use device 0000:01:00.0
> > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > vfio: 0000:01:00.0: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: 	0000:01:00.1 group 1
> > vfio: vfio: found another in-use device 0000:01:00.1
> > 
> > - 2nd boot:
> > 
> > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > vfio: 0000:01:00.1: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: 	0000:01:00.1 group 1
> > vfio: 0000:01:00.1 hot reset: Success
> > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > vfio: 0000:01:00.1: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: vfio: found another in-use device 0000:01:00.0
> > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > vfio: 0000:01:00.0: hot reset dependent devices:
> > vfio: 	0000:01:00.0 group 1
> > vfio: 	0000:01:00.1 group 1
> > vfio: vfio: found another in-use device 0000:01:00.1
> > 
> 
> Have you had a chance to look into it yet, or is there anything else I
> can help with?

According to the log we're doing the bus reset on both the first and
second boot (it's expected that only the "multi" call gets to success).
I'm surprised then that the link doesn't retrain back to the original
width.  You could try forcing the link to retrain.  Look at the root port
upstream from the GPU; lspci -t is handy for this.  Run lspci on the
root port to get the PCI Express capability offset, then use setpci to
set the link retrain bit.  For example:

# lspci -tv | grep NVIDIA
           +-07.0-[03]--+-00.0  NVIDIA Corporation GK106GL [Quadro K4000]
           |            \-00.1  NVIDIA Corporation GK106 HDMI Audio Controller

(upstream root port is 00:07.0)

# lspci -v -s 7.0 | grep Capabilities
	Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7
	Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
	Capabilities: [90] Express Root Port (Slot+), MSI 00
	Capabilities: [e0] Power Management version 3
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Access Control Services
	Capabilities: [160] Vendor Specific Information: ID=0002 Rev=0 Len=00c <?>

(PCI Express capability is at offset 0x90; Link Control is 0x10 past that, i.e. 0xa0)

# setpci -s 7.0 a0.w
0040

(retrain is bit 5, 0x20; OR'd with the value just read, 0x40, that gives 0x60)

# setpci -s 7.0 a0.w=60

# lspci... did it work?
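
The arithmetic in the steps above can be sketched as two small shell
helpers. The function names are hypothetical; the 0x10 Link Control
offset and the 0x20 retrain bit are the ones given above, and the root
port address and register values are from my example system, so they
will differ on yours.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Link Control register offset = PCI Express capability offset + 0x10.
link_ctl_offset() {
    printf '%x\n' $(( $1 + 0x10 ))
}

# Current Link Control value with the Retrain Link bit (bit 5, 0x20) set.
retrain_value() {
    printf '%x\n' $(( $1 | 0x20 ))
}

# Usage on real hardware (as root; 00:07.0 is only the example root port):
#   off=$(link_ctl_offset 0x90)                  # a0
#   cur=$(setpci -s 00:07.0 "$off.w")            # e.g. 0040
#   setpci -s 00:07.0 "$off.w=$(retrain_value "0x$cur")"
```

Reading the register first and OR-ing in the bit keeps whatever other
Link Control settings are already programmed on the port.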

Try doing that after the first boot to see if you can get back to a x16
link.  If that works, we may need to add something in the kernel to do
it automatically around a bus reset.  Thanks,

Alex

> > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > > index 8db182f..7fec259 100644
> > > --- a/hw/misc/vfio.c
> > > +++ b/hw/misc/vfio.c
> > > @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
> > >              host1->slot == host2->slot && host1->function == host2->function);
> > >  }
> > >  
> > > +#undef DPRINTF
> > > +#define DPRINTF(fmt, ...) \
> > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > +
> > >  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
> > >  {
> > >      VFIOGroup *group;
> > > @@ -3104,6 +3108,15 @@ out_single:
> > >      return ret;
> > >  }
> > >  
> > > +#undef DPRINTF
> > > +#ifdef DEBUG_VFIO
> > > +#define DPRINTF(fmt, ...) \
> > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > +#else
> > > +#define DPRINTF(fmt, ...) \
> > > +    do { } while (0)
> > > +#endif
> > > +
> > >  /*
> > >   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
> > >   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> > > 
> > > 
> > 
> > --Maik
> > 
> 
> --Maik


* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2014-02-14  0:33                     ` Alex Williamson
@ 2014-02-14 14:51                       ` Maik Broemme
       [not found]                         ` <20140414170306.GH724@parallels.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2014-02-14 14:51 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Fri, 2014-02-14 at 01:01 +0100, Maik Broemme wrote:
> > Hi Alex,
> > 
> > Maik Broemme <mbroemme@parallels.com> wrote:
> > > Hi Alex,
> > > 
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > > > > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > > > > prior to the booting. The only difference between 1st start and 2nd
> > > > > start are:
> > > > > 
> > > > > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > > > > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > > > > @@ -24,7 +24,7 @@
> > > > >  			ClockPM- Surprise- LLActRep- BwNot-
> > > > >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > > > >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > > > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > > > >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > > > >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > > > > @@ -33,13 +33,13 @@
> > > > >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > > > >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > > >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > > > > -		Address: 0000000000000000  Data: 0000
> > > > > +		Address: 00000000fee00000  Data: 0000
> > > > >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > > > >  	Capabilities: [150 v2] Advanced Error Reporting
> > > > >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > > > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > > > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > > > >  	Capabilities: [270 v1] #19
> > > > > 
> > > > > After that if I do suspend-to-ram / resume trick I have again lspci
> > > > > output from before 1st boot.
> > > > 
> > > > The Link Status change after X is stopped seems the most interesting to
> > > > me.  The MSI change is probably explained by the MSI save/restore of the
> > > > device, but should be harmless since MSI is disabled.  I'm a bit
> > > > surprised the Correctable Error Status in the AER capability didn't get
> > > > cleared.  I would have thought that a bus reset would have caused the
> > > > link to retrain back to the original speed/width as well.  Let's check
> > > > that we're actually getting a bus reset, try this in addition to the
> > > > previous qemu patch.  This just enables debug logging for the bus reset
> > > > function.  Thanks,
> > > > 
> > > 
> > > Below are the outputs from 2 boots, VGA, load fglrx and start X. (2nd
> > > time X gets killed and oops happened)
> > > 
> > > - 1st boot:
> > > 
> > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: 	0000:01:00.1 group 1
> > > vfio: 0000:01:00.1 hot reset: Success
> > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: vfio: found another in-use device 0000:01:00.0
> > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: 	0000:01:00.1 group 1
> > > vfio: vfio: found another in-use device 0000:01:00.1
> > > 
> > > - 2nd boot:
> > > 
> > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: 	0000:01:00.1 group 1
> > > vfio: 0000:01:00.1 hot reset: Success
> > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: vfio: found another in-use device 0000:01:00.0
> > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > vfio: 	0000:01:00.0 group 1
> > > vfio: 	0000:01:00.1 group 1
> > > vfio: vfio: found another in-use device 0000:01:00.1
> > > 
> > 
> > Did you already have a chance to look into it, or is there anything
> > else I can help with?
> 
> According to the log we're doing the bus reset on both the first and 2nd
> boot (it's expected that only the "multi" call gets to success).  I'm
> surprised then that the link doesn't retrain back to the original width.
> You could try forcing the link to retrain.  Look at the root port
> upstream from the GPU, lspci -t is handy for this.  Run lspci on the
> root port to get the PCI express capability offset, then use setpci to
> set the link retrain bit.  For example:
> 
> # lspci -tv | grep NVIDIA
>            +-07.0-[03]--+-00.0  NVIDIA Corporation GK106GL [Quadro K4000]
>            |            \-00.1  NVIDIA Corporation GK106 HDMI Audio Controller
> 
> (upstream root port is 00:07.0)
> 
> # lspci -v -s 7.0 | grep Capabilities
> 	Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7
> 	Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
> 	Capabilities: [90] Express Root Port (Slot+), MSI 00
> 	Capabilities: [e0] Power Management version 3
> 	Capabilities: [100] Advanced Error Reporting
> 	Capabilities: [150] Access Control Services
> 	Capabilities: [160] Vendor Specific Information: ID=0002 Rev=0 Len=00c <?>
> 
> (PCI express capability is offset 0x90, Link Control is 0x10 off that)
> 
> # setpci -s 7.0 a0.w
> 0040
> 
> (retrain is bit 5, 0x20, OR'd with read value is 0x60)
> 
> # setpci -s 7.0 a0.w=60
> 
> # lspci... did it work?
> 
> Try doing that after the first boot to see if you can get back to a x16
> link.  If that works, we may need to add something in the kernel to do
> it automatically around a bus reset.  Thanks,
> 

Well, this doesn't help either, and it looks like the VFIO reset already
sets the link back to its original width. For example:

           +-02.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
           |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Device aac8

Before 1st run:

root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

After power down of VM:

root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

After 2nd start once VFIO did reset:

root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

The only difference on the bus I see here is ABWMgmt- vs. ABWMgmt+, but
that shouldn't be relevant, as it is the same when I unload the fglrx
module before shutting down the VM, which is the only case where I can
run multiple VM reboot cycles.

So the only difference on the bus is the following:

-60: 10 08 00 00 02 cd 31 00 40 00 02 b1 80 25 14 00
+60: 10 08 00 00 02 cd 31 00 40 00 11 b0 80 25 14 00

6a (before 02, after 11)
6b (before b1, after b0)

But I cannot write these values using setpci. My PCI Express capability
is at offset 0x58, so Link Control is at 0x58 + 0x10 = 0x68, which is
already set back to 0040:

root@homer:~# lspci -vvv -s 00:02.0 | grep Capa
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [b0] Subsystem: Gigabyte Technology Co., Ltd Device 5000
	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [190 v1] Access Control Services
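
A note on the two bytes that differ: with the PCI Express capability at
0x58, offset 0x6a is capability + 0x12, which is the Link Status register.
Its speed and width fields are read-only, which would be consistent with
setpci writes not sticking. The sketch below decodes the two values from
the dump (bytes are little-endian, so "02 b1" reads as b102). The helper
names decode_lnksta and bit_flag are hypothetical, and the output format
only imitates the lspci LnkSta line.

```shell
#!/bin/sh
# Hypothetical helpers: decode a 16-bit PCIe Link Status value.

# Print '+' if bit $2 of value $1 is set, '-' otherwise.
bit_flag() {
    if [ $(( ($1 >> $2) & 1 )) -eq 1 ]; then echo '+'; else echo '-'; fi
}

# $1: register value in hex without the 0x prefix, e.g. b102
decode_lnksta() {
    val=$(( 0x$1 ))
    speed=$(( val & 0xf ))            # bits 3:0, Current Link Speed
    width=$(( (val >> 4) & 0x3f ))    # bits 9:4, Negotiated Link Width
    case $speed in
        1) gts=2.5GT/s ;;
        2) gts=5GT/s ;;
        3) gts=8GT/s ;;
        *) gts=unknown ;;
    esac
    printf 'Speed %s, Width x%d, Train%s SlotClk%s DLActive%s BWMgmt%s ABWMgmt%s\n' \
        "$gts" "$width" \
        "$(bit_flag "$val" 11)" "$(bit_flag "$val" 12)" "$(bit_flag "$val" 13)" \
        "$(bit_flag "$val" 14)" "$(bit_flag "$val" 15)"
}

decode_lnksta b102   # the "-60:" line of the dump
decode_lnksta b011   # the "+60:" line of the dump
```

The two calls decode to a 5GT/s x16 link and a 2.5GT/s x1 link
respectively, so the config space difference is the same link speed/width
change that lspci already reports in LnkSta.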

> Alex
> 
> > > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > > > index 8db182f..7fec259 100644
> > > > --- a/hw/misc/vfio.c
> > > > +++ b/hw/misc/vfio.c
> > > > @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
> > > >              host1->slot == host2->slot && host1->function == host2->function);
> > > >  }
> > > >  
> > > > +#undef DPRINTF
> > > > +#define DPRINTF(fmt, ...) \
> > > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > > +
> > > >  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
> > > >  {
> > > >      VFIOGroup *group;
> > > > @@ -3104,6 +3108,15 @@ out_single:
> > > >      return ret;
> > > >  }
> > > >  
> > > > +#undef DPRINTF
> > > > +#ifdef DEBUG_VFIO
> > > > +#define DPRINTF(fmt, ...) \
> > > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > > +#else
> > > > +#define DPRINTF(fmt, ...) \
> > > > +    do { } while (0)
> > > > +#endif
> > > > +
> > > >  /*
> > > >   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
> > > >   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> > > > 
> > > > 
> > > 
> > > --Maik
> > > 
> > 
> > --Maik
> 
> 
> 

--Maik

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
       [not found]                         ` <20140414170306.GH724@parallels.com>
@ 2015-01-16 12:21                           ` Maik Broemme
  2015-01-19 17:43                             ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Maik Broemme @ 2015-01-16 12:21 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel

Hi Alex,

Maik Broemme <mbroemme@parallels.com> wrote:
> Hi Alex,
> 
> Maik Broemme <mbroemme@parallels.com> wrote:
> > Hi Alex,
> > 
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > On Fri, 2014-02-14 at 01:01 +0100, Maik Broemme wrote:
> > > > Hi Alex,
> > > > 
> > > > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > Hi Alex,
> > > > > 
> > > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > > > On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > > > > > > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > > > > > > prior to the booting. The only difference between 1st start and 2nd
> > > > > > > start are:
> > > > > > > 
> > > > > > > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > > > > > > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > > > > > > @@ -24,7 +24,7 @@
> > > > > > >  			ClockPM- Surprise- LLActRep- BwNot-
> > > > > > >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > > > > > >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > > > > > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > > > > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > > > >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > > > > > >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > > > > > >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > > > > > > @@ -33,13 +33,13 @@
> > > > > > >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > > > > > >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > > > > >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > > > > > > -		Address: 0000000000000000  Data: 0000
> > > > > > > +		Address: 00000000fee00000  Data: 0000
> > > > > > >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > > > > > >  	Capabilities: [150 v2] Advanced Error Reporting
> > > > > > >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > > > >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > > > >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > > > > > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > > > > > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > > > >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > > > >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > > > > > >  	Capabilities: [270 v1] #19
> > > > > > > 
> > > > > > > After that if I do suspend-to-ram / resume trick I have again lspci
> > > > > > > output from before 1st boot.
> > > > > > 
> > > > > > The Link Status change after X is stopped seems the most interesting to
> > > > > > me.  The MSI change is probably explained by the MSI save/restore of the
> > > > > > device, but should be harmless since MSI is disabled.  I'm a bit
> > > > > > surprised the Correctable Error Status in the AER capability didn't get
> > > > > > cleared.  I would have thought that a bus reset would have caused the
> > > > > > link to retrain back to the original speed/width as well.  Let's check
> > > > > > that we're actually getting a bus reset, try this in addition to the
> > > > > > previous qemu patch.  This just enables debug logging for the bus reset
> > > > > > function.  Thanks,
> > > > > > 
> > > > > 
> > > > > Below are the outputs from 2 boots, VGA, load fglrx and start X. (2nd
> > > > > time X gets killed and oops happened)
> > > > > 
> > > > > - 1st boot:
> > > > > 
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: 	0000:01:00.1 group 1
> > > > > vfio: 0000:01:00.1 hot reset: Success
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: vfio: found another in-use device 0000:01:00.0
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: 	0000:01:00.1 group 1
> > > > > vfio: vfio: found another in-use device 0000:01:00.1
> > > > > 
> > > > > - 2nd boot:
> > > > > 
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: 	0000:01:00.1 group 1
> > > > > vfio: 0000:01:00.1 hot reset: Success
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: vfio: found another in-use device 0000:01:00.0
> > > > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > > > vfio: 	0000:01:00.0 group 1
> > > > > vfio: 	0000:01:00.1 group 1
> > > > > vfio: vfio: found another in-use device 0000:01:00.1
> > > > > 
> > > > 
> > > > Did you already have a chance to look into it, or is there anything
> > > > else I can help with?
> > > 
> > > According to the log we're doing the bus reset on both the first and 2nd
> > > boot (it's expected that only the "multi" call gets to success).  I'm
> > > surprised then that the link doesn't retrain back to the original width.
> > > You could try forcing the link to retrain.  Look at the root port
> > > upstream from the GPU, lspci -t is handy for this.  Run lspci on the
> > > root port to get the PCI express capability offset, then use setpci to
> > > set the link retrain bit.  For example:
> > > 
> > > # lspci -tv | grep NVIDIA
> > >            +-07.0-[03]--+-00.0  NVIDIA Corporation GK106GL [Quadro K4000]
> > >            |            \-00.1  NVIDIA Corporation GK106 HDMI Audio Controller
> > > 
> > > (upstream root port is 00:07.0)
> > > 
> > > # lspci -v -s 7.0 | grep Capabilities
> > > 	Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7
> > > 	Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
> > > 	Capabilities: [90] Express Root Port (Slot+), MSI 00
> > > 	Capabilities: [e0] Power Management version 3
> > > 	Capabilities: [100] Advanced Error Reporting
> > > 	Capabilities: [150] Access Control Services
> > > 	Capabilities: [160] Vendor Specific Information: ID=0002 Rev=0 Len=00c <?>
> > > 
> > > (PCI express capability is offset 0x90, Link Control is 0x10 off that)
> > > 
> > > # setpci -s 7.0 a0.w
> > > 0040
> > > 
> > > (retrain is bit 5, 0x20, OR'd with read value is 0x60)
> > > 
> > > # setpci -s 7.0 a0.w=60
> > > 
> > > # lspci... did it work?
> > > 
> > > Try doing that after the first boot to see if you can get back to a x16
> > > link.  If that works, we may need to add something in the kernel to do
> > > it automatically around a bus reset.  Thanks,
> > > 
> > 
> > Well, this doesn't help either, and it looks like the VFIO reset already
> > sets the link back to its original width. For example:
> > 
> >            +-02.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
> >            |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> > 
> > Before 1st run:
> > 
> > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 
> > After power down of VM:
> > 
> > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
> > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 
> > After 2nd start once VFIO did reset:
> > 
> > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
> > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 
> > The only difference on the bus I see here is ABWMgmt- vs. ABWMgmt+, but
> > that shouldn't be relevant, as it is the same when I unload the fglrx
> > module before shutting down the VM, which is the only case where I can
> > run multiple VM reboot cycles.
> > 
> > So the only difference on the bus is the following:
> > 
> > -60: 10 08 00 00 02 cd 31 00 40 00 02 b1 80 25 14 00
> > +60: 10 08 00 00 02 cd 31 00 40 00 11 b0 80 25 14 00
> > 
> > 6a (before 02, after 11)
> > 6b (before b1, after b0)
> > 
> > But I cannot write these values using setpci. My PCI Express capability
> > is at offset 0x58, so Link Control is at 0x58 + 0x10 = 0x68, which is
> > already set back to 0040:
> > 
> > root@homer:~# lspci -vvv -s 00:02.0 | grep Capa
> > 	Capabilities: [50] Power Management version 3
> > 	Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
> > 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
> > 	Capabilities: [b0] Subsystem: Gigabyte Technology Co., Ltd Device 5000
> > 	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
> > 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > 	Capabilities: [190 v1] Access Control Services
> > 
> 
> Wouldn't a possible solution be to do a D0 -> D3 -> D0 transition for
> devices which don't support FLR? The setpci way doesn't help me at all.
> 

I want to revive this thread a bit: some things have changed with the
latest slot/bus reset code, but it still doesn't work in all cases.

#1 QEMU+OVMF (UEFI):

I've flashed my R9 290X with a UEFI-compatible BIOS, and QEMU+OVMF
(without CSM) boots Windows 8.1 fine. The Catalyst 14.12 drivers install
without issues and work fine. However, an attempt to reboot the VM
results in the typical Windows 8.1 "Something went wrong :(" screen. The
suspend/resume trick still works between VM reboots.

#2 QEMU (BIOS):

In this scenario I use secondary GPU passthrough (no VGA as primary
adapter) with Windows 7. The Catalyst 14.12 drivers install without
issues and work fine. I was also surprised that rebooting the VM
worked: Windows 7 restarts fine, I see the login screen, and there are
no performance issues. But it doesn't always work. Sometimes it works
for 3-4 reboots and the next one fails with just a black screen (the
Windows VM is still pingable and ACPI shutdown still works); sometimes
it works for only one reboot. In all cases the suspend/resume trick
still works.

So I would like to narrow down the problem. Is there anything I can try,
Alex, such as collecting QEMU debug logs?

The QEMU version used is 2.2.0; the kernel is 3.18.2.

> > > Alex
> > > 
> > > > > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > > > > > index 8db182f..7fec259 100644
> > > > > > --- a/hw/misc/vfio.c
> > > > > > +++ b/hw/misc/vfio.c
> > > > > > @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
> > > > > >              host1->slot == host2->slot && host1->function == host2->function);
> > > > > >  }
> > > > > >  
> > > > > > +#undef DPRINTF
> > > > > > +#define DPRINTF(fmt, ...) \
> > > > > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > > > > +
> > > > > >  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
> > > > > >  {
> > > > > >      VFIOGroup *group;
> > > > > > @@ -3104,6 +3108,15 @@ out_single:
> > > > > >      return ret;
> > > > > >  }
> > > > > >  
> > > > > > +#undef DPRINTF
> > > > > > +#ifdef DEBUG_VFIO
> > > > > > +#define DPRINTF(fmt, ...) \
> > > > > > +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> > > > > > +#else
> > > > > > +#define DPRINTF(fmt, ...) \
> > > > > > +    do { } while (0)
> > > > > > +#endif
> > > > > > +
> > > > > >  /*
> > > > > >   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
> > > > > >   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> > > > > > 
> > > > > > 
> > > > > 
> > > > > --Maik
> > > > > 
> > > > 
> > > > --Maik
> > > 
> > > 
> > > 
> > 
> > --Maik
> > 
> 
> --Maik
> 

--Maik

* Re: [Qemu-devel] Multi GPU passthrough via VFIO
  2015-01-16 12:21                           ` Maik Broemme
@ 2015-01-19 17:43                             ` Alex Williamson
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Williamson @ 2015-01-19 17:43 UTC (permalink / raw)
  To: Maik Broemme; +Cc: qemu-devel

On Fri, 2015-01-16 at 13:21 +0100, Maik Broemme wrote:
> Hi Alex,
> 
> Maik Broemme <mbroemme@parallels.com> wrote:
> > Hi Alex,
> > 
> > Maik Broemme <mbroemme@parallels.com> wrote:
> > > Hi Alex,
> > > 
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > On Fri, 2014-02-14 at 01:01 +0100, Maik Broemme wrote:
> > > > > Hi Alex,
> > > > > 
> > > > > Maik Broemme <mbroemme@parallels.com> wrote:
> > > > > > Hi Alex,
> > > > > > 
> > > > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > > > > On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > > > > > > > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > > > > > > > prior to the booting. The only difference between 1st start and 2nd
> > > > > > > > start are:
> > > > > > > > 
> > > > > > > > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > > > > > > > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > > > > > > > @@ -24,7 +24,7 @@
> > > > > > > >  			ClockPM- Surprise- LLActRep- BwNot-
> > > > > > > >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> > > > > > > >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > > > > > > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > > > > > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > > > > > >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> > > > > > > >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> > > > > > > >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > > > > > > > @@ -33,13 +33,13 @@
> > > > > > > >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > > > > > > >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > > > > > >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > > > > > > > -		Address: 0000000000000000  Data: 0000
> > > > > > > > +		Address: 00000000fee00000  Data: 0000
> > > > > > > >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > > > > > > >  	Capabilities: [150 v2] Advanced Error Reporting
> > > > > > > >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > > > > >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > > > > >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > > > > > > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > > > > > > > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > > > > >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > > > > > > >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> > > > > > > >  	Capabilities: [270 v1] #19
> > > > > > > > 
> > > > > > > > After that if I do suspend-to-ram / resume trick I have again lspci
> > > > > > > > output from before 1st boot.
> > > > > > > 
> > > > > > > The Link Status change after X is stopped seems the most interesting to
> > > > > > > me.  The MSI change is probably explained by the MSI save/restore of the
> > > > > > > device, but should be harmless since MSI is disabled.  I'm a bit
> > > > > > > surprised the Correctable Error Status in the AER capability didn't get
> > > > > > > cleared.  I would have thought that a bus reset would have caused the
> > > > > > > link to retrain back to the original speed/width as well.  Let's check
> > > > > > > that we're actually getting a bus reset, try this in addition to the
> > > > > > > previous qemu patch.  This just enables debug logging for the bus reset
> > > > > > > function.  Thanks,
> > > > > > > 
> > > > > > 
> > > > > > Below are the outputs from 2 boots, VGA, load fglrx and start X. (2nd
> > > > > > time X gets killed and oops happened)
> > > > > > 
> > > > > > - 1st boot:
> > > > > > 
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: 	0000:01:00.1 group 1
> > > > > > vfio: 0000:01:00.1 hot reset: Success
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: vfio: found another in-use device 0000:01:00.0
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > > > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: 	0000:01:00.1 group 1
> > > > > > vfio: vfio: found another in-use device 0000:01:00.1
> > > > > > 
> > > > > > - 2nd boot:
> > > > > > 
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) multi
> > > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: 	0000:01:00.1 group 1
> > > > > > vfio: 0000:01:00.1 hot reset: Success
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.1) one
> > > > > > vfio: 0000:01:00.1: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: vfio: found another in-use device 0000:01:00.0
> > > > > > vfio: vfio_pci_hot_reset(0000:01:00.0) one
> > > > > > vfio: 0000:01:00.0: hot reset dependent devices:
> > > > > > vfio: 	0000:01:00.0 group 1
> > > > > > vfio: 	0000:01:00.1 group 1
> > > > > > vfio: vfio: found another in-use device 0000:01:00.1
> > > > > > 
> > > > > 
> > > > > Did you already have a chance to look into it, or is there anything
> > > > > else I can help with?
> > > > 
> > > > According to the log we're doing the bus reset on both the first and 2nd
> > > > boot (it's expected that only the "multi" call gets to success).  I'm
> > > > surprised then that the link doesn't retrain back to the original width.
> > > > You could try forcing the link to retrain.  Look at the root port
> > > > upstream from the GPU, lspci -t is handy for this.  Run lspci on the
> > > > root port to get the PCI express capability offset, then use setpci to
> > > > set the link retrain bit.  For example:
> > > > 
> > > > # lspci -tv | grep NVIDIA
> > > >            +-07.0-[03]--+-00.0  NVIDIA Corporation GK106GL [Quadro K4000]
> > > >            |            \-00.1  NVIDIA Corporation GK106 HDMI Audio Controller
> > > > 
> > > > (upstream root port is 00:07.0)
> > > > 
> > > > # lspci -v -s 7.0 | grep Capabilities
> > > > 	Capabilities: [40] Subsystem: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7
> > > > 	Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
> > > > 	Capabilities: [90] Express Root Port (Slot+), MSI 00
> > > > 	Capabilities: [e0] Power Management version 3
> > > > 	Capabilities: [100] Advanced Error Reporting
> > > > 	Capabilities: [150] Access Control Services
> > > > 	Capabilities: [160] Vendor Specific Information: ID=0002 Rev=0 Len=00c <?>
> > > > 
> > > > (PCI express capability is offset 0x90, Link Control is 0x10 off that)
> > > > 
> > > > # setpci -s 7.0 a0.w
> > > > 0040
> > > > 
> > > > (retrain is bit 5, 0x20, OR'd with read value is 0x60)
> > > > 
> > > > # setpci -s 7.0 a0.w=60
> > > > 
> > > > # lspci... did it work?
> > > > 
> > > > Try doing that after the first boot to see if you can get back to a x16
> > > > link.  If that works, we may need to add something in the kernel to do
> > > > it automatically around a bus reset.  Thanks,
> > > > 
> > > 
> > > Well, this doesn't help either, and it looks like the VFIO reset already
> > > sets the link back to its original width. For example:
> > > 
> > >            +-02.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon HD 8970]
> > >            |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Device aac8
> > > 
> > > Before 1st run:
> > > 
> > > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
> > > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > 
> > > After power down of VM:
> > > 
> > > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > > 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
> > > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > > 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > 
> > > After 2nd start once VFIO did reset:
> > > 
> > > root@homer:~# lspci -vvv -s 00:02.0 | grep LnkSta:
> > > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
> > > root@homer:~# lspci -vvv -s 01:00.0 | grep LnkSta:
> > > 		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > > 
> > > The only difference on the bus I see here is ABWMgmt- vs ABWMgmt+, but it
> > > shouldn't be relevant here as it is the same if I unload the fglrx module
> > > before shutting down the VM, which is the only case where I can run
> > > multiple VM reboot cycles.
> > > 
> > > So the only difference on bus is the following:
> > > 
> > > -60: 10 08 00 00 02 cd 31 00 40 00 02 b1 80 25 14 00
> > > +60: 10 08 00 00 02 cd 31 00 40 00 11 b0 80 25 14 00
> > > 
> > > 6a (before 02, after 11)
> > > 6b (before b1, after b0)
> > > 
> > > But I cannot write these parameters using setpci. My PCI express capability
> > > is offset 0x58 + 0x10 for link control which is already set back to 40
> > > 
> > > root@homer:~# lspci -vvv -s 00:02.0 | grep Capa
> > > 	Capabilities: [50] Power Management version 3
> > > 	Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
> > > 	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
> > > 	Capabilities: [b0] Subsystem: Gigabyte Technology Co., Ltd Device 5000
> > > 	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
> > > 	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> > > 	Capabilities: [190 v1] Access Control Services
> > > 
> > 
> > Wouldn't it be a possible solution to do a D0 -> D3 -> D0 transition for
> > devices which don't support FLR? The setpci way doesn't help me at all.
> > 
> 
> I want to renew the thread a bit as with latest slot/bus reset some
> things have changed but it still doesn't work in all cases.
> 
> #1 QEMU+OVMF (UEFI):
> 
> I've flashed my R9 290X with a UEFI-compatible BIOS and QEMU+OVMF
> (without CSM) boots Windows 8.1 fine. Catalyst 14.12 drivers can be
> installed without issues and work fine. However, an attempt to reboot the
> VM results in the typical Windows 8.1 "Something went wrong :(" screen. The
> suspend/resume trick still works between VM reboots.
> 
> #2 QEMU (BIOS):
> 
> In this scenario I use secondary GPU passthrough (no VGA as primary
> adapter) using Windows 7. Catalyst 14.12 drivers can be installed
> without issues and work fine. Also I was surprised that an attempt to
> reboot the VM was also working. Windows 7 restarts fine, I see the login
> screen and no performance issues. But it doesn't always work: sometimes
> it works for 3-4 reboots and next one fails with just a black screen
> (but Windows VM is pingable and ACPI shutdown still works), sometimes it
> works only for one reboot. In all cases the suspend/resume trick still
> works.
> 
> So I would like to narrow down the problem. Is there anything I can try,
> Alex, such as collecting QEMU debug logs?
> 
> Used QEMU version is 2.2.0, kernel is 3.18.2.

There's a small change queued for v3.20 that will exclude PM reset as
an option for AMD GPUs (because it doesn't do anything), but I don't
expect this will change anything for you.  It mostly just enables reset
on release for cards like my HD8570 that report they support PM reset.

Cards like your R9 290X (if I'm remembering correctly) and my R7790
simply don't seem to reset their internal components like they're
supposed to during a bus reset.  I've reached out to AMD developers
regarding this problem; it has theoretically been passed to the
appropriate teams, but I haven't heard of any progress or resolution.

Assignment as a secondary GPU requires driver support and while AMD
seems interested in supporting GPU assignment, I haven't seen any
evidence that they're willing to do anything to make it happen.
Guessing what might be wrong in case #2 is not fun, so it's not a very
interesting case unless AMD wants to make an effort there.  Have you
tried reporting the bug to AMD?  Perhaps you can install a VNC server in
the guest so you can interact and collect data in the failure case.
Case #1 is stuck at the reset problem, and again AMD isn't offering much
help there and I'm out of ideas short of dissecting datasheets for
various root ports to figure out if we can toggle power to the slot.
Thanks,

Alex
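
Note on the quoted config-space diff at 0x60: the two bytes that change
(0x6a and 0x6b) are the Link Status register, which sits at offset 0x12
from the PCI Express capability (0x58 on that root port). The sketch below
decodes those bytes and also reproduces the retrain arithmetic from the
setpci example (Link Control at capability + 0x10, retrain is bit 5). The
field layout follows the PCI Express Base Specification; the function and
dictionary key names are illustrative only and do not come from the thread
or from any tool.

```python
# Sketch only: decode the Link Status bytes from the quoted hexdump and
# compute the Link Control value with the retrain bit set.
# Helper names are hypothetical; register layout is from the PCIe spec.

LNKCTL_OFFSET = 0x10  # Link Control, relative to the PCIe capability
LNKSTA_OFFSET = 0x12  # Link Status, relative to the PCIe capability
SPEEDS = {1: "2.5GT/s", 2: "5GT/s", 3: "8GT/s"}


def decode_lnksta(low_byte: int, high_byte: int) -> dict:
    """Decode two little-endian config-space bytes of Link Status."""
    value = low_byte | (high_byte << 8)
    return {
        "speed": SPEEDS.get(value & 0xF, "unknown"),   # bits 3:0
        "width": (value >> 4) & 0x3F,                  # bits 9:4
        "training": bool(value & (1 << 11)),
        "slot_clk": bool(value & (1 << 12)),
        "dl_active": bool(value & (1 << 13)),
        "bw_mgmt": bool(value & (1 << 14)),
        "auto_bw_mgmt": bool(value & (1 << 15)),
    }


def retrain_value(link_control: int) -> int:
    """Return the Link Control value with Retrain Link (bit 5) set."""
    return link_control | (1 << 5)


if __name__ == "__main__":
    cap = 0x58  # PCIe capability offset reported by lspci for 00:02.0
    print(f"Link Control at {cap + LNKCTL_OFFSET:#x}, "
          f"Link Status at {cap + LNKSTA_OFFSET:#x}")
    print("'-' line:", decode_lnksta(0x02, 0xB1))
    print("'+' line:", decode_lnksta(0x11, 0xB0))
    print(f"retrain write value: {retrain_value(0x0040):04x}")
```

Read this way, the '-' line of the diff is the x16 / 5GT/s state and the
'+' line is the x1 / 2.5GT/s state, which matches the lspci LnkSta output
quoted above. Those Link Status fields are read-only or write-1-to-clear
status bits, so a setpci write cannot set the width there directly.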


end of thread, other threads:[~2015-01-19 17:43 UTC | newest]

Thread overview: 16+ messages
2014-02-05 18:59 [Qemu-devel] Multi GPU passthrough via VFIO Maik Broemme
2014-02-05 20:26 ` Alex Williamson
2014-02-05 21:10   ` Maik Broemme
2014-02-05 21:27     ` Alex Williamson
2014-02-05 23:47       ` Maik Broemme
2014-02-06  0:25         ` Maik Broemme
2014-02-06  3:36           ` Alex Williamson
2014-02-07  0:22             ` Maik Broemme
2014-02-07 18:07               ` Maik Broemme
2014-02-07 19:10               ` Alex Williamson
2014-02-07 20:17                 ` Maik Broemme
2014-02-14  0:01                   ` Maik Broemme
2014-02-14  0:33                     ` Alex Williamson
2014-02-14 14:51                       ` Maik Broemme
     [not found]                         ` <20140414170306.GH724@parallels.com>
2015-01-16 12:21                           ` Maik Broemme
2015-01-19 17:43                             ` Alex Williamson
