* Fwd: Graphics pass-through
       [not found] <AANLkTikNHRcDquYOL3NhsxkkBYcE48nMyu4+t8t=19e7@mail.gmail.com>
@ 2011-01-25 23:03 ` Prasad Joshi
  2011-01-26  5:12   ` Alex Williamson
  0 siblings, 1 reply; 27+ messages in thread
From: Prasad Joshi @ 2011-01-25 23:03 UTC (permalink / raw)
  To: kvm; +Cc: Oswaldo Cadenas, André Weidemann

Hello,

This is to announce that we have been able to pass through an ATI
Radeon RV370 FireGL V3100 to an Ubuntu VM. The card was attached to a
separate monitor; after also passing through the keyboard and mouse,
everything worked as normal.

The changes we made are minimal, mostly disabling the default QEMU VGA.

But there are a few problems that we are still working on:
1. The display on the monitor probably only appears after KMS is
enabled. The grub menu and boot log are not displayed.

2. If we mix in QEMU default devices, such as the emulated network card
or another VGA, the VM does not work properly. We see IO_PAGE_FAULT
events logged in the system messages. I guess this is happening because
of some memory region conflicts.

But if we pass through a network device and a GPU card to the VM, it
works perfectly. So far we have only observed problems when QEMU default
devices were mixed with the passed-through ATI card.

3. Windows does not work at all. No display, lots of IO_PAGE_FAULT events.

4. The card that we used is a somewhat old one. "André Weidemann"
<Andre.Weidemann@web.de> is trying to pass through a relatively new
ATI card. We will have results very soon.

5. Nvidia and Intel IGD cards have not been tested.

Currently I am focusing on solving problems 2 and 1. I am very new to
KVM and QEMU, and I would appreciate it if someone could help me or
point me in the right direction. Once these problems are solved I will
send the patches to this mailing list for comments.

Thanks and Regards,
Prasad


* Re: Graphics pass-through
  2011-01-25 23:03 ` Fwd: Graphics pass-through Prasad Joshi
@ 2011-01-26  5:12   ` Alex Williamson
  2011-01-26  8:17     ` Gerd Hoffmann
  2011-01-27 11:56     ` André Weidemann
  0 siblings, 2 replies; 27+ messages in thread
From: Alex Williamson @ 2011-01-26  5:12 UTC (permalink / raw)
  To: Prasad Joshi; +Cc: kvm, Oswaldo Cadenas, André Weidemann

2011/1/25 Prasad Joshi <prasadjoshi124@gmail.com>:
> Hello,
>
> This is to announce that, we have been able to pass-through a ATI
> Radeon RV370 FireGL V3100 to Ubuntu VM. This card was attached to a
> separate monitor, after passing-through the Keyboard and Mouse
> everything worked as normal.
>
> The changes we made are very less, mostly disabling default QEMU VGA.
>
> But there are few problems that we are still working on
> 1. The display on the monitor, probably only appears after the KMS is
> enabled. It does not display the grub menu and booting log.

Sounds like you're not enabling legacy VGA MMIO and I/O port space to
the physical device.  Without card specific drivers, seabios and grub
have no way to write to the screen.

> 2. If we intermix the QEMU default devices like Network or another
> VGA. The VM does not work properly. We could see IO_PAGE_FAULT events
> being logged in the system messages. I guess this is happening because
> of some memory region conflicts.
>
> But if we pass-through a Network device and a GPU card to VM it works
> perfectly. Till now we only observed problem when QEMU default devices
> were intermixed with the pass-through ATI card.
>
> 3. Windows does not work at all. No display, lots of IO_PAGE_FAULT events.
>
> 4. The card that we used is somewhat old one. "André Weidemann"
> <Andre.Weidemann@web.de> is trying to pass-through a relatively new
> ATI card. We will have results very soon.
>
> 5. Nvidia or Intel IGD cards have not been tested.
>
> Currently I am focusing on solving problem 2 and 1. I am very new to
> KVM and QEMU, I will appreciate if someone can help me or point me to
> a correct direction. Once these problems are solved I will send the
> patches for comments on this mailing list.

As we've discussed on irc, vga routing and arbitration is a hard
problem, especially if you want to support multiple graphics cards,
potentially assigning some to guests and using others in the host.
I'm not sure why you're having problems using an emulated nic
alongside an assigned gfx card, but the lack of bios, grub, or windows
support seems to indicate you're not properly handling the legacy
address space.

At a minimum, you need to do a cpu_register_physical_memory() for the
VGA range starting at 0xa0000 and register ioport handlers
(register_ioport_read/write) for the range starting at 0x3b0.  These
need to be backed by a /dev/mem mapping for the mmio and in*/out* for
the ioports.  You'll also find there are some chipset routines that
like to steal back ownership of these ranges, see for instance
i440fx_update_memory_mappings().
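
The ranges named above can be pinned down in a few lines of C. This is
only a sketch: the sizes (0x20000 for the MMIO window, 0x30 ports from
0x3b0) are taken from the vfio-vga patch later in this thread, the helper
names are made up for illustration, and a plain byte buffer stands in for
the /dev/mem mapping so the code runs without privileges. It assumes a
little-endian (x86) host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Legacy VGA ranges as used in this thread. */
#define VGA_MMIO_BASE 0xa0000u  /* start of the legacy VGA memory window */
#define VGA_MMIO_SIZE 0x20000u  /* 128 KiB, i.e. 0xa0000..0xbffff */
#define VGA_PIO_BASE  0x3b0u    /* first legacy VGA I/O port */
#define VGA_PIO_SIZE  0x30u     /* ports 0x3b0..0x3df */

/* Return 1 if [addr, addr + len) lies fully inside [base, base + size). */
static int range_contains(uint32_t base, uint32_t size,
                          uint32_t addr, uint32_t len)
{
    return len != 0 && len <= size && addr >= base &&
           addr - base <= size - len;
}

static int vga_mmio_contains(uint32_t addr, uint32_t len)
{
    return range_contains(VGA_MMIO_BASE, VGA_MMIO_SIZE, addr, len);
}

static int vga_pio_contains(uint32_t port, uint32_t len)
{
    return range_contains(VGA_PIO_BASE, VGA_PIO_SIZE, port, len);
}

/* Hypothetical helper: store the low 'len' bytes of 'val' at offset 'off'
 * of a mapped window.  Returns 0 on success, -1 on a bad width or an
 * access that would leave the window. */
static int vga_window_write(uint8_t *window, uint32_t off,
                            uint32_t val, int len)
{
    if ((len != 1 && len != 2 && len != 4) ||
        !range_contains(0, VGA_MMIO_SIZE, off, (uint32_t)len))
        return -1;
    memcpy(window + off, &val, (size_t)len); /* little-endian host assumed */
    return 0;
}

/* Hypothetical helper: load 'len' bytes from offset 'off'.  Returns
 * 0xffffffff for a rejected access, like an unclaimed bus read. */
static uint32_t vga_window_read(const uint8_t *window, uint32_t off, int len)
{
    uint32_t val = 0;

    if ((len != 1 && len != 2 && len != 4) ||
        !range_contains(0, VGA_MMIO_SIZE, off, (uint32_t)len))
        return 0xffffffffu;
    memcpy(&val, window + off, (size_t)len);
    return val;
}
```

In a real setup the window pointer would come from mmap() of /dev/mem at
offset 0xa0000, and the port range would be reached through in*/out*
after ioperm(); both need root and are left out here.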

That may be enough to get things working a little better, but it's a
huge kludge.  It would perhaps be nice if x86 supported
HAVE_PCI_LEGACY in pci-sysfs so we could maybe let the kernel do the
heavy lifting of reconfiguring chipset vga routing.  This would avoid
qemu needing to open /dev/mem and do raw ioport accesses as well.  We
also need proper vga arbitration so we don't stomp on other qemu
instances or keep host userspace from accessing its cards.  The
vga_arbiter exists, but it doesn't seem to be widely used.  I wonder
if we might be better off letting the legacy pci-sysfs interface do
the arbitration since X already knows how to use that on some
architectures... numerous problems that need to be tackled here.
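
For reference, the vga_arbiter is driven by writing short text commands
to /dev/vga_arbiter. A minimal sketch of building the "target" command,
using the same string format as the vfio-vga patch later in this thread;
the function name is hypothetical and nothing here touches the device
node:

```c
#include <stddef.h>
#include <stdio.h>

/* Format a vga_arbiter "target" command selecting one PCI device.
 * Returns the command length, or -1 if 'buf' is too small. */
static int vga_arb_format_target(char *buf, size_t len, unsigned seg,
                                 unsigned bus, unsigned dev, unsigned func)
{
    int n = snprintf(buf, len, "target PCI:%04x:%02x:%02x.%x",
                     seg, bus, dev, func);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting string would be passed to a single write() on the arbiter
file descriptor, followed by further commands such as "decodes io+mem".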

So while your initial results are promising, my guess is that you're
using card specific drivers and still need to consider some of the
harder problems with generic support for vga assignment.  I hacked on
this for a bit trying to see if I could get vga assignment working
with the vfio driver.  Setting up the legacy access and preventing
qemu from stealing it back should get you basic vga modes and might
even allow the option rom to run to initialize the card for pre-boot.
I was able to get this far on a similar ATI card.  I never had much
luck with other cards though, and I was never able to get the vesa
extensions working.  Thanks,

Alex


* Re: Graphics pass-through
  2011-01-26  5:12   ` Alex Williamson
@ 2011-01-26  8:17     ` Gerd Hoffmann
  2011-01-27 11:56     ` André Weidemann
  1 sibling, 0 replies; 27+ messages in thread
From: Gerd Hoffmann @ 2011-01-26  8:17 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Prasad Joshi, kvm, Oswaldo Cadenas, André Weidemann

   Hi,

>> The changes we made are very less, mostly disabling default QEMU VGA.
>>
>> But there are few problems that we are still working on
>> 1. The display on the monitor, probably only appears after the KMS is
>> enabled. It does not display the grub menu and booting log.
>
> Sounds like you're not enabling legacy VGA MMIO and I/O port space to
> the physical device.  Without card specific drivers, seabios and grub
> have no way to write to the screen.

There is an initialization order issue with the legacy vga memory @ 
0xa0000 and the piix3 chipset.  Booting a guest with '-vga none -device 
VGA' has the same effect (no boot messages).

cheers,
   Gerd


* Re: Graphics pass-through
  2011-01-26  5:12   ` Alex Williamson
  2011-01-26  8:17     ` Gerd Hoffmann
@ 2011-01-27 11:56     ` André Weidemann
  2011-01-28  0:45       ` Alex Williamson
  1 sibling, 1 reply; 27+ messages in thread
From: André Weidemann @ 2011-01-27 11:56 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Prasad Joshi, kvm, Oswaldo Cadenas

Hi Alex,

On 26.01.2011 06:12, Alex Williamson wrote:

> So while your initial results are promising, my guess is that you're
> using card specific drivers and still need to consider some of the
> harder problems with generic support for vga assignment.  I hacked on
> this for a bit trying to see if I could get vga assignment working
> with the vfio driver.  Setting up the legacy access and preventing
> qemu from stealing it back should get you basic vga modes and might
> even allow the option rom to run to initialize the card for pre-boot.
> I was able to get this far on a similar ATI card.  I never had much
> luck with other cards though, and I was never able to get the vesa
> extensions working.  Thanks,

Do you mind sharing these patches?

Thank you very much
  André


* Re: Graphics pass-through
  2011-01-27 11:56     ` André Weidemann
@ 2011-01-28  0:45       ` Alex Williamson
  2011-01-28 17:29         ` André Weidemann
  2011-05-05  8:50         ` Jan Kiszka
  0 siblings, 2 replies; 27+ messages in thread
From: Alex Williamson @ 2011-01-28  0:45 UTC (permalink / raw)
  To: André Weidemann; +Cc: Prasad Joshi, kvm, Oswaldo Cadenas

[-- Attachment #1: Type: text/plain, Size: 862 bytes --]

On Thu, 2011-01-27 at 12:56 +0100, André Weidemann wrote:
> Hi Alex,
> 
> On 26.01.2011 06:12, Alex Williamson wrote:
> 
> > So while your initial results are promising, my guess is that you're
> > using card specific drivers and still need to consider some of the
> > harder problems with generic support for vga assignment.  I hacked on
> > this for a bit trying to see if I could get vga assignment working
> > with the vfio driver.  Setting up the legacy access and preventing
> > qemu from stealing it back should get you basic vga modes and might
> > even allow the option rom to run to initialize the card for pre-boot.
> > I was able to get this far on a similar ATI card.  I never had much
> > luck with other cards though, and I was never able to get the vesa
> > extensions working.  Thanks,
> 
> Do you mind sharing these patches?

Attached.

Alex

[-- Attachment #2: vfio-vga.patch --]
[-- Type: text/x-patch, Size: 12185 bytes --]

commit 0313d97cf24177023cdb6f2e4c54d077c5a775c1
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Wed Sep 29 13:50:39 2010 -0600

vfio: VGA passthrough support(ish)

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

diff --git a/Makefile.target b/Makefile.target
index c507dd2..cb0cea6 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -203,6 +203,7 @@ obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
 obj-i386-y += vfio.o
+obj-$(CONFIG_VFIO_VGA) += vfio-vga.o
 
 # shared objects
 obj-ppc-y = ppc.o
diff --git a/configure b/configure
index 3bfc5e9..b15e68f 100755
--- a/configure
+++ b/configure
@@ -322,6 +322,7 @@ user_pie="no"
 zero_malloc=""
 trace_backend="nop"
 trace_file="trace"
+vfio_vga="no"
 
 # OS specific
 if check_define __linux__ ; then
@@ -718,6 +719,8 @@ for opt do
   ;;
   --enable-vhost-net) vhost_net="yes"
   ;;
+  --enable-vfio-vga) vfio_vga="yes"
+  ;;
   --*dir)
   ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
@@ -907,6 +910,7 @@ echo "  --disable-docs           disable documentation build"
 echo "  --disable-vhost-net      disable vhost-net acceleration support"
 echo "  --enable-vhost-net       enable vhost-net acceleration support"
 echo "  --trace-backend=B        Trace backend nop simple ust"
+echo "  --enable-vfio-vga        enable vfio VGA passthrough support"
 echo "  --trace-file=NAME        Full PATH,NAME of file to store traces"
 echo "                           Default:trace-<pid>"
 echo ""
@@ -2240,6 +2244,7 @@ echo "preadv support    $preadv"
 echo "fdatasync         $fdatasync"
 echo "uuid support      $uuid"
 echo "vhost-net support $vhost_net"
+echo "vfio-vga support  $vfio_vga"
 echo "Trace backend     $trace_backend"
 echo "Trace output file $trace_file-<pid>"
 
@@ -2762,6 +2767,9 @@ case "$target_arch2" in
     if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
       echo "CONFIG_XEN=y" >> $config_target_mak
     fi
+    if test $vfio_vga = "yes" ; then
+      echo "CONFIG_VFIO_VGA=y" >> $config_host_mak
+    fi
 esac
 case "$target_arch2" in
   i386|x86_64|ppcemb|ppc|ppc64|s390x)
diff --git a/hw/vfio-vga.c b/hw/vfio-vga.c
new file mode 100644
index 0000000..5c1899c
--- /dev/null
+++ b/hw/vfio-vga.c
@@ -0,0 +1,291 @@
+/*
+ * vfio VGA device assignment support
+ *
+ * Copyright Red Hat, Inc. 2010
+ *
+ * Authors:
+ *  Alex Williamson <alex.williamson@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ *  Adapted for KVM by Qumranet.
+ *  Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
+ *  Copyright (C) 2008, Red Hat, Amit Shah (amit.shah@redhat.com)
+ *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (muli@il.ibm.com)
+ */
+
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/io.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include "event_notifier.h"
+#include "hw.h"
+#include "memory.h"
+#include "monitor.h"
+#include "pc.h"
+#include "qemu-error.h"
+#include "sysemu.h"
+#include "vfio.h"
+#include <pci/header.h>
+#include <pci/types.h>
+#include <linux/types.h>
+#include "linux-vfio.h"
+
+//#define DEBUG_VFIO_VGA
+#ifdef DEBUG_VFIO_VGA
+#define DPRINTF(fmt, ...) \
+    do { printf("vfio-vga: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
+/*
+ * VGA setup
+ */
+static void vfio_vga_write(VFIODevice *vdev, uint32_t addr,
+                           uint32_t val, int len)
+{
+    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, 0xa0000 + addr, len, val);
+    switch (len) {
+        case 1:
+            *(uint8_t *)(vdev->vga_mmio + addr) = (uint8_t)val;
+            break;
+        case 2:
+            *(uint16_t *)(vdev->vga_mmio + addr) = (uint16_t)val;
+            break;
+        case 4:
+            *(uint32_t *)(vdev->vga_mmio + addr) = val;
+            break;
+    }
+}
+
+static void vfio_vga_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    vfio_vga_write(opaque, addr, val, 1);
+}
+
+static void vfio_vga_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    vfio_vga_write(opaque, addr, val, 2);
+}
+
+static void vfio_vga_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+    vfio_vga_write(opaque, addr, val, 4);
+}
+
+static CPUWriteMemoryFunc * const vfio_vga_writes[] = {
+    &vfio_vga_writeb,
+    &vfio_vga_writew,
+    &vfio_vga_writel
+};
+
+static uint32_t vfio_vga_read(VFIODevice *vdev, uint32_t addr, int len)
+{
+    uint32_t val = 0xffffffff;
+    switch (len) {
+        case 1:
+            val = (uint32_t)*(uint8_t *)(vdev->vga_mmio + addr);
+            break;
+        case 2:
+            val = (uint32_t)*(uint16_t *)(vdev->vga_mmio + addr);
+            break;
+        case 4:
+            val = *(uint32_t *)(vdev->vga_mmio + addr);
+            break;
+    }
+    DPRINTF("%s 0x%x %d = 0x%x\n", __func__, 0xa0000 + addr, len, val);
+    return val;
+}
+
+static uint32_t vfio_vga_readb(void *opaque, target_phys_addr_t addr)
+{
+    return vfio_vga_read(opaque, addr, 1);
+}
+
+static uint32_t vfio_vga_readw(void *opaque, target_phys_addr_t addr)
+{
+    return vfio_vga_read(opaque, addr, 2);
+}
+
+static uint32_t vfio_vga_readl(void *opaque, target_phys_addr_t addr)
+{
+    return vfio_vga_read(opaque, addr, 4);
+}
+
+static CPUReadMemoryFunc * const vfio_vga_reads[] = {
+    &vfio_vga_readb,
+    &vfio_vga_readw,
+    &vfio_vga_readl
+};
+
+static void vfio_vga_out(VFIODevice *vdev, uint32_t addr, uint32_t val, int len)
+{
+    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, addr, len, val);
+    ioperm(0x3b0, 0x30, 1); /* XXX fix me */
+    switch (len) {
+        case 1:
+            outb(val, addr);
+            break;
+        case 2:
+            outw(val, addr);
+            break;
+        case 4:
+            outl(val, addr);
+            break;
+    }
+}
+
+static void vfio_vga_outb(void *opaque, uint32_t addr, uint32_t val)
+{
+    vfio_vga_out(opaque, addr, val, 1);
+}
+
+static void vfio_vga_outw(void *opaque, uint32_t addr, uint32_t val)
+{
+    vfio_vga_out(opaque, addr, val, 2);
+}
+
+static void vfio_vga_outl(void *opaque, uint32_t addr, uint32_t val)
+{
+    vfio_vga_out(opaque, addr, val, 4);
+}
+
+static uint32_t vfio_vga_in(VFIODevice *vdev, uint32_t addr, int len)
+{
+    uint32_t val = 0xffffffff;
+    ioperm(0x3b0, 0x30, 1); /* XXX fix me */
+    switch (len) {
+        case 1:
+            val = inb(addr);
+            break;
+        case 2:
+            val = inw(addr);
+            break;
+        case 4:
+            val = inl(addr);
+            break;
+    }
+    DPRINTF("%s 0x%x, %d = 0x%x\n", __func__, addr, len, val);
+    return val;
+}
+
+static uint32_t vfio_vga_inb(void *opaque, uint32_t addr)
+{
+    return vfio_vga_in(opaque, addr, 1);
+}
+
+static uint32_t vfio_vga_inw(void *opaque, uint32_t addr)
+{
+    return vfio_vga_in(opaque, addr, 2);
+}
+
+static uint32_t vfio_vga_inl(void *opaque, uint32_t addr)
+{
+    return vfio_vga_in(opaque, addr, 4);
+}
+
+int vfio_vga_setup(VFIODevice *vdev)
+{
+    char buf[256];
+    int ret;
+
+    if (vga_interface_type != VGA_NONE) {
+        fprintf(stderr,
+                "VGA device assigned without -vga none param, no ISA VGA\n");
+        return -1;
+    }
+
+    vdev->vga_fd = open("/dev/vga_arbiter", O_RDWR);
+    if (vdev->vga_fd < 0) {
+        fprintf(stderr, "%s - Failed to open vga arbiter (%s)\n",
+                __func__, strerror(errno));
+        return -1;
+    }
+    ret = read(vdev->vga_fd, buf, sizeof(buf));
+    if (ret <= 0) {
+        fprintf(stderr, "%s - Failed to read from vga arbiter (%s)\n",
+                __func__, strerror(errno));
+        close(vdev->vga_fd);
+        return -1;
+    }
+    buf[ret - 1] = 0;
+    vdev->vga_orig = qemu_strdup(buf);
+
+    snprintf(buf, sizeof(buf), "target PCI:%04x:%02x:%02x.%x",
+             vdev->host.seg, vdev->host.bus, vdev->host.dev, vdev->host.func);
+    ret = write(vdev->vga_fd, buf, strlen(buf));
+    if (ret != strlen(buf)) {
+        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
+                __func__, strerror(errno));
+        close(vdev->vga_fd);
+        return -1;
+    }
+    snprintf(buf, sizeof(buf), "decodes io+mem");
+    ret = write(vdev->vga_fd, buf, strlen(buf));
+    if (ret != strlen(buf)) {
+        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
+                __func__, strerror(errno));
+        close(vdev->vga_fd);
+        return -1;
+    }
+
+    vdev->vga_mmio_fd = open("/dev/mem", O_RDWR);
+    if (vdev->vga_mmio_fd < 0) {
+        fprintf(stderr, "%s - Failed to open /dev/mem (%s)\n",
+                __func__, strerror(errno));
+        return -1;
+    }
+    vdev->vga_mmio = mmap(NULL, 0x40000, PROT_READ | PROT_WRITE,
+                          MAP_SHARED, vdev->vga_mmio_fd, 0xa0000);
+    if (vdev->vga_mmio == MAP_FAILED) {
+        fprintf(stderr, "%s - mmap failed (%s)\n", __func__, strerror(errno));
+        return -1;
+    }
+
+#if 1
+    vdev->vga_io = cpu_register_io_memory(vfio_vga_reads,
+                                          vfio_vga_writes, vdev);
+    cpu_register_physical_memory(0xa0000, 0x20000, vdev->vga_io);
+    qemu_register_coalesced_mmio(0xa0000, 0x20000);
+#else
+    cpu_register_physical_memory(0xa0000, 0x20000, 
+        qemu_ram_map(&vdev->pdev.qdev, "VGA", 0x20000, vdev->vga_mmio));
+    qemu_register_coalesced_mmio(0xa0000, 0x20000);
+#endif
+
+    register_ioport_write(0x3b0, 0x30, 1, vfio_vga_outb, vdev);
+    register_ioport_write(0x3b0, 0x30, 2, vfio_vga_outw, vdev);
+    register_ioport_write(0x3b0, 0x30, 4, vfio_vga_outl, vdev);
+    register_ioport_read(0x3b0, 0x30, 1, vfio_vga_inb, vdev);
+    register_ioport_read(0x3b0, 0x30, 2, vfio_vga_inw, vdev);
+    register_ioport_read(0x3b0, 0x30, 4, vfio_vga_inl, vdev);
+    if (ioperm(0x3b0, 0x30, 1)) {
+        fprintf(stderr, "%s - ioperm failed (%s)\n", __func__, strerror(errno));
+        return -1;
+    }
+    return 0;
+}
+
+void vfio_vga_exit(VFIODevice *vdev)
+{
+    if (!vdev->vga_io)
+        return;
+
+    isa_unassign_ioport(0x3b0, 0x30);
+    qemu_unregister_coalesced_mmio(0xa0000, 0x20000);
+    cpu_register_physical_memory(0xa0000, 0x20000, IO_MEM_UNASSIGNED);
+    cpu_unregister_io_memory(vdev->vga_io);
+    munmap(vdev->vga_mmio, 0x40000);
+    close(vdev->vga_mmio_fd);
+    qemu_free(vdev->vga_orig);
+    close(vdev->vga_fd);
+}
+
diff --git a/hw/vfio.c b/hw/vfio.c
index e2da724..f7c7a42 100644
--- a/hw/vfio.c
+++ b/hw/vfio.c
@@ -1268,8 +1268,22 @@ static int vfio_initfn(struct PCIDevice *pdev)
     if (vfio_enable_intx(vdev))
         goto out_unmap_iommu;
 
+#ifdef CONFIG_VFIO_VGA
+    {
+        uint16_t class;
+
+        class = vfio_pci_read_config(&vdev->pdev, PCI_CLASS_DEVICE, 2);
+        if (class == PCI_CLASS_DISPLAY_VGA && vfio_vga_setup(vdev))
+            goto out_vga_fail;
+    }
+#endif
+
     return 0;
 
+#ifdef CONFIG_VFIO_VGA
+out_vga_fail:
+    vfio_disable_intx(vdev);
+#endif
 out_unmap_iommu:
     vfio_unmap_iommu(vdev);
 out_unmap_resources:
@@ -1290,6 +1304,9 @@ static int vfio_exitfn(struct PCIDevice *pdev)
 {
     VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
     
+#ifdef CONFIG_VFIO_VGA
+    vfio_vga_exit(vdev);
+#endif
     vfio_disable_intx(vdev);
     vfio_disable_msi(vdev);
     vfio_disable_msix(vdev);
diff --git a/hw/vfio.h b/hw/vfio.h
index b5a0525..c7490b3 100644
--- a/hw/vfio.h
+++ b/hw/vfio.h
@@ -83,8 +83,20 @@ typedef struct VFIODevice {
     MSIX msix;
     int vfiofd;
     int uiommufd;
+#ifdef CONFIG_VFIO_VGA
+    int vga_io;
+    int vga_fd;
+    int vga_mmio_fd;
+    uint8_t *vga_mmio;
+    char *vga_orig;
+#endif
     char *vfiofd_name;
     char *uiommufd_name;
 } VFIODevice;
 
+#ifdef CONFIG_VFIO_VGA
+int vfio_vga_setup(VFIODevice *vdev);
+void vfio_vga_exit(VFIODevice *vdev);
+#endif
+
 #endif /* __VFIO_H__ */


* Re: Graphics pass-through
  2011-01-28 17:29         ` André Weidemann
@ 2011-01-28 16:25           ` Alex Williamson
  0 siblings, 0 replies; 27+ messages in thread
From: Alex Williamson @ 2011-01-28 16:25 UTC (permalink / raw)
  To: André Weidemann; +Cc: Prasad Joshi, kvm, Oswaldo Cadenas

On Fri, 2011-01-28 at 18:29 +0100, André Weidemann wrote:
> Hi Alex,
> 
> On 28.01.2011 01:45, Alex Williamson wrote:
> 
> >> Do you mind sharing these patches?
> >
> > Attached.
> 
> Thank you for attaching the patch. Unfortunately it does not apply to 
> current clone of the qemu-kvm git repository. The file hw/vfio.c does 
> not exist in the public repository, but your patch contains lines for 
> hw/vfio.c.

Yes, vfio isn't upstream yet.  The patch is for reference, not for
applying.

Alex



* Re: Graphics pass-through
  2011-01-28  0:45       ` Alex Williamson
@ 2011-01-28 17:29         ` André Weidemann
  2011-01-28 16:25           ` Alex Williamson
  2011-05-05  8:50         ` Jan Kiszka
  1 sibling, 1 reply; 27+ messages in thread
From: André Weidemann @ 2011-01-28 17:29 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Prasad Joshi, kvm, Oswaldo Cadenas

Hi Alex,

On 28.01.2011 01:45, Alex Williamson wrote:

>> Do you mind sharing these patches?
>
> Attached.

Thank you for attaching the patch. Unfortunately it does not apply to 
a current clone of the qemu-kvm git repository. The file hw/vfio.c does 
not exist in the public repository, but your patch contains lines for 
hw/vfio.c.

Regards
  André


* Re: Graphics pass-through
  2011-01-28  0:45       ` Alex Williamson
  2011-01-28 17:29         ` André Weidemann
@ 2011-05-05  8:50         ` Jan Kiszka
  2011-05-05 15:17           ` Alex Williamson
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Kiszka @ 2011-05-05  8:50 UTC (permalink / raw)
  To: Alex Williamson; +Cc: André Weidemann, Prasad Joshi, kvm, Oswaldo Cadenas

Hi Alex,

On 2011-01-28 01:45, Alex Williamson wrote:
> On Thu, 2011-01-27 at 12:56 +0100, André Weidemann wrote:
>> Hi Alex,
>>
>> On 26.01.2011 06:12, Alex Williamson wrote:
>>
>>> So while your initial results are promising, my guess is that you're
>>> using card specific drivers and still need to consider some of the
>>> harder problems with generic support for vga assignment.  I hacked on
>>> this for a bit trying to see if I could get vga assignment working
>>> with the vfio driver.  Setting up the legacy access and preventing
>>> qemu from stealing it back should get you basic vga modes and might
>>> even allow the option rom to run to initialize the card for pre-boot.
>>> I was able to get this far on a similar ATI card.  I never had much
>>> luck with other cards though, and I was never able to get the vesa
>>> extensions working.  Thanks,
>>
>> Do you mind sharing these patches?
> 
> Attached.
> 

We are about to try some pass-through with an NVIDIA card. So I already
hacked on your vfio patch to make it build against the current device
assignment code. Some questions arose while studying the code:

...

> --- /dev/null
> +++ b/hw/vfio-vga.c
> @@ -0,0 +1,291 @@
> +/*
> + * vfio VGA device assignment support
> + *
> + * Copyright Red Hat, Inc. 2010
> + *
> + * Authors:
> + *  Alex Williamson <alex.williamson@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Based on qemu-kvm device-assignment:
> + *  Adapted for KVM by Qumranet.
> + *  Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
> + *  Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.shah@redhat.com)
> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (muli@il.ibm.com)
> + */
> +
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/io.h>
> +#include <sys/mman.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include "event_notifier.h"
> +#include "hw.h"
> +#include "memory.h"
> +#include "monitor.h"
> +#include "pc.h"
> +#include "qemu-error.h"
> +#include "sysemu.h"
> +#include "vfio.h"
> +#include <pci/header.h>
> +#include <pci/types.h>
> +#include <linux/types.h>
> +#include "linux-vfio.h"
> +
> +//#define DEBUG_VFIO_VGA
> +#ifdef DEBUG_VFIO_VGA
> +#define DPRINTF(fmt, ...) \
> +    do { printf("vfio-vga: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
> +/*
> + * VGA setup
> + */
> +static void vfio_vga_write(VFIODevice *vdev, uint32_t addr,
> +                           uint32_t val, int len)
> +{
> +    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, 0xa0000 + addr, len, val);
> +    switch (len) {
> +        case 1:
> +            *(uint8_t *)(vdev->vga_mmio + addr) = (uint8_t)val;
> +            break;
> +        case 2:
> +            *(uint16_t *)(vdev->vga_mmio + addr) = (uint16_t)val;
> +            break;
> +        case 4:
> +            *(uint32_t *)(vdev->vga_mmio + addr) = val;
> +            break;
> +    }
> +}
> +
> +static void vfio_vga_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    vfio_vga_write(opaque, addr, val, 1);
> +}
> +
> +static void vfio_vga_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    vfio_vga_write(opaque, addr, val, 2);
> +}
> +
> +static void vfio_vga_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
> +{
> +    vfio_vga_write(opaque, addr, val, 4);
> +}
> +
> +static CPUWriteMemoryFunc * const vfio_vga_writes[] = {
> +    &vfio_vga_writeb,
> +    &vfio_vga_writew,
> +    &vfio_vga_writel
> +};
> +
> +static uint32_t vfio_vga_read(VFIODevice *vdev, uint32_t addr, int len)
> +{
> +    uint32_t val = 0xffffffff;
> +    switch (len) {
> +        case 1:
> +            val = (uint32_t)*(uint8_t *)(vdev->vga_mmio + addr);
> +            break;
> +        case 2:
> +            val = (uint32_t)*(uint16_t *)(vdev->vga_mmio + addr);
> +            break;
> +        case 4:
> +            val = *(uint32_t *)(vdev->vga_mmio + addr);
> +            break;
> +    }
> +    DPRINTF("%s 0x%x %d = 0x%x\n", __func__, 0xa0000 + addr, len, val);
> +    return val;
> +}
> +
> +static uint32_t vfio_vga_readb(void *opaque, target_phys_addr_t addr)
> +{
> +    return vfio_vga_read(opaque, addr, 1);
> +}
> +
> +static uint32_t vfio_vga_readw(void *opaque, target_phys_addr_t addr)
> +{
> +    return vfio_vga_read(opaque, addr, 2);
> +}
> +
> +static uint32_t vfio_vga_readl(void *opaque, target_phys_addr_t addr)
> +{
> +    return vfio_vga_read(opaque, addr, 4);
> +}
> +
> +static CPUReadMemoryFunc * const vfio_vga_reads[] = {
> +    &vfio_vga_readb,
> +    &vfio_vga_readw,
> +    &vfio_vga_readl
> +};
> +
> +static void vfio_vga_out(VFIODevice *vdev, uint32_t addr, uint32_t val, int len)
> +{
> +    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, addr, len, val);
> +    ioperm(0x3b0, 0x30, 1); /* XXX fix me */

Why do you have to re-establish the ioperms here on each access? Are we
just lacking the use of generic kvm ioperm management?

> +    switch (len) {
> +        case 1:
> +            outb(val, addr);
> +            break;
> +        case 2:
> +            outw(val, addr);
> +            break;
> +        case 4:
> +            outl(val, addr);
> +            break;
> +    }
> +}
> +
> +static void vfio_vga_outb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    vfio_vga_out(opaque, addr, val, 1);
> +}
> +
> +static void vfio_vga_outw(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    vfio_vga_out(opaque, addr, val, 2);
> +}
> +
> +static void vfio_vga_outl(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    vfio_vga_out(opaque, addr, val, 4);
> +}
> +
> +static uint32_t vfio_vga_in(VFIODevice *vdev, uint32_t addr, int len)
> +{
> +    uint32_t val = 0xffffffff;
> +    ioperm(0x3b0, 0x30, 1); /* XXX fix me */
> +    switch (len) {
> +        case 1:
> +            val = inb(addr);
> +            break;
> +        case 2:
> +            val = inw(addr);
> +            break;
> +        case 4:
> +            val = inl(addr);
> +            break;
> +    }
> +    DPRINTF("%s 0x%x, %d = 0x%x\n", __func__, addr, len, val);
> +    return val;
> +}
> +
> +static uint32_t vfio_vga_inb(void *opaque, uint32_t addr)
> +{
> +    return vfio_vga_in(opaque, addr, 1);
> +}
> +
> +static uint32_t vfio_vga_inw(void *opaque, uint32_t addr)
> +{
> +    return vfio_vga_in(opaque, addr, 2);
> +}
> +
> +static uint32_t vfio_vga_inl(void *opaque, uint32_t addr)
> +{
> +    return vfio_vga_in(opaque, addr, 4);
> +}
> +
> +int vfio_vga_setup(VFIODevice *vdev)
> +{
> +    char buf[256];
> +    int ret;
> +
> +    if (vga_interface_type != VGA_NONE) {
> +        fprintf(stderr,
> +                "VGA device assigned without -vga none param, no ISA VGA\n");
> +        return -1;
> +    }
> +
> +    vdev->vga_fd = open("/dev/vga_arbiter", O_RDWR);
> +    if (vdev->vga_fd < 0) {
> +        fprintf(stderr, "%s - Failed to open vga arbiter (%s)\n",
> +                __func__, strerror(errno));
> +        return -1;
> +    }
> +    ret = read(vdev->vga_fd, buf, sizeof(buf));
> +    if (ret <= 0) {
> +        fprintf(stderr, "%s - Failed to read from vga arbiter (%s)\n",
> +                __func__, strerror(errno));
> +        close(vdev->vga_fd);
> +        return -1;
> +    }
> +    buf[ret - 1] = 0;
> +    vdev->vga_orig = qemu_strdup(buf);
> +
> +    snprintf(buf, sizeof(buf), "target PCI:%04x:%02x:%02x.%x",
> +             vdev->host.seg, vdev->host.bus, vdev->host.dev, vdev->host.func);
> +    ret = write(vdev->vga_fd, buf, strlen(buf));
> +    if (ret != strlen(buf)) {
> +        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
> +                __func__, strerror(errno));
> +        close(vdev->vga_fd);
> +        return -1;
> +    }
> +    snprintf(buf, sizeof(buf), "decodes io+mem");
> +    ret = write(vdev->vga_fd, buf, strlen(buf));
> +    if (ret != strlen(buf)) {
> +        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
> +                __func__, strerror(errno));
> +        close(vdev->vga_fd);
> +        return -1;
> +    }

OK, so we grab the assigned adapter and make it handle legacy io+mem. I
guess this approach only works with a single guest with an assigned
adapter. Would it be possible and not extremely costly to do some
on-demand grabbing of the range to share it with multiple VMs?

And what about the host? When does Linux release the legacy range?
Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?

Is there some other way to pass the legacy accesses from the guest to a
specific adapter without going via the host's legacy area? I.e. do some
adapters allow remapping?

> +
> +    vdev->vga_mmio_fd = open("/dev/mem", O_RDWR);
> +    if (vdev->vga_mmio_fd < 0) {
> +        fprintf(stderr, "%s - Failed to open /dev/mem (%s)\n",
> +                __func__, strerror(errno));
> +        return -1;
> +    }
> +    vdev->vga_mmio = mmap(NULL, 0x40000, PROT_READ | PROT_WRITE,
> +                          MAP_SHARED, vdev->vga_mmio_fd, 0xa0000);
> +    if (vdev->vga_mmio == MAP_FAILED) {
> +        fprintf(stderr, "%s - mmap failed (%s)\n", __func__, strerror(errno));
> +        return -1;
> +    }
> +
> +#if 1
> +    vdev->vga_io = cpu_register_io_memory(vfio_vga_reads,
> +                                          vfio_vga_writes, vdev);
> +    cpu_register_physical_memory(0xa0000, 0x20000, vdev->vga_io);
> +    qemu_register_coalesced_mmio(0xa0000, 0x20000);
> +#else
> +    cpu_register_physical_memory(0xa0000, 0x20000, 
> +        qemu_ram_map(&vdev->pdev.qdev, "VGA", 0x20000, vdev->vga_mmio));
> +    qemu_register_coalesced_mmio(0xa0000, 0x20000);
> +#endif

To make the second case work, we would have to track the mode switches
of the guest via legacy VGA interfaces and switch the mapping on the
fly, right?

> +
> +    register_ioport_write(0x3b0, 0x30, 1, vfio_vga_outb, vdev);
> +    register_ioport_write(0x3b0, 0x30, 2, vfio_vga_outw, vdev);
> +    register_ioport_write(0x3b0, 0x30, 4, vfio_vga_outl, vdev);
> +    register_ioport_read(0x3b0, 0x30, 1, vfio_vga_inb, vdev);
> +    register_ioport_read(0x3b0, 0x30, 2, vfio_vga_inw, vdev);
> +    register_ioport_read(0x3b0, 0x30, 4, vfio_vga_inl, vdev);
> +    if (ioperm(0x3b0, 0x30, 1)) {
> +        fprintf(stderr, "%s - ioperm failed (%s)\n", __func__, strerror(errno));
> +        return -1;
> +    }
> +    return 0;
> +}
> +
> +void vfio_vga_exit(VFIODevice *vdev)
> +{
> +    if (!vdev->vga_io)
> +        return;
> +
> +    isa_unassign_ioport(0x3b0, 0x30);
> +    qemu_unregister_coalesced_mmio(0xa0000, 0x20000);
> +    cpu_register_physical_memory(0xa0000, 0x20000, IO_MEM_UNASSIGNED);
> +    cpu_unregister_io_memory(vdev->vga_io);
> +    munmap(vdev->vga_mmio, 0x40000);
> +    close(vdev->vga_mmio_fd);
> +    qemu_free(vdev->vga_orig);
> +    close(vdev->vga_fd);
> +}
> +

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-05  8:50         ` Jan Kiszka
@ 2011-05-05 15:17           ` Alex Williamson
  2011-05-09 11:14             ` Jan Kiszka
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Williamson @ 2011-05-05 15:17 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: André Weidemann, Prasad Joshi, kvm, Oswaldo Cadenas

Hi Jan,

On Thu, 2011-05-05 at 10:50 +0200, Jan Kiszka wrote:
> Hi Alex,
> 
> On 2011-01-28 01:45, Alex Williamson wrote:
> > On Thu, 2011-01-27 at 12:56 +0100, André Weidemann wrote:
> >> Hi Alex,
> >>
> >> On 26.01.2011 06:12, Alex Williamson wrote:
> >>
> >>> So while your initial results are promising, my guess is that you're
> >>> using card specific drivers and still need to consider some of the
> >>> harder problems with generic support for vga assignment.  I hacked on
> >>> this for a bit trying to see if I could get vga assignment working
> >>> with the vfio driver.  Setting up the legacy access and preventing
> >>> qemu from stealing it back should get you basic vga modes and might
> >>> even allow the option rom to run to initialize the card for pre-boot.
> >>> I was able to get this far on a similar ATI card.  I never had much
> >>> luck with other cards though, and I was never able to get the vesa
> >>> extensions working.  Thanks,
> >>
> >> Do you mind sharing these patches?
> > 
> > Attached.
> > 
> 
> We are about to try some pass-through with an NVIDIA card. So I already
> hacked on your vfio patch to make it build against the current device
> assignment code. Some questions arose while studying the code:

Cool!

> > --- /dev/null
> > +++ b/hw/vfio-vga.c
> > @@ -0,0 +1,291 @@
> > +/*
> > + * vfio VGA device assignment support
> > + *
> > + * Copyright Red Hat, Inc. 2010
> > + *
> > + * Authors:
> > + *  Alex Williamson <alex.williamson@redhat.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.  See
> > + * the COPYING file in the top-level directory.
> > + *
> > + * Based on qemu-kvm device-assignment:
> > + *  Adapted for KVM by Qumranet.
> > + *  Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
> > + *  Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
> > + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
> > + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.shah@redhat.com)
> > + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (muli@il.ibm.com)
> > + */
> > +
> > +#include <stdio.h>
> > +#include <unistd.h>
> > +#include <sys/io.h>
> > +#include <sys/mman.h>
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include "event_notifier.h"
> > +#include "hw.h"
> > +#include "memory.h"
> > +#include "monitor.h"
> > +#include "pc.h"
> > +#include "qemu-error.h"
> > +#include "sysemu.h"
> > +#include "vfio.h"
> > +#include <pci/header.h>
> > +#include <pci/types.h>
> > +#include <linux/types.h>
> > +#include "linux-vfio.h"
> > +
> > +//#define DEBUG_VFIO_VGA
> > +#ifdef DEBUG_VFIO_VGA
> > +#define DPRINTF(fmt, ...) \
> > +    do { printf("vfio-vga: " fmt, ## __VA_ARGS__); } while (0)
> > +#else
> > +#define DPRINTF(fmt, ...) \
> > +    do { } while (0)
> > +#endif
> > +
> > +/*
> > + * VGA setup
> > + */
> > +static void vfio_vga_write(VFIODevice *vdev, uint32_t addr,
> > +                           uint32_t val, int len)
> > +{
> > +    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, 0xa0000 + addr, len, val);
> > +    switch (len) {
> > +        case 1:
> > +            *(uint8_t *)(vdev->vga_mmio + addr) = (uint8_t)val;
> > +            break;
> > +        case 2:
> > +            *(uint16_t *)(vdev->vga_mmio + addr) = (uint16_t)val;
> > +            break;
> > +        case 4:
> > +            *(uint32_t *)(vdev->vga_mmio + addr) = val;
> > +            break;
> > +    }
> > +}
> > +
> > +static void vfio_vga_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
> > +{
> > +    vfio_vga_write(opaque, addr, val, 1);
> > +}
> > +
> > +static void vfio_vga_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
> > +{
> > +    vfio_vga_write(opaque, addr, val, 2);
> > +}
> > +
> > +static void vfio_vga_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
> > +{
> > +    vfio_vga_write(opaque, addr, val, 4);
> > +}
> > +
> > +static CPUWriteMemoryFunc * const vfio_vga_writes[] = {
> > +    &vfio_vga_writeb,
> > +    &vfio_vga_writew,
> > +    &vfio_vga_writel
> > +};
> > +
> > +static uint32_t vfio_vga_read(VFIODevice *vdev, uint32_t addr, int len)
> > +{
> > +    uint32_t val = 0xffffffff;
> > +    switch (len) {
> > +        case 1:
> > +            val = (uint32_t)*(uint8_t *)(vdev->vga_mmio + addr);
> > +            break;
> > +        case 2:
> > +            val = (uint32_t)*(uint16_t *)(vdev->vga_mmio + addr);
> > +            break;
> > +        case 4:
> > +            val = *(uint32_t *)(vdev->vga_mmio + addr);
> > +            break;
> > +    }
> > +    DPRINTF("%s 0x%x %d = 0x%x\n", __func__, 0xa0000 + addr, len, val);
> > +    return val;
> > +}
> > +
> > +static uint32_t vfio_vga_readb(void *opaque, target_phys_addr_t addr)
> > +{
> > +    return vfio_vga_read(opaque, addr, 1);
> > +}
> > +
> > +static uint32_t vfio_vga_readw(void *opaque, target_phys_addr_t addr)
> > +{
> > +    return vfio_vga_read(opaque, addr, 2);
> > +}
> > +
> > +static uint32_t vfio_vga_readl(void *opaque, target_phys_addr_t addr)
> > +{
> > +    return vfio_vga_read(opaque, addr, 4);
> > +}
> > +
> > +static CPUReadMemoryFunc * const vfio_vga_reads[] = {
> > +    &vfio_vga_readb,
> > +    &vfio_vga_readw,
> > +    &vfio_vga_readl
> > +};
> > +
> > +static void vfio_vga_out(VFIODevice *vdev, uint32_t addr, uint32_t val, int len)
> > +{
> > +    DPRINTF("%s 0x%x %d - 0x%x\n", __func__, addr, len, val);
> > +    ioperm(0x3b0, 0x30, 1); /* XXX fix me */
> 
> Why do you have to re-establish the ioperms here on each access? Are we
> just lacking the use of generic kvm ioperm management?

IIRC, setting it up initially wasn't sticking, so I put it here as just
a quick fix to make sure it was set before we used it.  I never fully
made it through debugging why it wasn't working when set earlier.

In general, legacy mmio and ioport need a better solution.  I wish x86
implemented the legacy io feature of pci sysfs so we could do it that
way, which might also move vga arbitration and chipset vga routing into
the host kernel.

> > +    switch (len) {
> > +        case 1:
> > +            outb(val, addr);
> > +            break;
> > +        case 2:
> > +            outw(val, addr);
> > +            break;
> > +        case 4:
> > +            outl(val, addr);
> > +            break;
> > +    }
> > +}
> > +
> > +static void vfio_vga_outb(void *opaque, uint32_t addr, uint32_t val)
> > +{
> > +    vfio_vga_out(opaque, addr, val, 1);
> > +}
> > +
> > +static void vfio_vga_outw(void *opaque, uint32_t addr, uint32_t val)
> > +{
> > +    vfio_vga_out(opaque, addr, val, 2);
> > +}
> > +
> > +static void vfio_vga_outl(void *opaque, uint32_t addr, uint32_t val)
> > +{
> > +    vfio_vga_out(opaque, addr, val, 4);
> > +}
> > +
> > +static uint32_t vfio_vga_in(VFIODevice *vdev, uint32_t addr, int len)
> > +{
> > +    uint32_t val = 0xffffffff;
> > +    ioperm(0x3b0, 0x30, 1); /* XXX fix me */
> > +    switch (len) {
> > +        case 1:
> > +            val = inb(addr);
> > +            break;
> > +        case 2:
> > +            val = inw(addr);
> > +            break;
> > +        case 4:
> > +            val = inl(addr);
> > +            break;
> > +    }
> > +    DPRINTF("%s 0x%x, %d = 0x%x\n", __func__, addr, len, val);
> > +    return val;
> > +}
> > +
> > +static uint32_t vfio_vga_inb(void *opaque, uint32_t addr)
> > +{
> > +    return vfio_vga_in(opaque, addr, 1);
> > +}
> > +
> > +static uint32_t vfio_vga_inw(void *opaque, uint32_t addr)
> > +{
> > +    return vfio_vga_in(opaque, addr, 2);
> > +}
> > +
> > +static uint32_t vfio_vga_inl(void *opaque, uint32_t addr)
> > +{
> > +    return vfio_vga_in(opaque, addr, 4);
> > +}
> > +
> > +int vfio_vga_setup(VFIODevice *vdev)
> > +{
> > +    char buf[256];
> > +    int ret;
> > +
> > +    if (vga_interface_type != VGA_NONE) {
> > +        fprintf(stderr,
> > +                "VGA device assigned without -vga none param, no ISA VGA\n");
> > +        return -1;
> > +    }
> > +
> > +    vdev->vga_fd = open("/dev/vga_arbiter", O_RDWR);
> > +    if (vdev->vga_fd < 0) {
> > +        fprintf(stderr, "%s - Failed to open vga arbiter (%s)\n",
> > +                __func__, strerror(errno));
> > +        return -1;
> > +    }
> > +    ret = read(vdev->vga_fd, buf, sizeof(buf));
> > +    if (ret <= 0) {
> > +        fprintf(stderr, "%s - Failed to read from vga arbiter (%s)\n",
> > +                __func__, strerror(errno));
> > +        close(vdev->vga_fd);
> > +        return -1;
> > +    }
> > +    buf[ret - 1] = 0;
> > +    vdev->vga_orig = qemu_strdup(buf);
> > +
> > +    snprintf(buf, sizeof(buf), "target PCI:%04x:%02x:%02x.%x",
> > +             vdev->host.seg, vdev->host.bus, vdev->host.dev, vdev->host.func);
> > +    ret = write(vdev->vga_fd, buf, strlen(buf));
> > +    if (ret != strlen(buf)) {
> > +        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
> > +                __func__, strerror(errno));
> > +        close(vdev->vga_fd);
> > +        return -1;
> > +    }
> > +    snprintf(buf, sizeof(buf), "decodes io+mem");
> > +    ret = write(vdev->vga_fd, buf, strlen(buf));
> > +    if (ret != strlen(buf)) {
> > +        fprintf(stderr, "%s - Failed to write to vga arbiter (%s)\n",
> > +                __func__, strerror(errno));
> > +        close(vdev->vga_fd);
> > +        return -1;
> > +    }
> 
> OK, so we grab the assigned adapter and make it handle legacy io+mem. I
> guess this approach only works with a single guest with an assigned
> adapter. Would it be possible and not extremely costly to do some
> on-demand grabbing of the range to share it with multiple VMs?

Yes, and that was my intention, but I never got that far.  Each legacy io
access should switch the arbiter to the necessary device.  Unfortunately
the vga arbiter only works if everyone uses it, and so far it seems like
nobody does.  Obviously there are some pretty hefty performance
implications with a switch on every access.  I'm not sure how that's
going to play out.  I expect that once we bootstrap the VGA device and
load a real driver, the legacy areas are seldom used.
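
For reference, the per-access switching described above could be built
on the arbiter's "target", "lock" and "unlock" commands, written as
plain strings to /dev/vga_arbiter.  A rough sketch only; the helper
names are made up and nothing like this exists in the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "target PCI:dddd:bb:dd.f", the card ID syntax the arbiter uses. */
static int vga_arb_format_target(char *buf, size_t len, unsigned seg,
                                 unsigned bus, unsigned dev, unsigned fn)
{
    int n = snprintf(buf, len, "target PCI:%04x:%02x:%02x.%x",
                     seg, bus, dev, fn);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Send one command string to the arbiter fd; returns 0 on success. */
static int vga_arb_cmd(int fd, const char *cmd)
{
    size_t len = strlen(cmd);

    assert(fd >= 0);
    return write(fd, cmd, len) == (ssize_t)len ? 0 : -1;
}

/* Take the legacy io+mem resources of the current target. */
static int vga_arb_lock(int fd)
{
    return vga_arb_cmd(fd, "lock io+mem");
}

/* Give the legacy io+mem resources back. */
static int vga_arb_unlock(int fd)
{
    return vga_arb_cmd(fd, "unlock io+mem");
}
```

Each in/out or legacy mmio handler would then wrap its access in
vga_arb_lock()/vga_arb_unlock(), which is exactly where the per-access
cost comes from.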

> And what about the host? When does Linux release the legacy range?
> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?

Well, that's where it'd be nice if the vga arbiter was actually in more
widespread use.  It currently seems to be nothing more than a shared
mutex, but it would actually be useful if it included backends to do the
chipset vga routing changes.  I think when I was testing this, I was
externally poking the PCI bridge to toggle the VGA_EN bit.

> Is there some other way to pass the legacy accesses from the guest to a
> specific adapter without going via the host's legacy area? I.e. do some
> adapters allow remapping?

Not that I know of on x86.  I wouldn't be surprised if some adapters
just re-route the legacy address ranges to standard PCI mappings, but I
don't know how to figure out if that's true and what the offsets would
be.  I've seen ia64 hardware that supports a _TRA offset such that each
PCI root bridge can support its own legacy io port space, but that
requires a whole different ioport model.

I believe X.org tries to tackle this by brute force, manually changing
VGA enabled bits on PCI bridges.  I think this is part of why it's
difficult to run multiple X servers on the same system.  Not sure if
that problem has gotten any better since I last looked.

> > +
> > +    vdev->vga_mmio_fd = open("/dev/mem", O_RDWR);
> > +    if (vdev->vga_mmio_fd < 0) {
> > +        fprintf(stderr, "%s - Failed to open /dev/mem (%s)\n",
> > +                __func__, strerror(errno));
> > +        return -1;
> > +    }
> > +    vdev->vga_mmio = mmap(NULL, 0x40000, PROT_READ | PROT_WRITE,
> > +                          MAP_SHARED, vdev->vga_mmio_fd, 0xa0000);
> > +    if (vdev->vga_mmio == MAP_FAILED) {
> > +        fprintf(stderr, "%s - mmap failed (%s)\n", __func__, strerror(errno));
> > +        return -1;
> > +    }
> > +
> > +#if 1
> > +    vdev->vga_io = cpu_register_io_memory(vfio_vga_reads,
> > +                                          vfio_vga_writes, vdev);
> > +    cpu_register_physical_memory(0xa0000, 0x20000, vdev->vga_io);
> > +    qemu_register_coalesced_mmio(0xa0000, 0x20000);
> > +#else
> > +    cpu_register_physical_memory(0xa0000, 0x20000, 
> > +        qemu_ram_map(&vdev->pdev.qdev, "VGA", 0x20000, vdev->vga_mmio));
> > +    qemu_register_coalesced_mmio(0xa0000, 0x20000);
> > +#endif
> 
> To make the second case work, we would have to track the mode switches
> of the guest via legacy VGA interfaces and switch the mapping on the
> fly, right?

Yeah, something like that.  IIRC, I was expecting the second case to
work since I'm doing a static switch of the legacy address space.  I
can't recall whether it wasn't working or whether I used the read/write
interface just so I could add fprintfs to make sure something was
happening.
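
If we did want to follow the guest's mode switches, the window it has
selected is encoded in bits 3:2 of the VGA Graphics Controller
Miscellaneous register (index 6 behind ports 0x3ce/0x3cf).  A sketch of
the decode, assuming the trapped register value is already at hand;
struct vga_window and vga_legacy_window() are made-up names:

```c
#include <assert.h>
#include <stdint.h>

struct vga_window {
    uint32_t base;  /* guest physical start of the decoded window */
    uint32_t size;  /* length of the window in bytes */
};

/* gr06: value written to Graphics Controller index 6 (Miscellaneous). */
static struct vga_window vga_legacy_window(uint8_t gr06)
{
    /* Memory Map Select field, bits 3:2 of the register. */
    static const struct vga_window map[4] = {
        { 0xa0000, 0x20000 },   /* 00: A0000-BFFFF, 128K */
        { 0xa0000, 0x10000 },   /* 01: A0000-AFFFF, 64K  */
        { 0xb0000, 0x08000 },   /* 10: B0000-B7FFF, 32K  */
        { 0xb8000, 0x08000 },   /* 11: B8000-BFFFF, 32K  */
    };

    return map[(gr06 >> 2) & 3];
}
```

The io port handlers already see those writes, so they could remap
whenever the returned window changes.
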
Thanks,

Alex

> > +
> > +    register_ioport_write(0x3b0, 0x30, 1, vfio_vga_outb, vdev);
> > +    register_ioport_write(0x3b0, 0x30, 2, vfio_vga_outw, vdev);
> > +    register_ioport_write(0x3b0, 0x30, 4, vfio_vga_outl, vdev);
> > +    register_ioport_read(0x3b0, 0x30, 1, vfio_vga_inb, vdev);
> > +    register_ioport_read(0x3b0, 0x30, 2, vfio_vga_inw, vdev);
> > +    register_ioport_read(0x3b0, 0x30, 4, vfio_vga_inl, vdev);
> > +    if (ioperm(0x3b0, 0x30, 1)) {
> > +        fprintf(stderr, "%s - ioperm failed (%s)\n", __func__, strerror(errno));
> > +        return -1;
> > +    }
> > +    return 0;
> > +}
> > +
> > +void vfio_vga_exit(VFIODevice *vdev)
> > +{
> > +    if (!vdev->vga_io)
> > +        return;
> > +
> > +    isa_unassign_ioport(0x3b0, 0x30);
> > +    qemu_unregister_coalesced_mmio(0xa0000, 0x20000);
> > +    cpu_register_physical_memory(0xa0000, 0x20000, IO_MEM_UNASSIGNED);
> > +    cpu_unregister_io_memory(vdev->vga_io);
> > +    munmap(vdev->vga_mmio, 0x40000);
> > +    close(vdev->vga_mmio_fd);
> > +    qemu_free(vdev->vga_orig);
> > +    close(vdev->vga_fd);
> > +}
> > +
> 
> Thanks,
> Jan
> 





* Re: Graphics pass-through
  2011-05-05 15:17           ` Alex Williamson
@ 2011-05-09 11:14             ` Jan Kiszka
  2011-05-09 14:29               ` Alex Williamson
  2011-05-09 14:55               ` Prasad Joshi
  0 siblings, 2 replies; 27+ messages in thread
From: Jan Kiszka @ 2011-05-09 11:14 UTC (permalink / raw)
  To: Alex Williamson; +Cc: André Weidemann, Prasad Joshi, kvm, Oswaldo Cadenas

On 2011-05-05 17:17, Alex Williamson wrote:
>> And what about the host? When does Linux release the legacy range?
>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
> 
> Well, that's where it'd be nice if the vga arbiter was actually in more
> widespread use.  It currently seems to be nothing more than a shared
> mutex, but it would actually be useful if it included backends to do the
> chipset vga routing changes.  I think when I was testing this, I was
> externally poking PCI bridge chipset to toggle the VGA_EN bit.

Right, we had to drop the approach of passing through the secondary card
for now because the arbiter was not switching properly. I haven't checked
yet whether VGA_EN was properly set, though the kernel code looks like it
should take care of this.

Even with handing out the primary adapter, we had only mixed success so
far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
not displaying early boot messages at all. Maybe a vgabios issue.
Windows was booting nevertheless - until we installed the NVIDIA
drivers. Then it ran into a blue screen.

BTW, what ATI adapter did you use precisely, and what did work, what not?

One thing I was wondering: Most modern adapters should be PCIe these
days. Our NVIDIA definitely is. But so far we are claiming to have it
attached to a PCI bus. Among other things, that hides all the extended
capabilities. Could this make a relevant difference?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-09 11:14             ` Jan Kiszka
@ 2011-05-09 14:29               ` Alex Williamson
  2011-05-09 15:02                 ` Jan Kiszka
  2011-05-09 14:55               ` Prasad Joshi
  1 sibling, 1 reply; 27+ messages in thread
From: Alex Williamson @ 2011-05-09 14:29 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: André Weidemann, Prasad Joshi, kvm, Oswaldo Cadenas

On Mon, 2011-05-09 at 13:14 +0200, Jan Kiszka wrote:
> On 2011-05-05 17:17, Alex Williamson wrote:
> >> And what about the host? When does Linux release the legacy range?
> >> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
> > 
> > Well, that's where it'd be nice if the vga arbiter was actually in more
> > widespread use.  It currently seems to be nothing more than a shared
> > mutex, but it would actually be useful if it included backends to do the
> > chipset vga routing changes.  I think when I was testing this, I was
> > externally poking PCI bridge chipset to toggle the VGA_EN bit.
> 
> Right, we had to drop the approach to pass through the secondary card
> for now, the arbiter was not switching properly. Haven't checked yet if
> VGA_EN was properly set, though the kernel code looks like it should
> take care of this.
> 
> Even with handing out the primary adapter, we had only mixed success so
> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
> not displaying early boot messages at all. Maybe a vgabios issue.
> Windows was booting nevertheless - until we installed the NVIDIA
> drivers. Then it ran into a blue screen.

Interesting, IIRC I could never get VESA modes to work.  I believe I
only had a basic VGA16 mode running in a Windows guest too.

> BTW, what ATI adapter did you use precisely, and what did work, what not?

I have an old X550 (rv380?).  I also have an Nvidia gs8400, but ISTR the
ATI working better for me.

> One thing I was wondering: Most modern adapters should be PCIe these
> days. Our NVIDIA definitely is. But so far we are claiming to have it
> attached to a PCI bus. That caps all the extended capabilities e.g.
> Could this make some relevant difference?

The BIOS and early boot use shouldn't care too much about that, but I
could imagine the high performance drivers potentially failing.  Thanks,

Alex




* Re: Graphics pass-through
  2011-05-09 11:14             ` Jan Kiszka
  2011-05-09 14:29               ` Alex Williamson
@ 2011-05-09 14:55               ` Prasad Joshi
  2011-05-09 15:27                 ` Jan Kiszka
  2011-05-10 10:53                 ` Gerd Hoffmann
  1 sibling, 2 replies; 27+ messages in thread
From: Prasad Joshi @ 2011-05-09 14:55 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Alex Williamson, André Weidemann, kvm, Oswaldo Cadenas

On Mon, May 9, 2011 at 12:14 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2011-05-05 17:17, Alex Williamson wrote:
>>> And what about the host? When does Linux release the legacy range?
>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
>>
>> Well, that's where it'd be nice if the vga arbiter was actually in more
>> widespread use.  It currently seems to be nothing more than a shared
>> mutex, but it would actually be useful if it included backends to do the
>> chipset vga routing changes.  I think when I was testing this, I was
>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
>
> Right, we had to drop the approach to pass through the secondary card
> for now, the arbiter was not switching properly. Haven't checked yet if
> VGA_EN was properly set, though the kernel code looks like it should
> take care of this.
>
> Even with handing out the primary adapter, we had only mixed success so
> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
> not displaying early boot messages at all. Maybe a vgabios issue.
> Windows was booting nevertheless - until we installed the NVIDIA
> drivers. Then it ran into a blue screen.
>
> BTW, what ATI adapter did you use precisely, and what did work, what not?

Not trying to hijack the thread, I just wanted to provide some input.

A few days back I tried passing through the secondary graphics card.
I could pass through two graphics cards to a virtual machine.

02:00.0 VGA compatible controller: ATI Technologies Inc Redwood [Radeon HD 5670] (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited Device e151
	Flags: bus master, fast devsel, latency 0, IRQ 87
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at fe6e0000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at b000 [size=256]
	Expansion ROM at fe6c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: radeon
	Kernel modules: radeon

07:00.0 VGA compatible controller: nVidia Corporation G86 [Quadro NVS 290] (rev a1) (prog-if 00 [VGA controller])
       Subsystem: nVidia Corporation Device 0492
       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
       Latency: 0, Cache Line Size: 64 bytes
       Interrupt: pin A routed to IRQ 24
       Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
       Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
       Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
       Region 5: I/O ports at ec00 [size=128]
       Expansion ROM at fe9e0000 [disabled] [size=128K]
       Capabilities: <access denied>
       Kernel driver in use: nouveau
       Kernel modules: nouveau, nvidiafb

Both of them are PCIe cards. I have one more ATI card and another
NVIDIA card which do not work.

One of the reasons the pass-through did not work is the limit SeaBIOS
places on PCI memory space: a hard limit of 256MB or so. Thus, VGA
devices that need more than that never worked for me.

SeaBIOS allows this memory region to be extended to some value near
512MB, but even then the range is not enough.

Another SeaBIOS problem that limits the usable memory space: SeaBIOS
allocates the BAR regions in the order they are encountered. As far as
I know, BAR regions must be naturally aligned, so this simple strategy
results in heavy fragmentation. Therefore, even after increasing the
PCI memory space to 512MB, some BAR regions were left unallocated.
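
To illustrate the fragmentation, here is a small stand-alone sketch
(not SeaBIOS code, all names are made up) that places naturally aligned
BARs first in the order given and then largest first. Sorting by size
is a common remedy, I am not claiming it is what SeaBIOS does:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MiB(x) ((uint64_t)(x) << 20)

/* Round addr up to a multiple of size (size must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
    return (addr + size - 1) & ~(size - 1);
}

/* Place BARs in array order; returns the end offset of the last one. */
static uint64_t place_bars(const uint64_t *sizes, size_t n)
{
    uint64_t next = 0;

    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] != 0 && (sizes[i] & (sizes[i] - 1)) == 0);
        next = align_up(next, sizes[i]) + sizes[i];
    }
    return next;
}

/* qsort comparator: larger sizes first. */
static int cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x < y) - (x > y);
}

/* Sort the BARs largest first (in place), then place them. */
static uint64_t place_bars_largest_first(uint64_t *sizes, size_t n)
{
    qsort(sizes, n, sizeof(*sizes), cmp_desc);
    return place_bars(sizes, n);
}
```

With the three memory regions of the NVS 290 above (16M, 256M, 32M),
placing them in order ends at 544M, which does not fit a 512MB window,
while largest first ends at 304M.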

I will follow up with the details of the other graphics cards which do not work.

Thanks and Regards,
Prasad

>
> One thing I was wondering: Most modern adapters should be PCIe these
> days. Our NVIDIA definitely is. But so far we are claiming to have it
> attached to a PCI bus. That caps all the extended capabilities e.g.
> Could this make some relevant difference?
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>


* Re: Graphics pass-through
  2011-05-09 14:29               ` Alex Williamson
@ 2011-05-09 15:02                 ` Jan Kiszka
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Kiszka @ 2011-05-09 15:02 UTC (permalink / raw)
  To: Alex Williamson; +Cc: André Weidemann, Prasad Joshi, kvm, Oswaldo Cadenas

On 2011-05-09 16:29, Alex Williamson wrote:
> On Mon, 2011-05-09 at 13:14 +0200, Jan Kiszka wrote:
>> On 2011-05-05 17:17, Alex Williamson wrote:
>>>> And what about the host? When does Linux release the legacy range?
>>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
>>>
>>> Well, that's where it'd be nice if the vga arbiter was actually in more
>>> widespread use.  It currently seems to be nothing more than a shared
>>> mutex, but it would actually be useful if it included backends to do the
>>> chipset vga routing changes.  I think when I was testing this, I was
>>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
>>
>> Right, we had to drop the approach to pass through the secondary card
>> for now, the arbiter was not switching properly. Haven't checked yet if
>> VGA_EN was properly set, though the kernel code looks like it should
>> take care of this.
>>
>> Even with handing out the primary adapter, we had only mixed success so
>> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
>> not displaying early boot messages at all. Maybe a vgabios issue.
>> Windows was booting nevertheless - until we installed the NVIDIA
>> drivers. Then it ran into a blue screen.
> 
> Interesting, IIRC I could never get VESA modes to work.  I believe I
> only had a basic VGA16 mode running in a Windows guest too.
> 
>> BTW, what ATI adapter did you use precisely, and what did work, what not?
> 
> I have an old X550 (rv380?).  I also have an Nvidia gs8400, but ISTR the
> ATI working better for me.

Is that Nvidia a PCIe adapter? Did it show BIOS / early boot messages
properly?

BTW, we are fighting with a Quadro FX 3800.

> 
>> One thing I was wondering: Most modern adapters should be PCIe these
>> days. Our NVIDIA definitely is. But so far we are claiming to have it
>> attached to a PCI bus. That caps all the extended capabilities e.g.
>> Could this make some relevant difference?
> 
> The BIOS and early boot use shouldn't care too much about that, but I
> could imagine the high performance drivers potentially failing.  Thanks,

Yeah, that was my thinking as well. But we will try to confirm this by
tracing the BIOS activities. Rumor has it that some adapters do not
allow reading the true cold-boot ROM content at runtime, so booting
those adapters inside the guest may fail to some degree.
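
As a first sanity check on a ROM image dumped at runtime, one can at
least verify the 0x55 0xAA signature, follow the pointer at offset 0x18
to the "PCIR" data structure and compare the vendor/device IDs there
against the card. This cannot prove the image equals the cold-boot
content; it only catches obviously bogus dumps. A sketch with made-up
helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* True if rom looks like a PCI option ROM for the given vendor/device. */
static bool pci_rom_matches(const uint8_t *rom, size_t len,
                            uint16_t vendor, uint16_t device)
{
    size_t pcir;

    assert(rom != NULL);
    /* ROM header: signature bytes, PCIR pointer at offset 0x18. */
    if (len < 0x1a || rom[0] != 0x55 || rom[1] != 0xaa) {
        return false;
    }
    pcir = rd16(rom + 0x18);
    /* PCI Data Structure: "PCIR", vendor ID at +4, device ID at +6. */
    if (pcir + 8 > len || memcmp(rom + pcir, "PCIR", 4) != 0) {
        return false;
    }
    return rd16(rom + pcir + 4) == vendor &&
           rd16(rom + pcir + 6) == device;
}
```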

Anyway, I've hacked on the q35 patches until they allowed me to boot a
Linux guest with an assigned PCIe Atheros WLAN adapter - all caps were
suddenly visible. Those bits are now on their way to our test box. Let's
see if they are able to change the BSOD a bit...

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-09 14:55               ` Prasad Joshi
@ 2011-05-09 15:27                 ` Jan Kiszka
  2011-05-09 15:40                   ` Prasad Joshi
                                     ` (2 more replies)
  2011-05-10 10:53                 ` Gerd Hoffmann
  1 sibling, 3 replies; 27+ messages in thread
From: Jan Kiszka @ 2011-05-09 15:27 UTC (permalink / raw)
  To: Prasad Joshi
  Cc: Alex Williamson, André Weidemann, kvm, Oswaldo Cadenas,
	Maxim Nikolaev

On 2011-05-09 16:55, Prasad Joshi wrote:
> On Mon, May 9, 2011 at 12:14 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2011-05-05 17:17, Alex Williamson wrote:
>>>> And what about the host? When does Linux release the legacy range?
>>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
>>>
>>> Well, that's where it'd be nice if the vga arbiter was actually in more
>>> widespread use.  It currently seems to be nothing more than a shared
>>> mutex, but it would actually be useful if it included backends to do the
>>> chipset vga routing changes.  I think when I was testing this, I was
>>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
>>
>> Right, we had to drop the approach to pass through the secondary card
>> for now, the arbiter was not switching properly. Haven't checked yet if
>> VGA_EN was properly set, though the kernel code looks like it should
>> take care of this.
>>
>> Even with handing out the primary adapter, we had only mixed success so
>> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
>> not displaying early boot messages at all. Maybe a vgabios issue.
>> Windows was booting nevertheless - until we installed the NVIDIA
>> drivers. Then it ran into a blue screen.
>>
>> BTW, what ATI adapter did you use precisely, and what did work, what not?
> 
> Not hijacking the mail thread. Just wanted to provide some inputs.

Much appreciated in fact!

> 
> Few days back I had tried passing through the secondary graphics card.
> I could pass-through two graphics cards to virtual machine.
> 
> 02:00.0 VGA compatible controller: ATI Technologies Inc Redwood
> [Radeon HD 5670] (prog-if 00 [VGA controller])
> 	Subsystem: PC Partner Limited Device e151
> 	Flags: bus master, fast devsel, latency 0, IRQ 87
> 	Memory at d0000000 (64-bit, prefetchable) [size=256M]
> 	Memory at fe6e0000 (64-bit, non-prefetchable) [size=128K]
> 	I/O ports at b000 [size=256]
> 	Expansion ROM at fe6c0000 [disabled] [size=128K]
> 	Capabilities: <access denied>
> 	Kernel driver in use: radeon
> 	Kernel modules: radeon
> 
> 07:00.0 VGA compatible controller: nVidia Corporation G86 [Quadro NVS
> 290] (rev a1) (prog-if 00 [VGA controller])
>        Subsystem: nVidia Corporation Device 0492
>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>        Latency: 0, Cache Line Size: 64 bytes
>        Interrupt: pin A routed to IRQ 24
>        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
>        Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
>        Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
>        Region 5: I/O ports at ec00 [size=128]
>        Expansion ROM at fe9e0000 [disabled] [size=128K]
>        Capabilities: <access denied>
>        Kernel driver in use: nouveau
>        Kernel modules: nouveau, nvidiafb
> 
> Both of them are PCIe cards. I have one more ATI card and another
> NVIDIA card which does not work.

Interesting. That may rule out missing PCIe capabilities as the source
of the NVIDIA driver problems.

Did you pass each of those cards as primary to the guest, or was the
guest seeing multiple adapters? I presume you only got output after
early boot was completed, right?

To avoid having to deal with legacy I/O forwarding, we started with a
dual adapter setup in the hope of leaving the primary guest adapter at
known-to-work cirrus-vga. But even in a native setup with on-board
primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
to their hardware in this constellation.

> 
> One of the reason the pass-through did not work is because of the
> limit on amount of pci configuration memory by SeaBIOS. SeaBIOS places
> a hard limit of 256MB or so on the amount of PCI memory space. Thus,
> for some of the VGA device that need more memory never worked for me.
> 
> SeaBIOS allows this memory region to be extended to some value near
> 512MB, but even then the range is not enough.
> 
> Another problem with SeaBIOS which limits the amount of memory space
> is: SeaBIOS allocates the BAR regions as they are encountered. As far
> as I know, the BAR regions should be naturally aligned. Thus the
> simple strategy of the SeaBIOS results in large fragmentation.
> Therefore, even after increasing the PCI memory space to 512MB the BAR
> regions were unallocated.

That's an interesting trace! We'll check this here, but I bet it
contributes to the problems. Our FX 3800 has 1G memory...

> 
> I will confirm you the details of other graphics cards which do not work.

TiA,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-09 15:27                 ` Jan Kiszka
@ 2011-05-09 15:40                   ` Prasad Joshi
  2011-05-09 15:48                   ` Alex Williamson
  2011-05-11 11:23                   ` Avi Kivity
  2 siblings, 0 replies; 27+ messages in thread
From: Prasad Joshi @ 2011-05-09 15:40 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Alex Williamson, André Weidemann, kvm, Oswaldo Cadenas,
	Maxim Nikolaev

On Mon, May 9, 2011 at 4:27 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2011-05-09 16:55, Prasad Joshi wrote:
>> On Mon, May 9, 2011 at 12:14 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2011-05-05 17:17, Alex Williamson wrote:
>>>>> And what about the host? When does Linux release the legacy range?
>>>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
>>>>
>>>> Well, that's where it'd be nice if the vga arbiter was actually in more
>>>> widespread use.  It currently seems to be nothing more than a shared
>>>> mutex, but it would actually be useful if it included backends to do the
>>>> chipset vga routing changes.  I think when I was testing this, I was
>>>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
>>>
>>> Right, we had to drop the approach to pass through the secondary card
>>> for now, the arbiter was not switching properly. Haven't checked yet if
>>> VGA_EN was properly set, though the kernel code looks like it should
>>> take care of this.
>>>
>>> Even with handing out the primary adapter, we had only mixed success so
>>> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
>>> not displaying early boot messages at all. Maybe a vgabios issue.
>>> Windows was booting nevertheless - until we installed the NVIDIA
>>> drivers. Than it ran into a blue screen.
>>>
>>> BTW, what ATI adapter did you use precisely, and what did work, what not?
>>
>> Not hijacking the mail thread. Just wanted to provide some inputs.
>
> Much appreciated in fact!
>
>>
>> Few days back I had tried passing through the secondary graphics card.
>> I could pass-through two graphics cards to virtual machine.
>>
>> 02:00.0 VGA compatible controller: ATI Technologies Inc Redwood
>> [Radeon HD 5670] (prog-if 00 [VGA controller])
>>       Subsystem: PC Partner Limited Device e151
>>       Flags: bus master, fast devsel, latency 0, IRQ 87
>>       Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>       Memory at fe6e0000 (64-bit, non-prefetchable) [size=128K]
>>       I/O ports at b000 [size=256]
>>       Expansion ROM at fe6c0000 [disabled] [size=128K]
>>       Capabilities: <access denied>
>>       Kernel driver in use: radeon
>>       Kernel modules: radeon
>>
>> 07:00.0 VGA compatible controller: nVidia Corporation G86 [Quadro NVS
>> 290] (rev a1) (prog-if 00 [VGA controller])
>>        Subsystem: nVidia Corporation Device 0492
>>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>        Latency: 0, Cache Line Size: 64 bytes
>>        Interrupt: pin A routed to IRQ 24
>>        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
>>        Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>        Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
>>        Region 5: I/O ports at ec00 [size=128]
>>        Expansion ROM at fe9e0000 [disabled] [size=128K]
>>        Capabilities: <access denied>
>>        Kernel driver in use: nouveau
>>        Kernel modules: nouveau, nvidiafb
>>
>> Both of them are PCIe cards. I have one more ATI card and another
>> NVIDIA card which does not work.
>
> Interesting. That may rule out missing PCIe capabilities as source for
> the NVIDIA driver indisposition.
>
> Did you passed those cards each as primary to the guest, or was the
> guest seeing multiple adapters?

I passed the graphics device to the guest as the primary device, using
the -vga none parameter to disable the default VGA device.

> I presume you only got output after
> early boot was completed, right?

Yes, you are correct. I got a display only after KMS had started. The
initial BIOS messages were not displayed.

>
> To avoid having to deal with legacy I/O forwarding, we started with a
> dual adapter setup in the hope to leave the primary guest adapter at
> know-to-work cirrus-vga. But already in a native setup with on-board
> primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
> to its hardware in this constellation.
>

Windows never worked for me with either of the graphics cards.

>>
>> One of the reason the pass-through did not work is because of the
>> limit on amount of pci configuration memory by SeaBIOS. SeaBIOS places
>> a hard limit of 256MB or so on the amount of PCI memory space. Thus,
>> for some of the VGA device that need more memory never worked for me.
>>
>> SeaBIOS allows this memory region to be extended to some value near
>> 512MB, but even then the range is not enough.
>>
>> Another problem with SeaBIOS which limits the amount of memory space
>> is: SeaBIOS allocates the BAR regions as they are encountered. As far
>> as I know, the BAR regions should be naturally aligned. Thus the
>> simple strategy of the SeaBIOS results in large fragmentation.
>> Therefore, even after increasing the PCI memory space to 512MB the BAR
>> regions were unallocated.
>
> That's an interesting trace! We'll check this here, but I bet it
> contributes to the problems. Our FX 3800 has 1G memory...

Yes, it is one of the problems. I remember reading something about the
NVIDIA BIOS and FLR; those could be other interesting issues.

>
>>
>> I will confirm you the details of other graphics cards which do not work.
>
> TiA,
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>


* Re: Graphics pass-through
  2011-05-09 15:27                 ` Jan Kiszka
  2011-05-09 15:40                   ` Prasad Joshi
@ 2011-05-09 15:48                   ` Alex Williamson
  2011-05-09 16:00                     ` Jan Kiszka
  2011-05-11 11:25                     ` Avi Kivity
  2011-05-11 11:23                   ` Avi Kivity
  2 siblings, 2 replies; 27+ messages in thread
From: Alex Williamson @ 2011-05-09 15:48 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Prasad Joshi, André Weidemann, kvm, Oswaldo Cadenas, Maxim Nikolaev

On Mon, 2011-05-09 at 17:27 +0200, Jan Kiszka wrote:
> On 2011-05-09 16:55, Prasad Joshi wrote:
> > On Mon, May 9, 2011 at 12:14 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >> On 2011-05-05 17:17, Alex Williamson wrote:
> >>>> And what about the host? When does Linux release the legacy range?
> >>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
> >>>
> >>> Well, that's where it'd be nice if the vga arbiter was actually in more
> >>> widespread use.  It currently seems to be nothing more than a shared
> >>> mutex, but it would actually be useful if it included backends to do the
> >>> chipset vga routing changes.  I think when I was testing this, I was
> >>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
> >>
> >> Right, we had to drop the approach to pass through the secondary card
> >> for now, the arbiter was not switching properly. Haven't checked yet if
> >> VGA_EN was properly set, though the kernel code looks like it should
> >> take care of this.
> >>
> >> Even with handing out the primary adapter, we had only mixed success so
> >> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
> >> not displaying early boot messages at all. Maybe a vgabios issue.
> >> Windows was booting nevertheless - until we installed the NVIDIA
> >> drivers. Than it ran into a blue screen.
> >>
> >> BTW, what ATI adapter did you use precisely, and what did work, what not?
> > 
> > Not hijacking the mail thread. Just wanted to provide some inputs.
> 
> Much appreciated in fact!
> 
> > 
> > Few days back I had tried passing through the secondary graphics card.
> > I could pass-through two graphics cards to virtual machine.
> > 
> > 02:00.0 VGA compatible controller: ATI Technologies Inc Redwood
> > [Radeon HD 5670] (prog-if 00 [VGA controller])
> > 	Subsystem: PC Partner Limited Device e151
> > 	Flags: bus master, fast devsel, latency 0, IRQ 87
> > 	Memory at d0000000 (64-bit, prefetchable) [size=256M]
> > 	Memory at fe6e0000 (64-bit, non-prefetchable) [size=128K]
> > 	I/O ports at b000 [size=256]
> > 	Expansion ROM at fe6c0000 [disabled] [size=128K]
> > 	Capabilities: <access denied>
> > 	Kernel driver in use: radeon
> > 	Kernel modules: radeon
> > 
> > 07:00.0 VGA compatible controller: nVidia Corporation G86 [Quadro NVS
> > 290] (rev a1) (prog-if 00 [VGA controller])
> >        Subsystem: nVidia Corporation Device 0492
> >        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> >        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >        Latency: 0, Cache Line Size: 64 bytes
> >        Interrupt: pin A routed to IRQ 24
> >        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
> >        Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
> >        Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
> >        Region 5: I/O ports at ec00 [size=128]
> >        Expansion ROM at fe9e0000 [disabled] [size=128K]
> >        Capabilities: <access denied>
> >        Kernel driver in use: nouveau
> >        Kernel modules: nouveau, nvidiafb
> > 
> > Both of them are PCIe cards. I have one more ATI card and another
> > NVIDIA card which does not work.
> 
> Interesting. That may rule out missing PCIe capabilities as source for
> the NVIDIA driver indisposition.
> 
> Did you passed those cards each as primary to the guest, or was the
> guest seeing multiple adapters? I presume you only got output after
> early boot was completed, right?
> 
> To avoid having to deal with legacy I/O forwarding, we started with a
> dual adapter setup in the hope to leave the primary guest adapter at
> know-to-work cirrus-vga. But already in a native setup with on-board
> primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
> to its hardware in this constellation.
> 
> > 
> > One of the reason the pass-through did not work is because of the
> > limit on amount of pci configuration memory by SeaBIOS. SeaBIOS places
> > a hard limit of 256MB or so on the amount of PCI memory space. Thus,
> > for some of the VGA device that need more memory never worked for me.
> > 
> > SeaBIOS allows this memory region to be extended to some value near
> > 512MB, but even then the range is not enough.
> > 
> > Another problem with SeaBIOS which limits the amount of memory space
> > is: SeaBIOS allocates the BAR regions as they are encountered. As far
> > as I know, the BAR regions should be naturally aligned. Thus the
> > simple strategy of the SeaBIOS results in large fragmentation.
> > Therefore, even after increasing the PCI memory space to 512MB the BAR
> > regions were unallocated.
> 
> That's an interesting trace! We'll check this here, but I bet it
> contributes to the problems. Our FX 3800 has 1G memory...

Yes, qemu leaves far too little MMIO space to think about assigning
graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
a bigger gap via something like:

diff --git a/hw/pc.c b/hw/pc.c
index 0ea6d10..a6376f8 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -879,6 +879,8 @@ void pc_cpus_init(const char *cpu_model)
     }
 }
 
+#define PC_MAX_LOW_RAM 0xc0000000
+
 void pc_memory_init(ram_addr_t ram_size,
                     const char *kernel_filename,
                     const char *kernel_cmdline,
@@ -893,9 +895,9 @@ void pc_memory_init(ram_addr_t ram_size,
     int bios_size, isa_bios_size;
     void *fw_cfg;
 
-    if (ram_size >= 0xe0000000 ) {
-        above_4g_mem_size = ram_size - 0xe0000000;
-        below_4g_mem_size = 0xe0000000;
+    if (ram_size >= PC_MAX_LOW_RAM ) {
+        above_4g_mem_size = ram_size - PC_MAX_LOW_RAM;
+        below_4g_mem_size = PC_MAX_LOW_RAM;
     } else {
         below_4g_mem_size = ram_size;
     }

There's also a #define that needs to be changed in the SeaBIOS config.h
and an ACPI DSDT update, but I can't seem to find patches for those.
Also pay attention to the cpu_register_physical_memory calls in
i440fx_update_memory_mappings(); those can steal the legacy VGA MMIO
range from you.  I just commented them out:

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index b5589b9..1327563 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -106,11 +106,11 @@ static void i440fx_update_memory_mappings(PCII440FXState *d)
     }
     smram = d->dev.config[I440FX_SMRAM];
     if ((d->smm_enabled && (smram & 0x08)) || (smram & 0x40)) {
-        cpu_register_physical_memory(0xa0000, 0x20000, 0xa0000);
+        //cpu_register_physical_memory(0xa0000, 0x20000, 0xa0000);
     } else {
         for(addr = 0xa0000; addr < 0xc0000; addr += 4096) {
-            cpu_register_physical_memory(addr, 4096,
-                                         d->isa_page_descs[(addr - 0xa0000) >> 12]);
+            //cpu_register_physical_memory(addr, 4096,
+            //                             d->isa_page_descs[(addr - 0xa0000) >> 12]);
         }
     }
 }

That's all the tricks I remember.  Thanks,

Alex




* Re: Graphics pass-through
  2011-05-09 15:48                   ` Alex Williamson
@ 2011-05-09 16:00                     ` Jan Kiszka
  2011-05-11 11:25                     ` Avi Kivity
  1 sibling, 0 replies; 27+ messages in thread
From: Jan Kiszka @ 2011-05-09 16:00 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Prasad Joshi, André Weidemann, kvm, Oswaldo Cadenas,
	Nikolaev, Maxim

On 2011-05-09 17:48, Alex Williamson wrote:
> On Mon, 2011-05-09 at 17:27 +0200, Jan Kiszka wrote:
>> On 2011-05-09 16:55, Prasad Joshi wrote:
>>> On Mon, May 9, 2011 at 12:14 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>> On 2011-05-05 17:17, Alex Williamson wrote:
>>>>>> And what about the host? When does Linux release the legacy range?
>>>>>> Always or only when a specific (!=vga/vesa) framebuffer driver is loaded?
>>>>>
>>>>> Well, that's where it'd be nice if the vga arbiter was actually in more
>>>>> widespread use.  It currently seems to be nothing more than a shared
>>>>> mutex, but it would actually be useful if it included backends to do the
>>>>> chipset vga routing changes.  I think when I was testing this, I was
>>>>> externally poking PCI bridge chipset to toggle the VGA_EN bit.
>>>>
>>>> Right, we had to drop the approach to pass through the secondary card
>>>> for now, the arbiter was not switching properly. Haven't checked yet if
>>>> VGA_EN was properly set, though the kernel code looks like it should
>>>> take care of this.
>>>>
>>>> Even with handing out the primary adapter, we had only mixed success so
>>>> far. The onboard adapter worked well (in VESA mode), but the NVIDIA is
>>>> not displaying early boot messages at all. Maybe a vgabios issue.
>>>> Windows was booting nevertheless - until we installed the NVIDIA
>>>> drivers. Than it ran into a blue screen.
>>>>
>>>> BTW, what ATI adapter did you use precisely, and what did work, what not?
>>>
>>> Not hijacking the mail thread. Just wanted to provide some inputs.
>>
>> Much appreciated in fact!
>>
>>>
>>> Few days back I had tried passing through the secondary graphics card.
>>> I could pass-through two graphics cards to virtual machine.
>>>
>>> 02:00.0 VGA compatible controller: ATI Technologies Inc Redwood
>>> [Radeon HD 5670] (prog-if 00 [VGA controller])
>>> 	Subsystem: PC Partner Limited Device e151
>>> 	Flags: bus master, fast devsel, latency 0, IRQ 87
>>> 	Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>> 	Memory at fe6e0000 (64-bit, non-prefetchable) [size=128K]
>>> 	I/O ports at b000 [size=256]
>>> 	Expansion ROM at fe6c0000 [disabled] [size=128K]
>>> 	Capabilities: <access denied>
>>> 	Kernel driver in use: radeon
>>> 	Kernel modules: radeon
>>>
>>> 07:00.0 VGA compatible controller: nVidia Corporation G86 [Quadro NVS
>>> 290] (rev a1) (prog-if 00 [VGA controller])
>>>        Subsystem: nVidia Corporation Device 0492
>>>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>>>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>        Latency: 0, Cache Line Size: 64 bytes
>>>        Interrupt: pin A routed to IRQ 24
>>>        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
>>>        Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>>        Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
>>>        Region 5: I/O ports at ec00 [size=128]
>>>        Expansion ROM at fe9e0000 [disabled] [size=128K]
>>>        Capabilities: <access denied>
>>>        Kernel driver in use: nouveau
>>>        Kernel modules: nouveau, nvidiafb
>>>
>>> Both of them are PCIe cards. I have one more ATI card and another
>>> NVIDIA card which does not work.
>>
>> Interesting. That may rule out missing PCIe capabilities as source for
>> the NVIDIA driver indisposition.
>>
>> Did you passed those cards each as primary to the guest, or was the
>> guest seeing multiple adapters? I presume you only got output after
>> early boot was completed, right?
>>
>> To avoid having to deal with legacy I/O forwarding, we started with a
>> dual adapter setup in the hope to leave the primary guest adapter at
>> know-to-work cirrus-vga. But already in a native setup with on-board
>> primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
>> to its hardware in this constellation.
>>
>>>
>>> One of the reason the pass-through did not work is because of the
>>> limit on amount of pci configuration memory by SeaBIOS. SeaBIOS places
>>> a hard limit of 256MB or so on the amount of PCI memory space. Thus,
>>> for some of the VGA device that need more memory never worked for me.
>>>
>>> SeaBIOS allows this memory region to be extended to some value near
>>> 512MB, but even then the range is not enough.
>>>
>>> Another problem with SeaBIOS which limits the amount of memory space
>>> is: SeaBIOS allocates the BAR regions as they are encountered. As far
>>> as I know, the BAR regions should be naturally aligned. Thus the
>>> simple strategy of the SeaBIOS results in large fragmentation.
>>> Therefore, even after increasing the PCI memory space to 512MB the BAR
>>> regions were unallocated.
>>
>> That's an interesting trace! We'll check this here, but I bet it
>> contributes to the problems. Our FX 3800 has 1G memory...
> 
> Yes, qemu leaves far too little MMIO space to think about assigning
> graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
> a bigger gap via something like:
> 
> diff --git a/hw/pc.c b/hw/pc.c
> index 0ea6d10..a6376f8 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -879,6 +879,8 @@ void pc_cpus_init(const char *cpu_model)
>      }
>  }
>  
> +#define PC_MAX_LOW_RAM 0xc0000000
> +
>  void pc_memory_init(ram_addr_t ram_size,
>                      const char *kernel_filename,
>                      const char *kernel_cmdline,
> @@ -893,9 +895,9 @@ void pc_memory_init(ram_addr_t ram_size,
>      int bios_size, isa_bios_size;
>      void *fw_cfg;
>  
> -    if (ram_size >= 0xe0000000 ) {
> -        above_4g_mem_size = ram_size - 0xe0000000;
> -        below_4g_mem_size = 0xe0000000;
> +    if (ram_size >= PC_MAX_LOW_RAM ) {
> +        above_4g_mem_size = ram_size - PC_MAX_LOW_RAM;
> +        below_4g_mem_size = PC_MAX_LOW_RAM;
>      } else {
>          below_4g_mem_size = ram_size;
>      }
> 
> There's also a #define that needs to be changed in seabios config.h and
> and acpi dsdt update, but I can't seem to find patches for those.

Hmm, as this does not scale with the constantly growing memory sizes of
GPUs, I guess this should work for us as well, even with 1G. The
adapters likely only map a window to their on-board RAM.

>  Also
> pay attention to the cpu_register_physical_memory calls in
> i440fx_update_memory_mappings(), those can steal the legacy VGA MMIO
> range from you.  I just commented them out:
> 
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index b5589b9..1327563 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -106,11 +106,11 @@ static void i440fx_update_memory_mappings(PCII440FXState *d)
>      }
>      smram = d->dev.config[I440FX_SMRAM];
>      if ((d->smm_enabled && (smram & 0x08)) || (smram & 0x40)) {
> -        cpu_register_physical_memory(0xa0000, 0x20000, 0xa0000);
> +        //cpu_register_physical_memory(0xa0000, 0x20000, 0xa0000);
>      } else {
>          for(addr = 0xa0000; addr < 0xc0000; addr += 4096) {
> -            cpu_register_physical_memory(addr, 4096,
> -                                         d->isa_page_descs[(addr - 0xa0000) >> 12]);
> +            //cpu_register_physical_memory(addr, 4096,
> +            //                             d->isa_page_descs[(addr - 0xa0000) >> 12]);
>          }
>      }
>  }
> 
> That's all the tricks I remember.  Thanks,

Yeah, we are already carrying half of the above in our tree (only the
second disabling is actually needed, KVM does not support SMM). I
started looking into fixing the PAM/SMRAM mess, but it's not yet
beautiful, partly because we urgently need slot management at core level.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-09 14:55               ` Prasad Joshi
  2011-05-09 15:27                 ` Jan Kiszka
@ 2011-05-10 10:53                 ` Gerd Hoffmann
  1 sibling, 0 replies; 27+ messages in thread
From: Gerd Hoffmann @ 2011-05-10 10:53 UTC (permalink / raw)
  To: Prasad Joshi
  Cc: Jan Kiszka, Alex Williamson, André Weidemann, kvm, Oswaldo Cadenas

[-- Attachment #1: Type: text/plain, Size: 546 bytes --]

   Hi,

> Another problem with SeaBIOS which limits the amount of memory space
> is: SeaBIOS allocates the BAR regions as they are encountered. As far
> as I know, the BAR regions should be naturally aligned. Thus the
> simple strategy of the SeaBIOS results in large fragmentation.
> Therefore, even after increasing the PCI memory space to 512MB the BAR
> regions were unallocated.

Ran into this too.  Started fixing that with a second pci pass.  Not 
finished yet.  Patch attached FYI.  Feel free to grab it and run with it.

cheers,
   Gerd

[-- Attachment #2: 0001-wip-pci-move-to-two-pass-pci-initialization.patch --]
[-- Type: text/plain, Size: 9219 bytes --]

From bf779e443e92872c5e076babb9c1b1a2890402bd Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Tue, 3 May 2011 12:38:15 +0200
Subject: [PATCH] [wip] pci: move to two-pass pci initialization

This patch adds a second device scan to the pci initialization, which
counts the memory bars of the various sizes and types.  Then it
calculates the sizes and the packing of the prefetchable and
non-prefetchable pci memory windows and prints the results.

TODO #1: handle pci bridges properly.
TODO #2: actually use the calculated stuff.
---
 src/pciinit.c |  237 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 234 insertions(+), 3 deletions(-)

diff --git a/src/pciinit.c b/src/pciinit.c
index ee2e72d..993d3cb 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -15,12 +15,64 @@
 #define PCI_ROM_SLOT 6
 #define PCI_NUM_REGIONS 7
 
+#define PCI_IO_INDEX_SHIFT 2
+#define PCI_MEM_INDEX_SHIFT 12
+
 static void pci_bios_init_device_in_bus(int bus);
+static void pci_bios_check_device_in_bus(int bus);
 
 static struct pci_region pci_bios_io_region;
 static struct pci_region pci_bios_mem_region;
 static struct pci_region pci_bios_prefmem_region;
 
+static struct pci_bus {
+    /* pci region stats */
+    u32 io_count[16 - PCI_IO_INDEX_SHIFT];
+    u32 mem_count[32 - PCI_MEM_INDEX_SHIFT];
+    u32 prefmem_count[32 - PCI_MEM_INDEX_SHIFT];
+    u32 io_sum, io_max;
+    u32 mem_sum, mem_max;
+    u32 prefmem_sum, prefmem_max;
+    /* pci region assignments */
+    u32 io_bases[16 - PCI_IO_INDEX_SHIFT];
+    u32 mem_bases[32 - PCI_MEM_INDEX_SHIFT];
+    u32 prefmem_bases[32 - PCI_MEM_INDEX_SHIFT];
+    u32 io_base, mem_base, prefmem_base;
+} busses[2];
+
+static int pci_size_to_index(u32 size, int shift)
+{
+    int index = 0;
+
+    while (size > (1 << index)) {
+        index++;
+    }
+    if (index < shift)
+        index = shift;
+    index -= shift;
+    return index;
+}
+
+static int pci_io_size_to_index(u32 size)
+{
+    return pci_size_to_index(size, PCI_IO_INDEX_SHIFT);
+}
+
+static u32 pci_io_index_to_size(int index)
+{
+    return 1 << (index + PCI_IO_INDEX_SHIFT);
+}
+
+static int pci_mem_size_to_index(u32 size)
+{
+    return pci_size_to_index(size, PCI_MEM_INDEX_SHIFT);
+}
+
+static u32 pci_mem_index_to_size(int index)
+{
+    return 1 << (index + PCI_MEM_INDEX_SHIFT);
+}
+
 /* host irqs corresponding to PCI irqs A-D */
 const u8 pci_irqs[4] = {
     10, 10, 11, 11
@@ -393,6 +445,180 @@ pci_bios_init_bus(void)
     pci_bios_init_bus_rec(0 /* host bus */, &pci_bus);
 }
 
+static void pci_bios_check_device(struct pci_bus *bus, u16 bdf)
+{
+    int io_index, mem_index, prefmem_index;
+    u16 class;
+    int i;
+
+    class = pci_config_readw(bdf, PCI_CLASS_DEVICE);
+    if (class == PCI_CLASS_BRIDGE_PCI) {
+        u8 secbus = pci_config_readb(bdf, PCI_SECONDARY_BUS);
+        if (secbus >= ARRAY_SIZE(busses)) {
+            dprintf(1, "PCI: busses array too small, skipping bus %d\n", secbus);
+            return;
+        }
+        pci_bios_check_device_in_bus(secbus);
+        io_index = pci_io_size_to_index(busses[secbus].io_sum);
+        mem_index = pci_mem_size_to_index(busses[secbus].mem_sum);
+        prefmem_index = pci_mem_size_to_index(busses[secbus].prefmem_sum);
+        dprintf(1, "PCI: secondary bus %d sizes: io %x, mem %x, prefmem %x\n",
+                secbus, pci_io_index_to_size(io_index),
+                pci_mem_index_to_size(mem_index),
+                pci_mem_index_to_size(prefmem_index));
+        return;
+    }
+
+    for (i = 0; i < PCI_NUM_REGIONS; i++) {
+        u32 ofs = pci_bar(bdf, i);
+        u32 old = pci_config_readl(bdf, ofs);
+        u32 mask, index;
+        if (i == PCI_ROM_SLOT) {
+            mask = PCI_ROM_ADDRESS_MASK;
+            pci_config_writel(bdf, ofs, mask);
+        } else {
+            if (old & PCI_BASE_ADDRESS_SPACE_IO)
+                mask = PCI_BASE_ADDRESS_IO_MASK;
+            else
+                mask = PCI_BASE_ADDRESS_MEM_MASK;
+            pci_config_writel(bdf, ofs, ~0);
+        }
+        u32 val = pci_config_readl(bdf, ofs);
+        pci_config_writel(bdf, ofs, old);
+        u32 size = (~(val & mask)) + 1;
+        if (val == 0) {
+            continue;
+        }
+
+        if (val & PCI_BASE_ADDRESS_SPACE_IO) {
+            index = pci_io_size_to_index(size);
+            size = pci_io_index_to_size(index);
+            bus->io_count[index]++;
+            bus->io_sum += size;
+            if (bus->io_max < size)
+                bus->io_max = size;
+        } else {
+            index = pci_mem_size_to_index(size);
+            size = pci_mem_index_to_size(index);
+            if (val & PCI_BASE_ADDRESS_MEM_PREFETCH) {
+                bus->prefmem_count[index]++;
+                bus->prefmem_sum += size;
+                if (bus->prefmem_max < size)
+                    bus->prefmem_max = size;
+            } else {
+                bus->mem_count[index]++;
+                bus->mem_sum += size;
+                if (bus->mem_max < size)
+                    bus->mem_max = size;
+            }
+        }
+
+        if (!(val & PCI_BASE_ADDRESS_SPACE_IO) &&
+            (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64) {
+            i++;
+        }
+    }
+}
+
+static void pci_bios_check_device_in_bus(int bus)
+{
+    int bdf, max;
+
+    dprintf(1, "PCI: check devices bus %d\n", bus);
+    foreachpci_in_bus(bdf, max, bus) {
+        pci_bios_check_device(&busses[bus], bdf);
+    }
+}
+
+static void pci_bios_init_bus_bases(struct pci_bus *bus)
+{
+    u32 base, newbase, size;
+    int i;
+
+    /* assign prefetchable memory regions */
+    dprintf(1, "  prefmem max %x sum %x base %x\n",
+            bus->prefmem_max, bus->prefmem_sum, bus->prefmem_base);
+    base = bus->prefmem_base;
+    for (i = ARRAY_SIZE(bus->prefmem_count)-1; i >= 0; i--) {
+        size = pci_mem_index_to_size(i);
+        if (!bus->prefmem_count[i])
+            continue;
+        newbase = base + size * bus->prefmem_count[i];
+        dprintf(1, "    size %8x: %d bar(s), %8x -> %8x\n",
+                size, bus->prefmem_count[i], base, newbase - 1);
+        bus->prefmem_bases[i] = base;
+        base = newbase;
+    }
+
+    /* assign memory regions */
+    dprintf(1, "  mem max %x sum %x base %x\n",
+            bus->mem_max, bus->mem_sum, bus->mem_base);
+    base = bus->mem_base;
+    for (i = ARRAY_SIZE(bus->mem_count)-1; i >= 0; i--) {
+        size = pci_mem_index_to_size(i);
+        if (!bus->mem_count[i])
+            continue;
+        newbase = base + size * bus->mem_count[i];
+        dprintf(1, "    mem size %8x: %d bar(s), %8x -> %8x\n",
+                size, bus->mem_count[i], base, newbase - 1);
+        bus->mem_bases[i] = base;
+        base = newbase;
+    }
+
+    /* assign io regions */
+    dprintf(1, "  io max %x sum %x base %x\n",
+            bus->io_max, bus->io_sum, bus->io_base);
+    base = bus->io_base;
+    for (i = ARRAY_SIZE(bus->io_count)-1; i >= 0; i--) {
+        size = pci_io_index_to_size(i);
+        if (!bus->io_count[i])
+            continue;
+        newbase = base + size * bus->io_count[i];
+        dprintf(1, "    io size %4x: %d bar(s), %4x -> %4x\n",
+                size, bus->io_count[i], base, newbase - 1);
+        bus->io_bases[i] = base;
+        base = newbase;
+    }
+}
+
+static void pci_bios_init_root_regions(void)
+{
+    struct pci_bus *bus = &busses[0];
+
+    /* calculate memory windows */
+    if (bus->prefmem_sum) {
+        u32 reserved = 0xffffffff - BUILD_PCIMEM_END + 1;
+        u32 window = bus->prefmem_max;
+        while (bus->prefmem_sum + reserved > window) {
+            window += bus->prefmem_max;
+        }
+        bus->prefmem_base = 0xffffffff - window + 1;
+    } else {
+        bus->prefmem_base = BUILD_PCIMEM_END;
+    }
+
+    if (bus->mem_sum) {
+        u32 reserved = 0xffffffff - bus->prefmem_base + 1;
+        u32 window = bus->mem_max;
+        while (bus->mem_sum + reserved > window) {
+            window += bus->mem_max;
+        }
+        bus->mem_base = 0xffffffff - window + 1;
+    }
+
+    bus->io_base = 0xc000;
+
+    /* simple sanity check */
+    /* TODO: check e820 table */
+    if (bus->mem_base < RamSize) {
+        dprintf(1, "PCI: out of space for memory bars\n");
+        /* Hmm, what to do now? */
+    }
+
+    dprintf(1, "PCI: init bases bus 0 (primary)\n");
+    pci_bios_init_bus_bases(bus);
+}
+
 void
 pci_setup(void)
 {
@@ -402,15 +628,20 @@ pci_setup(void)
 
     dprintf(3, "pci setup\n");
 
+    pci_bios_init_bus();
+
+    int bdf, max;
+    dprintf(1, "PCI pass 1\n");
+    pci_bios_check_device_in_bus(0 /* host bus */);
+    pci_bios_init_root_regions();
+
     pci_region_init(&pci_bios_io_region, 0xc000, 64 * 1024 - 1);
     pci_region_init(&pci_bios_mem_region,
                     BUILD_PCIMEM_START, BUILD_PCIMEM_END - 1);
     pci_region_init(&pci_bios_prefmem_region,
                     BUILD_PCIPREFMEM_START, BUILD_PCIPREFMEM_END - 1);
 
-    pci_bios_init_bus();
-
-    int bdf, max;
+    dprintf(1, "PCI pass 2\n");
     foreachpci(bdf, max) {
         pci_init_device(pci_isa_bridge_tbl, bdf, NULL);
     }
-- 
1.7.1
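
The size probe in pci_bios_check_device() above follows the usual PCI
convention: write all-ones to the BAR, read it back, restore the old value,
then mask off the flag bits and take the two's complement. A minimal
standalone sketch of just the arithmetic is below. The macro and function
names are illustrative and are not SeaBIOS identifiers, and it assumes the
device implements all 32 address bits of the BAR.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; the bit values follow the PCI BAR layout. */
#define BAR_SPACE_IO   0x1u       /* bit 0 set: I/O BAR, clear: memory BAR */
#define BAR_IO_MASK    (~0x3u)    /* address bits of an I/O BAR */
#define BAR_MEM_MASK   (~0xfu)    /* address bits of a memory BAR */

/*
 * Given the value read back from a BAR after writing all-ones to it,
 * return the size of the region in bytes (0 if the BAR is unimplemented).
 */
uint32_t bar_size_from_readback(uint32_t readback)
{
    if (readback == 0)
        return 0;

    uint32_t mask = (readback & BAR_SPACE_IO) ? BAR_IO_MASK : BAR_MEM_MASK;
    uint32_t size = ~(readback & mask) + 1;

    /* A well-formed BAR always decodes to a power-of-two size. */
    assert((size & (size - 1)) == 0);
    return size;
}
```

For example, a 16M memory BAR reads back as 0xff000000 and a 128-byte I/O BAR
as 0xffffff81 under these assumptions.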



* Re: Graphics pass-through
  2011-05-09 15:27                 ` Jan Kiszka
  2011-05-09 15:40                   ` Prasad Joshi
  2011-05-09 15:48                   ` Alex Williamson
@ 2011-05-11 11:23                   ` Avi Kivity
  2011-05-11 12:31                     ` Jan Kiszka
  2 siblings, 1 reply; 27+ messages in thread
From: Avi Kivity @ 2011-05-11 11:23 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Prasad Joshi, Alex Williamson, André Weidemann, kvm,
	Oswaldo Cadenas, Maxim Nikolaev

On 05/09/2011 06:27 PM, Jan Kiszka wrote:
> To avoid having to deal with legacy I/O forwarding, we started with a
> dual adapter setup in the hope of leaving the primary guest adapter at
> known-to-work cirrus-vga. But already in a native setup with on-board
> primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
> to their hardware in this constellation.

IIRC one issue with nvidia is that it uses non-BAR registers to move its 
PCI BAR around, which causes cpu writes to hit empty space.

One way to see if this is the problem is to trace mmio that misses both 
kvm internal devices and qemu devices.

-- 
error compiling committee.c: too many arguments to function



* Re: Graphics pass-through
  2011-05-09 15:48                   ` Alex Williamson
  2011-05-09 16:00                     ` Jan Kiszka
@ 2011-05-11 11:25                     ` Avi Kivity
  2011-05-11 13:08                       ` Jan Kiszka
  1 sibling, 1 reply; 27+ messages in thread
From: Avi Kivity @ 2011-05-11 11:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jan Kiszka, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Maxim Nikolaev

On 05/09/2011 06:48 PM, Alex Williamson wrote:
> >  That's an interesting trace! We'll check this here, but I bet it
> >  contributes to the problems. Our FX 3800 has 1G memory...
>
> Yes, qemu leaves far too little MMIO space to think about assigning
> graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
> a bigger gap via something like:
>

What about 64-bit BARs?

-- 
error compiling committee.c: too many arguments to function



* Re: Graphics pass-through
  2011-05-11 11:23                   ` Avi Kivity
@ 2011-05-11 12:31                     ` Jan Kiszka
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Kiszka @ 2011-05-11 12:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Prasad Joshi, Alex Williamson, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 2011-05-11 13:23, Avi Kivity wrote:
> On 05/09/2011 06:27 PM, Jan Kiszka wrote:
>> To avoid having to deal with legacy I/O forwarding, we started with a
>> dual adapter setup in the hope of leaving the primary guest adapter at
>> known-to-work cirrus-vga. But already in a native setup with on-board
>> primary + NVIDIA secondary, the NVIDIA Windows drivers refused to talk
>> to their hardware in this constellation.
> 
> IIRC one issue with nvidia is that it uses non-BAR registers to move its 
> PCI BAR around, which causes cpu writes to hit empty space.

I wonder if that would still be "virtualization friendly" as the adapter
claims to be...

> 
> One way to see if this is the problem is to trace mmio that misses both 
> kvm internal devices and qemu devices.

We'll check.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-11 11:25                     ` Avi Kivity
@ 2011-05-11 13:08                       ` Jan Kiszka
  2011-05-11 13:26                         ` Avi Kivity
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Kiszka @ 2011-05-11 13:08 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 2011-05-11 13:25, Avi Kivity wrote:
> On 05/09/2011 06:48 PM, Alex Williamson wrote:
>>>  That's an interesting trace! We'll check this here, but I bet it
>>>  contributes to the problems. Our FX 3800 has 1G memory...
>>
>> Yes, qemu leaves far too little MMIO space to think about assigning
>> graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
>> a bigger gap via something like:
>>
> 
> What about 64-bit BARs?

Aren't they backward compatible? Or do you think some guest drivers may
expect to find their 64-bit capable BARs also registered as such and get
upset when seeing them as 32-bit ones?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-11 13:08                       ` Jan Kiszka
@ 2011-05-11 13:26                         ` Avi Kivity
  2011-05-11 13:50                           ` Jan Kiszka
  0 siblings, 1 reply; 27+ messages in thread
From: Avi Kivity @ 2011-05-11 13:26 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 05/11/2011 04:08 PM, Jan Kiszka wrote:
> On 2011-05-11 13:25, Avi Kivity wrote:
> >  On 05/09/2011 06:48 PM, Alex Williamson wrote:
> >>>   That's an interesting trace! We'll check this here, but I bet it
> >>>   contributes to the problems. Our FX 3800 has 1G memory...
> >>
> >>  Yes, qemu leaves far too little MMIO space to think about assigning
> >>  graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
> >>  a bigger gap via something like:
> >>
> >
> >  What about 64-bit BARs?
>
> Aren't they backward compatible? Or do you think some guest drivers may
> assume to find their 64-bit capable bars also registered as such and get
> upset when seeing them as 32-bit ones?
>

I mean, if you have a 1GB framebuffer, put it above 4GB and hope the 
OS/driver can handle it.

-- 
error compiling committee.c: too many arguments to function



* Re: Graphics pass-through
  2011-05-11 13:26                         ` Avi Kivity
@ 2011-05-11 13:50                           ` Jan Kiszka
  2011-05-11 13:54                             ` Avi Kivity
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Kiszka @ 2011-05-11 13:50 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 2011-05-11 15:26, Avi Kivity wrote:
> On 05/11/2011 04:08 PM, Jan Kiszka wrote:
>> On 2011-05-11 13:25, Avi Kivity wrote:
>>>  On 05/09/2011 06:48 PM, Alex Williamson wrote:
>>>>>   That's an interesting trace! We'll check this here, but I bet it
>>>>>   contributes to the problems. Our FX 3800 has 1G memory...
>>>>
>>>>  Yes, qemu leaves far too little MMIO space to think about assigning
>>>>  graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
>>>>  a bigger gap via something like:
>>>>
>>>
>>>  What about 64-bit BARs?
>>
>> Aren't they backward compatible? Or do you think some guest drivers may
>> assume to find their 64-bit capable bars also registered as such and get
>> upset when seeing them as 32-bit ones?
>>
> 
> I mean, if you have a 1GB framebuffer, put it above 4GB and hope the 
> OS/driver can handle it.

The question is whether the drivers actually depend on this. At least the
binary nvidia driver on my notebook is obviously happy with below-4G
BARs (and likely changes the mapped window on demand):

01:00.0 VGA compatible controller: nVidia Corporation GT216 [Quadro FX 880M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Fujitsu Limited. Device 1584
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at cc000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at ce000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 2000 [size=128]
        [virtual] Expansion ROM at cd000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nvidia

Maybe the crashing Windows driver of the FX3800 has different
requirements.
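
As a rough illustration of how much 32-bit MMIO space the three memory BARs
in the listing above take, the sketch below packs power-of-two regions
largest-first, in the same spirit as the descending-size loops in
pci_bios_init_bus_bases() from the patch earlier in this thread. The struct
and function names are hypothetical, and the base address used in the example
is an assumption rather than a value from any real firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one memory BAR; not a SeaBIOS or QEMU type. */
struct bar_req {
    uint32_t size;   /* region size in bytes, must be a power of two */
    uint32_t addr;   /* assigned base address, filled in by place_bars() */
};

/* qsort comparator: larger sizes sort first. */
static int by_size_desc(const void *a, const void *b)
{
    uint32_t sa = ((const struct bar_req *)a)->size;
    uint32_t sb = ((const struct bar_req *)b)->size;
    return (sa < sb) - (sa > sb);
}

/*
 * Assign addresses largest-first starting at 'base'. Since every size is a
 * power of two and 'base' is aligned to the largest one, each region ends up
 * naturally aligned without padding. Returns the first free address after
 * the last region. The caller must make sure the total fits below 4G.
 */
uint32_t place_bars(struct bar_req *bars, size_t n, uint32_t base)
{
    qsort(bars, n, sizeof(*bars), by_size_desc);
    assert(n == 0 || (base & (bars[0].size - 1)) == 0);

    for (size_t i = 0; i < n; i++) {
        bars[i].addr = base;
        base += bars[i].size;
    }
    return base;
}
```

With the 16M, 256M and 32M regions from the listing and an assumed base of
0xc0000000, this places them at 0xc0000000 (256M), 0xd0000000 (32M) and
0xd2000000 (16M), using 304M in total. A card with a 1G framebuffer would need
a correspondingly larger hole.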

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-11 13:50                           ` Jan Kiszka
@ 2011-05-11 13:54                             ` Avi Kivity
  2011-05-11 14:06                               ` Jan Kiszka
  0 siblings, 1 reply; 27+ messages in thread
From: Avi Kivity @ 2011-05-11 13:54 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 05/11/2011 04:50 PM, Jan Kiszka wrote:
> On 2011-05-11 15:26, Avi Kivity wrote:
> >  On 05/11/2011 04:08 PM, Jan Kiszka wrote:
> >>  On 2011-05-11 13:25, Avi Kivity wrote:
> >>>   On 05/09/2011 06:48 PM, Alex Williamson wrote:
> >>>>>    That's an interesting trace! We'll check this here, but I bet it
> >>>>>    contributes to the problems. Our FX 3800 has 1G memory...
> >>>>
> >>>>   Yes, qemu leaves far too little MMIO space to think about assigning
> >>>>   graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
> >>>>   a bigger gap via something like:
> >>>>
> >>>
> >>>   What about 64-bit BARs?
> >>
> >>  Aren't they backward compatible? Or do you think some guest drivers may
> >>  assume to find their 64-bit capable bars also registered as such and get
> >>  upset when seeing them as 32-bit ones?
> >>
> >
> >  I mean, if you have a 1GB framebuffer, put it above 4GB and hope the
> >  OS/driver can handle it.
>
> The question is if the drivers actually depend on this. At least the
> binary nvidia thing here on my notebook, it is obviously happy with
> below-4G-bars (and likely change the mapped window on demand):
>
> 01:00.0 VGA compatible controller: nVidia Corporation GT216 [Quadro FX 880M] (rev a2) (prog-if 00 [VGA controller])
>          Subsystem: Fujitsu Limited. Device 1584
>          Flags: bus master, fast devsel, latency 0, IRQ 16
>          Memory at cc000000 (32-bit, non-prefetchable) [size=16M]
>          Memory at d0000000 (64-bit, prefetchable) [size=256M]
>          Memory at ce000000 (64-bit, prefetchable) [size=32M]
>          I/O ports at 2000 [size=128]
>          [virtual] Expansion ROM at cd000000 [disabled] [size=512K]
>          Capabilities: [60] Power Management version 3
>          Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>          Capabilities: [78] Express Endpoint, MSI 00
>          Capabilities: [b4] Vendor Specific Information: Len=14<?>
>          Capabilities: [100] Virtual Channel
>          Capabilities: [128] Power Budgeting<?>
>          Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024<?>
>          Kernel driver in use: nvidia
>
> Maybe the crashing Windows driver of the FX3800 has different
> requirements.

I doubt it.  A 64-bit BAR would be configured as 32-bit on an older 
BIOS, no?

I'd guess 64-bit BARs are only needed for large BARs.

-- 
error compiling committee.c: too many arguments to function



* Re: Graphics pass-through
  2011-05-11 13:54                             ` Avi Kivity
@ 2011-05-11 14:06                               ` Jan Kiszka
  2011-05-11 14:14                                 ` Avi Kivity
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Kiszka @ 2011-05-11 14:06 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 2011-05-11 15:54, Avi Kivity wrote:
> On 05/11/2011 04:50 PM, Jan Kiszka wrote:
>> On 2011-05-11 15:26, Avi Kivity wrote:
>>>  On 05/11/2011 04:08 PM, Jan Kiszka wrote:
>>>>  On 2011-05-11 13:25, Avi Kivity wrote:
>>>>>   On 05/09/2011 06:48 PM, Alex Williamson wrote:
>>>>>>>    That's an interesting trace! We'll check this here, but I bet it
>>>>>>>    contributes to the problems. Our FX 3800 has 1G memory...
>>>>>>
>>>>>>   Yes, qemu leaves far too little MMIO space to think about assigning
>>>>>>   graphics cards.  Both of my cards have 512MB and I hacked qemu to leave
>>>>>>   a bigger gap via something like:
>>>>>>
>>>>>
>>>>>   What about 64-bit BARs?
>>>>
>>>>  Aren't they backward compatible? Or do you think some guest drivers may
>>>>  assume to find their 64-bit capable bars also registered as such and get
>>>>  upset when seeing them as 32-bit ones?
>>>>
>>>
>>>  I mean, if you have a 1GB framebuffer, put it above 4GB and hope the
>>>  OS/driver can handle it.
>>
>> The question is if the drivers actually depend on this. At least the
>> binary nvidia thing here on my notebook, it is obviously happy with
>> below-4G-bars (and likely change the mapped window on demand):
>>
>> 01:00.0 VGA compatible controller: nVidia Corporation GT216 [Quadro FX 880M] (rev a2) (prog-if 00 [VGA controller])
>>          Subsystem: Fujitsu Limited. Device 1584
>>          Flags: bus master, fast devsel, latency 0, IRQ 16
>>          Memory at cc000000 (32-bit, non-prefetchable) [size=16M]
>>          Memory at d0000000 (64-bit, prefetchable) [size=256M]
>>          Memory at ce000000 (64-bit, prefetchable) [size=32M]
>>          I/O ports at 2000 [size=128]
>>          [virtual] Expansion ROM at cd000000 [disabled] [size=512K]
>>          Capabilities: [60] Power Management version 3
>>          Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>>          Capabilities: [78] Express Endpoint, MSI 00
>>          Capabilities: [b4] Vendor Specific Information: Len=14<?>
>>          Capabilities: [100] Virtual Channel
>>          Capabilities: [128] Power Budgeting<?>
>>          Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024<?>
>>          Kernel driver in use: nvidia
>>
>> Maybe the crashing Windows driver of the FX3800 has different
>> requirements.
> 
> I doubt it.  A 64-bit BAR would be configured as 32-bit on an older 
> BIOS, no?
> 
> I'd guess 64-bit BARs are only needed for large BARs.
> 

The BIOS can't configure the BARs as 64-bit because it does not know which
type of OS (32-bit or 64-bit) is going to pick them up. But maybe 64-bit
Windows reconfigures the BARs before it starts the driver. Would we
support this?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: Graphics pass-through
  2011-05-11 14:06                               ` Jan Kiszka
@ 2011-05-11 14:14                                 ` Avi Kivity
  0 siblings, 0 replies; 27+ messages in thread
From: Avi Kivity @ 2011-05-11 14:14 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Alex Williamson, Prasad Joshi, André Weidemann, kvm,
	Oswaldo Cadenas, Nikolaev, Maxim

On 05/11/2011 05:06 PM, Jan Kiszka wrote:
> >
> >  I doubt it.  A 64-bit BAR would be configured as 32-bit on an older
> >  BIOS, no?
> >
> >  I'd guess 64-bit BARs are only needed for large BARs.
> >
>
> The BIOS can't configure the bars to 64 bit as it does not know which
> type of OS (32 or 64 bits) is going to pick them up.

If it's a really large BAR, it has no choice.  BTW, a 32-bit OS can 
handle 64-bit BARs, all it needs is PAE or PSE-36.
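
For reference, a 64-bit memory BAR occupies two consecutive 32-bit BAR slots,
with the type field in the low dword marking it as 64-bit. The sketch below
shows how the two dwords combine into a base address and how to tell whether
the region sits above 4G. The names are illustrative only and are not taken
from qemu or SeaBIOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; the bit values follow the PCI BAR layout. */
#define BAR_SPACE_IO       0x1u     /* bit 0 set: I/O BAR */
#define BAR_MEM_TYPE_MASK  0x6u     /* bits 2:1: memory BAR type */
#define BAR_MEM_TYPE_64    0x4u     /* type 10b: 64-bit address */
#define BAR_MEM_ADDR_MASK  (~0xfu)  /* address bits of the low dword */

/* True if the low dword describes a 64-bit memory BAR. */
bool bar_is_64bit(uint32_t lo)
{
    return !(lo & BAR_SPACE_IO) &&
           (lo & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/* Combine the low and high dwords of a 64-bit BAR into its base address. */
uint64_t bar64_base(uint32_t lo, uint32_t hi)
{
    assert(bar_is_64bit(lo));
    return ((uint64_t)hi << 32) | (lo & BAR_MEM_ADDR_MASK);
}

/* True if the BAR is 64-bit and currently programmed above 4G. */
bool bar64_above_4g(uint32_t lo, uint32_t hi)
{
    return bar_is_64bit(lo) && hi != 0;
}
```

A 64-bit prefetchable BAR programmed at d0000000, as in the lspci output
quoted earlier in the thread, would have a low dword of 0xd000000c and a high
dword of 0, so it is 64-bit capable but still mapped below 4G.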

> But maybe 64-bit
> Windows reconfigures the bars before it starts the driver. Would we
> support this?

Yes.  qemu doesn't know if it's the BIOS reprogramming the BARs or the 
OS.  Of course unmap+remap is not a heavily tested code path.

-- 
error compiling committee.c: too many arguments to function



end of thread, other threads:[~2011-05-11 16:17 UTC | newest]

Thread overview: 27+ messages
     [not found] <AANLkTikNHRcDquYOL3NhsxkkBYcE48nMyu4+t8t=19e7@mail.gmail.com>
2011-01-25 23:03 ` Fwd: Graphics pass-through Prasad Joshi
2011-01-26  5:12   ` Alex Williamson
2011-01-26  8:17     ` Gerd Hoffmann
2011-01-27 11:56     ` André Weidemann
2011-01-28  0:45       ` Alex Williamson
2011-01-28 17:29         ` André Weidemann
2011-01-28 16:25           ` Alex Williamson
2011-05-05  8:50         ` Jan Kiszka
2011-05-05 15:17           ` Alex Williamson
2011-05-09 11:14             ` Jan Kiszka
2011-05-09 14:29               ` Alex Williamson
2011-05-09 15:02                 ` Jan Kiszka
2011-05-09 14:55               ` Prasad Joshi
2011-05-09 15:27                 ` Jan Kiszka
2011-05-09 15:40                   ` Prasad Joshi
2011-05-09 15:48                   ` Alex Williamson
2011-05-09 16:00                     ` Jan Kiszka
2011-05-11 11:25                     ` Avi Kivity
2011-05-11 13:08                       ` Jan Kiszka
2011-05-11 13:26                         ` Avi Kivity
2011-05-11 13:50                           ` Jan Kiszka
2011-05-11 13:54                             ` Avi Kivity
2011-05-11 14:06                               ` Jan Kiszka
2011-05-11 14:14                                 ` Avi Kivity
2011-05-11 11:23                   ` Avi Kivity
2011-05-11 12:31                     ` Jan Kiszka
2011-05-10 10:53                 ` Gerd Hoffmann
