From: Alex Williamson <alex.williamson@redhat.com>
To: Chris Sanders <sanders.chris@gmail.com>
Cc: kvm@vger.kernel.org
Subject: Re: AMD KVM Pci Passthrough reports device busy
Date: Mon, 04 Jun 2012 21:44:05 -0600
Message-ID: <1338867845.23475.168.camel@bling.home>
In-Reply-To: <CALH4D1EJTOBLMh=YaC0AggeddpSchbjjBmqMSmCJs7TTOiscFA@mail.gmail.com>

On Mon, 2012-06-04 at 16:11 -0500, Chris Sanders wrote:
> Hello, I've been working for several days trying to get Pci
> Passthrough to work.  So far the #kvm IRC channel has helped me with a
> few suggestions, though that hasn't yet solved the problem.  I'm
> running CentOS 6.2 and was advised to try compiling the 3.2.18 kernel
> from kernel.org.  This has changed a few of the messages but the guest
> still fails to start.
> 
> Grepping for AMD-VI produces:
> # dmesg | grep AMD-Vi
> AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
> AMD-Vi: Lazy IO/TLB flushing enabled
> 
> After boot I'm running the following script
> echo "unbind pci-pci bridge"
> echo "1002 4383" > /sys/bus/pci/drivers/pci-stub/new_id
> echo "unbind pci device"
> echo "0000:03:07.0" > /sys/bus/pci/drivers/ivtv/unbind
> echo "4444 0803" > /sys/bus/pci/drivers/pci-stub/new_id
> echo 0000:03:07.0 > /sys/bus/pci/devices/0000\:03\:07.0/driver/unbind
> echo 0000:03:07.0 > /sys/bus/pci/drivers/pci-stub/bind
> 
> This is lspci -n showing my device behind the Pci-Pci bridge
> -[0000:00]-+-00.0
>            +-00.2
>            +-02.0-[01]--+-00.0
>            |            \-00.1
>            +-09.0-[02]----00.0
>            +-11.0
>            +-12.0
>            +-12.2
>            +-13.0
>            +-13.2
>            +-14.0
>            +-14.1
>            +-14.3
>            +-14.4-[03]----07.0
>            +-14.5
>            +-15.0-[04]--
>            +-16.0
>            +-16.2
>            +-18.0
>            +-18.1
>            +-18.2
>            +-18.3
>            +-18.4
>            \-18.5
> 
> My kvm command and error are:
> # /usr/libexec/qemu-kvm -m 3048 -net none -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
> file=/dev/vg_hdd/lv_sagetv,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native
> -device pci-assign,host=03:07.0
> Failed to assign device "(null)" : Device or resource busy
> qemu-kvm: -device pci-assign,host=03:07.0: Device 'pci-assign' could
> not be initialized

I have a setup with an AMD 990FX system and a spare PVR-350 card that I
installed to reproduce.  The sad answer is that it's nearly impossible
to assign PCI devices on these systems due to the aliasing of devices
below the PCIe-to-PCI bridge (PCIe devices are much, much easier to
assign).  If you boot with amd_iommu_dump, you'll see some output like
this:

AMD-Vi:   DEV_ALIAS_RANGE		 devid: 05:00.0 flags: 00 devid_to: 00:14.4
AMD-Vi:   DEV_RANGE_END		 devid: 05:1f.7

This says the devices on bus 5 (my bus 5 is equivalent to your bus 3)
are all aliased to device 14.4:

00:14.4 PCI bridge: Advanced Micro Devices [AMD] nee ATI SBx00 PCI to PCI Bridge (rev 40) (prog-if 01 [Subtractive decode])

What that means is that the IOMMU can't distinguish devices behind the
PCI-to-PCI bridge, so all devices are grouped as an alias to device 14.4.
Ideally this wouldn't matter, since you don't have any other devices
behind the bridge anyway.  Unfortunately amd_iommu pre-allocates IOMMU
domains for every device, so it has already allocated a domain for
device 14.4 and added device 03:07.0 to it.  Unbinding 03:07.0 from the
ivtv driver detaches that device from the domain, but when we go to
assign it to a guest we create a new domain.  Assigning 03:07.0 into
that new domain fails because the device is an alias for 00:14.4, which
is still attached to a different domain.  One way to get around this
would be to also assign the bridge to the guest, but we don't support
assigning bridges and actually reject the attempt :(
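
To check which buses are affected on a given machine, the alias ranges
can be pulled out of the amd_iommu_dump output.  The following is only a
rough sketch: alias_ranges is a made-up helper name, and it assumes the
dump always pairs a DEV_ALIAS_RANGE line with a following DEV_RANGE_END
line in the format of the two lines shown above.

```shell
#!/bin/sh
# Sketch: read kernel log text on stdin and print one line per alias
# range as "first-last -> alias".  The field layout is assumed from the
# two sample lines in this message, not from a documented format.
alias_ranges() {
    awk '
        /DEV_ALIAS_RANGE/ {
            for (i = 1; i < NF; i++) {
                if ($i == "devid:")    first  = $(i + 1)
                if ($i == "devid_to:") target = $(i + 1)
            }
            pending = 1
            next
        }
        pending && /DEV_RANGE_END/ {
            for (i = 1; i < NF; i++)
                if ($i == "devid:") last = $(i + 1)
            printf "%s-%s -> %s\n", first, last, target
            pending = 0
        }
    '
}
```

Typical use would be "dmesg | alias_ranges" after booting with
amd_iommu_dump.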

This works a bit better on Intel VT-d systems because domains are
dynamically allocated.  Thus for streaming DMA, the domain is only
created when the driver attempts to set up a DMA transaction.  When the
driver is unbound, the domain is destroyed, which allows us to set up a
new domain for device assignment.

If you don't mind running non-upstream code, VFIO is a rewrite of
device assignment for QEMU that is aware of such alias problems and
actually works in this case.  The downside is that VFIO is strict about
multifunction devices supporting ACS to prevent peer-to-peer between
domains, so it will require all of the 14.x devices to be bound to
pci-stub as well.  On my system, this includes an smbus controller,
audio device, lpc controller, and usb device.  If AMD could confirm this
device doesn't allow peer-to-peer between functions, we could relax this
requirement a bit.  The VFIO kernel and QEMU trees can be found here:

git://github.com/awilliam/linux-vfio.git (iommu-group-vfio-next-20120529)
git://github.com/awilliam/qemu-vfio.git (iommu-group-vfio)

See Documentation/vfio.txt for a description.  The major difference is
the setup of drivers.  At a minimum, the device you want to assign to
the guest needs to be bound to the vfio-pci driver, in much the same way
you bind devices to pci-stub.  All other devices in the group should be
bound to pci-stub.  You can find the other devices in the group by
following the iommu_group link in sysfs, e.g.:

/sys/bus/pci/devices/0000:03:07.0/iommu_group/devices
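
As a rough sketch of the driver setup described above, the script below
walks that directory and prints (it does not execute) the sysfs writes
that would bind the target to vfio-pci and every other group member to
pci-stub, reusing the new_id / unbind / bind sequence from your script.
The function names and the SYSFS override are hypothetical and exist
only for illustration and dry runs.  Whether every group member (the
bridge itself, for instance) actually needs rebinding is something to
check against Documentation/vfio.txt.

```shell
#!/bin/sh
# Sketch only: prints the sysfs writes instead of performing them.
# SYSFS is a hypothetical override (default /sys) for dry runs.
SYSFS=${SYSFS:-/sys}

# Print the commands that would rebind device $1 to driver $2.
plan_rebind() {
    dev=$1
    drv=$2
    path="$SYSFS/bus/pci/devices/$dev"
    vendor=$(cat "$path/vendor")
    device=$(cat "$path/device")
    # sysfs reports IDs as 0x4444; strip the prefix to match the
    # "4444 0803" form used with new_id earlier in this thread.
    echo "echo \"${vendor#0x} ${device#0x}\" > /sys/bus/pci/drivers/$drv/new_id"
    # Only emit an unbind if a driver currently owns the device.
    if [ -e "$path/driver" ]; then
        echo "echo $dev > /sys/bus/pci/devices/$dev/driver/unbind"
    fi
    echo "echo $dev > /sys/bus/pci/drivers/$drv/bind"
}

# Walk the IOMMU group of target $1: the target goes to vfio-pci,
# every other member of the group goes to pci-stub.
plan_group() {
    target=$1
    for d in "$SYSFS/bus/pci/devices/$target/iommu_group/devices"/*; do
        [ -e "$d" ] || continue
        member=$(basename "$d")
        if [ "$member" = "$target" ]; then
            plan_rebind "$member" vfio-pci
        else
            plan_rebind "$member" pci-stub
        fi
    done
}
```

Running "plan_group 0000:03:07.0" and reviewing the printed commands
before pasting any of them into a root shell keeps the sketch harmless
on a live system.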

Once you have that, simply replace pci-assign with vfio-pci on the QEMU
command line.  I'm actively trying to get this code upstream as soon as
possible, so, barring rejection, it should eventually reach mainline.
Thanks,

Alex

