From: yurij <lnkgyv@gmail.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-devel@nongnu.org
Subject: Re: PCIe device passthrough via vfio issue
Date: Tue, 14 Jan 2020 19:02:41 +0300	[thread overview]
Message-ID: <61443d01-2206-4375-e22b-674536e0e2a0@gmail.com> (raw)
In-Reply-To: <20200114070415.3309a36e@x1.home>


On 1/14/20 5:04 PM, Alex Williamson wrote:
> On Tue, 14 Jan 2020 17:14:33 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> On 14/01/2020 03:28, Alex Williamson wrote:
>>> On Mon, 13 Jan 2020 18:49:21 +0300
>>> yurij <lnkgyv@gmail.com> wrote:
>>>    
>>>> Hello everybody!
>>>>
>>>> I have a specific PCIe device (sorry, but I can't say what it is or
>>>> what it does) whose PCI configuration space contains 4 memory BARs
>>>> (brief lspci output):
>>>>
>>>> lspci -s 84:00.00 -vvv
>>>>
>>>> . . .
>>>> 	Region 0: Memory at fa000000 (64-bit, non-prefetchable) [size=16M]
>>>> 	Region 2: Memory at fb001000 (32-bit, non-prefetchable) [size=4K]
>>>> 	Region 3: Memory at fb000000 (32-bit, non-prefetchable) [size=4K]
>>>> 	Region 4: Memory at f9000000 (64-bit, non-prefetchable) [size=16M]
>>>> . . .
>>>> Kernel driver in use: vfio-pci
>>>> . . .
>>>>
>>>> BAR0 is merged with BAR1 and BAR4 with BAR5, so they are 64 bits wide.
>>>>
>>>> I pass this PCIe device through to a virtual machine via vfio:
>>>>
>>>> -device vfio-pci,host=84:00.0,id=hostdev0,bus=pci.6,addr=0x0
>>>>
>>>> The virtual machine boots successfully, and the PCI configuration
>>>> space in the guest looks OK (brief lspci output):
>>>>
>>>> lspci -s 06:00.0 -vvv
>>>>
>>>> . . .
>>>> 	Region 0: Memory at f8000000 (64-bit, non-prefetchable) [size=16M]
>>>> 	Region 2: Memory at fa000000 (32-bit, non-prefetchable) [size=4K]
>>>> 	Region 3: Memory at fa001000 (32-bit, non-prefetchable) [size=4K]
>>>> 	Region 4: Memory at f9000000 (64-bit, non-prefetchable) [size=16M]
>>>> . . .
>>>> Kernel driver in use: custom_driver
>>>>
>>>> BAR0 is again merged with BAR1 and BAR4 with BAR5, so they are also
>>>> 64 bits wide.
>>>>
>>>> The main problem is a 4K hole in Region 0 in the virtual environment,
>>>> which breaks some device features.
>>>>
>>>> I have enabled IOMMU tracing on the host (trace_event=iommu on the
>>>> kernel command line) and turned on all iommu events:
>>>>
>>>> for i in $(find /sys/kernel/debug/tracing/events/iommu/ -name enable); do echo 1 > $i; done
>>>>
>>>> I saw the following events while the virtual machine was booting:
>>>>
>>>> # cat /sys/kernel/debug/tracing/trace
>>>> . . .
>>>>          CPU 0/KVM-3046  [051] .... 63113.338894: map: IOMMU: iova=0x00000000f8000000 paddr=0x00000000fa000000 size=24576
>>>>          CPU 0/KVM-3046  [051] .... 63113.339177: map: IOMMU: iova=0x00000000f8007000 paddr=0x00000000fa007000 size=16748544
>>>>          CPU 0/KVM-3046  [051] .... 63113.339444: map: IOMMU: iova=0x00000000fa000000 paddr=0x00000000fb001000 size=4096
>>>>          CPU 0/KVM-3046  [051] .... 63113.339697: map: IOMMU: iova=0x00000000fa001000 paddr=0x00000000fb000000 size=4096
>>>>          CPU 0/KVM-3046  [051] .... 63113.340209: map: IOMMU: iova=0x00000000f9000000 paddr=0x00000000f9000000 size=16777216
>>>> . . .
>>>>
>>>> I have enabled QEMU tracing (-trace events=/root/qemu/trace_events).
>>>> The trace output contains the following functions:
>>>> vfio_region_mmap
>>>> vfio_get_dev_region
>>>> vfio_pci_size_rom
>>>> vfio_pci_read_config
>>>> vfio_pci_write_config
>>>> vfio_iommu_map_notify
>>>> vfio_listener_region_add_iommu
>>>> vfio_listener_region_add_ram
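For reference, the file passed via -trace events= takes one event
pattern per line, and QEMU accepts glob patterns, so an events file
covering all of the above can be as short as this (a minimal sketch):

# contents of /root/qemu/trace_events
vfio_*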
>>>>
>>>> Some important excerpts from the QEMU trace:
>>>> . . .
>>>> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region 0000:84:00.0 BAR 0 mmaps[0] [0x0 - 0xffffff]
>>>> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region 0000:84:00.0 BAR 2 mmaps[0] [0x0 - 0xfff]
>>>> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region 0000:84:00.0 BAR 3 mmaps[0] [0x0 - 0xfff]
>>>> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region 0000:84:00.0 BAR 4 mmaps[0] [0x0 - 0xffffff]
>>>> . . .
>>>> Jan 13 18:17:37 VM qemu-system-x86_64[7131]: vfio_listener_region_add_ram region_add [ram] 0xf8000000 - 0xf8005fff [0x7f691e800000]
>>>> Jan 13 18:17:37 VM qemu-system-x86_64[7131]: vfio_listener_region_add_ram region_add [ram] 0xf8007000 - 0xf8ffffff [0x7f691e807000]
>>>> Jan 13 18:17:37 VM qemu-system-x86_64[7131]: vfio_listener_region_add_ram region_add [ram] 0xfa000000 - 0xfa000fff [0x7f6b5de37000]
>>>> Jan 13 18:17:37 VM qemu-system-x86_64[7131]: vfio_listener_region_add_ram region_add [ram] 0xfa001000 - 0xfa001fff [0x7f6b58004000]
>>>> Jan 13 18:17:37 VM qemu-system-x86_64[7131]: vfio_listener_region_add_ram region_add [ram] 0xf9000000 - 0xf9ffffff [0x7f691d800000]
>>>>
>>>> I use QEMU 4.0.0, which I rebuilt for tracing support
>>>> (--enable-trace-backends=syslog).
>>>>
>>>> Please help me solve this issue. Thank you!
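For anyone reproducing that rebuild, a minimal configure along these
lines should be enough (the target list is my assumption; only the
trace-backends switch is essential here):

./configure --target-list=x86_64-softmmu --enable-trace-backends=syslog
make -j$(nproc)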
>>>
>>> Something has probably created a QEMU MemoryRegion overlapping the BAR;
>>> we do this for quirks where we want to intercept a range of MMIO for
>>> emulation, but the offset 0x6000 on BAR0 doesn't sound familiar to me.
>>> Run the VM with a monitor and see if 'info mtree' provides any info on
>>> the handling of that overlap.  Thanks,
>>
>>
>> Couldn't it be an MSI-X region? 'info mtree -f' should tell exactly
>> what is going on.
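For reference, that check is easy to run from the QEMU monitor, e.g.
(a sketch, assuming the VM is started with a stdio monitor):

qemu-system-x86_64 ... -monitor stdio
(qemu) info mtree -f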
> 
> Oh, good call, that's probably it.  The PCI spec specifically
> recommends against placing non-MSIX related registers within the same
> 4K page as the vector table to avoid such things:
> 
>   If a Base Address register that maps address space for the MSI-X Table
>   or MSI-X PBA also maps other usable address space that is not
>   associated with MSI-X structures, locations (e.g., for CSRs) used in
>   the other address space must not share any naturally aligned 4-KB
>   address range with one where either MSI-X structure resides. This
>   allows system software where applicable to use different processor
>   attributes for MSI-X structures and the other address space.
> 
> We have the following QEMU vfio-pci device option to relocate the BAR
> elsewhere for hardware that violates that recommendation or for where
> the PCI spec recommended alignment isn't sufficient:
> 
>   x-msix-relocation=<OffAutoPCIBAR> - off/auto/bar0/bar1/bar2/bar3/bar4/bar5
> 
> In this case I'd probably recommend bar2 or bar3, as those BARs would
> only be extended to 8K, whereas bar0/4 would be extended to 32M.  Thanks,
> 
> Alex
> 

>   x-msix-relocation=<OffAutoPCIBAR> - off/auto/bar0/bar1/bar2/bar3/bar4/bar5

I have successfully used the 'x-msix-relocation' option:
-device vfio-pci,host=84:00.0,id=hostdev0,bus=pci.6,addr=0x0,x-msix-relocation=bar2
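In the guest, the new location of the MSI-X structures can be
double-checked via the MSI-X capability in the lspci output, along
these lines (the capability offset and table/PBA offsets below are
illustrative, not taken from my device):

lspci -s 06:00.0 -vvv
. . .
	Capabilities: [xx] MSI-X: Enable+ Count=.. Masked-
		Vector table: BAR=2 offset=00001000
		PBA: BAR=2 offset=00001800
. . .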

Now the IOMMU trace looks like this:
. . .
        CPU 0/KVM-4237  [055] ....  4750.918416: map: IOMMU: iova=0x00000000f8000000 paddr=0x00000000fa000000 size=16777216
        CPU 0/KVM-4237  [055] ....  4750.918740: map: IOMMU: iova=0x00000000fa000000 paddr=0x00000000fb001000 size=4096
        CPU 0/KVM-4237  [055] ....  4750.919069: map: IOMMU: iova=0x00000000fa002000 paddr=0x00000000fb000000 size=4096
        CPU 0/KVM-4237  [055] ....  4750.919698: map: IOMMU: iova=0x00000000f9000000 paddr=0x00000000f9000000 size=16777216
. . .

Everything looks OK now: the 4K hole is gone and BAR 0 is mapped as a
single contiguous 16 MB region. (BAR 3 also moved to 0xfa002000, since
BAR 2 was extended to 8K to hold the relocated MSI-X structures.)

Thank you very much!

-- 
with best regards
Yurij Goncharuk

