From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-devel@nongnu.org, eric.auger@redhat.com
Subject: Re: [Qemu-devel] [RFC/RFT PATCH 5/5] vfio/pci: Allow relocating MSI-X MMIO
Date: Tue, 19 Dec 2017 12:22:56 +1100
Message-ID: <e087ecee-b42f-afcd-bfa5-7b9ee54301df@ozlabs.ru>
In-Reply-To: <20171218072800.58456558@w520.home>

On 19/12/17 01:28, Alex Williamson wrote:
> On Tue, 19 Dec 2017 00:55:32 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> On 19/12/17 00:28, Alex Williamson wrote:
>>> On Mon, 18 Dec 2017 20:04:23 +1100
>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>   
>>>> On 18/12/17 16:02, Alex Williamson wrote:  
>>>>> With recently proposed kernel side vfio-pci changes, the MSI-X vector
>>>>> table area can be mmap'd from userspace, allowing direct access to
>>>>> non-MSI-X registers within the host page size of this area.  However,
>>>>> we only get that direct access if QEMU isn't also emulating MSI-X
>>>>> within that same page.  For x86/64 host, the system page size is 4K
>>>>> and the PCI spec recommends a minimum of 4K to 8K alignment to
>>>>> separate MSI-X from non-MSI-X registers, therefore only devices which
>>>>> don't honor this recommendation would see any improvement from this
>>>>> option.  The real targets for this feature are hosts where the page
>>>>> size exceeds the PCI spec recommended alignment, such as ARM64 systems
>>>>> with 64K pages.
>>>>>
>>>>> This new x-msix-relocation option accepts the following options:
>>>>>
>>>>>   off: Disable MSI-X relocation, use native device config (default)
>>>>>   auto: Automatically relocate MSI-X MMIO to another BAR or offset
>>>>>        based on minimum additional MMIO requirement
>>>>>   bar0..bar5: Specify the target BAR, which will either be extended
>>>>>        if the BAR exists or added if the BAR slot is available.    
>>>>
>>>>
>>>> While I am digesting the patchset, here are some test results.  
>>>
>>> Thanks for testing!
>>>   
>>>> This is the device:
>>>>
>>>> 00:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008
>>>> PCI-Express Fusion-MPT SAS-3 (rev 02)  
>>>
>>> BAR1:
>>>   
>>>> Memory at 210000000000 (64-bit, non-prefetchable) [size=64K]  
>>>
>>> BAR3:
>>>   
>>>> Memory at 210000040000 (64-bit, non-prefetchable) [size=256K]
>>>>
>>>> Capabilities: [c0] MSI-X: Enable+ Count=96 Masked-
>>>>         Vector table: BAR=1 offset=0000e000
>>>>         PBA: BAR=1 offset=0000f000
>>>>
>>>>
>>>> Test #1: x-msix-relocation = "off":
>>>>
>>>> FlatView #1
>>>>  AS "memory", root: system
>>>>  AS "cpu-memory", root: system
>>>>  Root memory region: system
>>>>   0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram
>>>>   0000210000000000-000021000000dfff (prio 0, i/o): 0001:03:00.0 BAR 1
>>>>   000021000000e000-000021000000e5ff (prio 0, i/o): msix-table
>>>>   000021000000e600-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 @000000000000e600
>>>>   0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 mmaps[0]
>>>>
>>>> Ok, works.
>>>>
>>>>
>>>> Test #2: x-msix-relocation = "auto":
>>>>
>>>> FlatView #2
>>>>  AS "memory", root: system
>>>>  AS "cpu-memory", root: system
>>>>  Root memory region: system
>>>>   0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram
>>>>   0000200080000000-00002000800005ff (prio 0, i/o): msix-table
>>>>   0000200080000600-000020008000ffff (prio 1, i/o): 0001:03:00.0 base BAR 0 @0000000000000600
>>>>   0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1
>>>>   0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 mmaps[0]
>>>>
>>>>
>>>> The guest fails probing because the first 64bit BAR is broken.
>>>>
>>>> lspci:
>>>>
>>>> Region 0: Memory at 200080000000 (32-bit, prefetchable) [size=64K]
>>>> Region 1: Memory at 210000000000 (64-bit, non-prefetchable) [size=64K]
>>>> Region 3: Memory at 210000040000 (64-bit, non-prefetchable) [size=256K]
>>>>
>>>> Capabilities: [c0] MSI-X: Enable- Count=96 Masked-
>>>>         Vector table: BAR=0 offset=00000000
>>>>         PBA: BAR=0 offset=00000600  
>>>
>>> Why do you suppose it's broken?  The added BAR0 is 32bit, it cannot be
>>> 64bit since BAR1 is implemented.  I don't see anything fundamentally
>>> different between this and the working BAR5 test below.  
>>
>>
>> BAR1 (0x14..0x17) uses BAR0 (0x10..0x13) as upper 32bits when it is 64bit
>> BAR, no?
> 
> AIUI, if BAR1 is 64bit, it consumes 0x14-0x17 for the lower 32bits and
> 0x18-0x1b for the upper 32bits, ie. it consumes BAR1 + BAR2.  Likewise
> the 64bit BAR3 also consumes BAR4.  See for instance the 82576
> datasheet:
> 
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eb-gigabit-ethernet-controller-datasheet.pdf
> 
> 9.4.11.2 shows the BAR configuration in 64bit mode, 64bit BAR0 consumes
> BAR0 (lower) + BAR1 (upper), 64bit BAR2 consumes BAR2 (lower) + BAR3
> (upper), and the MSI-X BAR becomes 64bit at BAR4, consuming BAR4
> (lower) + BAR5 (upper).  lspci would show this as Region 0, 2, 4.  The
> layout of your SAS card does seem poorly thought out that they've
> essentially precluded a 3rd 64bit BAR by starting with BAR1, but
> perhaps it's for compatibility with an equally poorly designed 32bit
> version of the device.  Thanks,


Ah, makes sense, I just never saw 64bit BARs starting from an odd offset.
My card is weird^Wunusual then:


aik@stratton2:~$ lspci -vbxs 0001:03:00.0

0001:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
        Subsystem: Super Micro Computer Inc SAS3008 PCI-Express Fusion-MPT SAS-3
        Flags: bus master, fast devsel, latency 0
        I/O ports at <unassigned> [disabled]
        Memory at 80140000 (64-bit, non-prefetchable)
        Memory at 80100000 (64-bit, non-prefetchable)
        Capabilities: <access denied>
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
00: 00 10 97 00 46 05 10 00 02 00 07 01 00 00 00 00
10: 01 00 00 00 04 00 14 80 00 00 00 00 04 00 10 80
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 08 08
30: 00 00 00 00 50 00 00 00 00 00 00 00 00 01 00 00
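
As a cross-check of the dump above, here is a minimal Python sketch that
decodes the six BAR registers at 0x10..0x27. The name decode_bars is made
up for illustration; the bit layout it relies on (bit 0 set = I/O, bits
2:1 = 10b = 64-bit memory) is the standard PCI BAR encoding. It cannot
tell an unimplemented BAR from an unassigned 32-bit one, since that would
need the usual sizing write.

```python
import struct

# Bytes 0x10..0x27 copied from the config space dump above (BAR0..BAR5).
BAR_BYTES = bytes.fromhex(
    "01000000" "04001480" "00000000" "04001080" "00000000" "00000000"
)


def decode_bars(raw):
    """Return (slot, kind, address) for every BAR that reads non-zero."""
    regs = struct.unpack("<6I", raw)  # six little-endian 32-bit registers
    bars = []
    slot = 0
    while slot < 6:
        reg = regs[slot]
        if reg & 0x1:
            # Bit 0 set: I/O space BAR, address in bits 31:2.
            bars.append((slot, "io", reg & 0xFFFFFFFC))
            slot += 1
        elif (reg >> 1) & 0x3 == 0x2 and slot + 1 < 6:
            # Type 10b: 64-bit memory BAR, the next slot holds the upper half.
            upper = regs[slot + 1]
            bars.append((slot, "mem64", (upper << 32) | (reg & 0xFFFFFFF0)))
            slot += 2
        elif reg != 0:
            # Anything else non-zero is treated as a 32-bit memory BAR.
            bars.append((slot, "mem32", reg & 0xFFFFFFF0))
            slot += 1
        else:
            # Reads as zero: unimplemented or unassigned, cannot tell here.
            slot += 1
    return bars


for slot, kind, addr in decode_bars(BAR_BYTES):
    print(f"BAR{slot}: {kind} at {addr:#x}")
```

This reports an I/O BAR0 with no address assigned, a 64-bit BAR1 at
0x80140000 (consuming BAR2) and a 64-bit BAR3 at 0x80100000 (consuming
BAR4), which agrees with the lspci summary and leaves BAR5 reading zero.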



The mpt3sas driver is funny too - it fails probing with MSIX in bar0 but
succeeds with bar5.

Region 1: Memory at 210000000000 (64-bit, non-prefetchable)
Region 3: Memory at 210000040000 (64-bit, non-prefetchable)
Region 5: Memory at 80000000 (32-bit, prefetchable)
Capabilities: [c0] MSI-X: Enable+ Count=96 Masked-
        Vector table: BAR=5 offset=00000000
        PBA: BAR=5 offset=00000600


vs.

Region 0: Memory at 80000000 (32-bit, prefetchable)
Region 1: Memory at 210000000000 (64-bit, non-prefetchable)
Region 3: Memory at 210000040000 (64-bit, non-prefetchable)
Capabilities: [c0] MSI-X: Enable- Count=96 Masked-
        Vector table: BAR=0 offset=00000000
        PBA: BAR=0 offset=00000600


Here is why:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/mpt3sas/mpt3sas_base.c?h=v4.15-rc4#n2608

It is looking for the first MMIO BAR and assumes it is the one which
implements the basic registers, including the doorbell. I am not so sure
this is that unusual.
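
To illustrate, here is a small Python model of that "take the first MMIO
BAR" behaviour applied to the two guest layouts above. This is not the
mpt3sas code, only a sketch of the behaviour as I describe it; the
function name, the dict layout and the "role" labels are assumptions made
for the example (the contents of BAR3 are not known here and are just
labelled "other").

```python
def first_mem_bar(regions):
    """Return the lowest-numbered memory BAR slot, or None if there is none.

    Models a probe loop that walks BARs in order and maps the first
    memory one it finds as the register window.
    """
    for slot in sorted(regions):
        if regions[slot]["kind"] == "mem":
            return slot
    return None


# Guest-visible layouts taken from the two lspci outputs above.
relocated_to_bar5 = {
    1: {"kind": "mem", "role": "registers"},
    3: {"kind": "mem", "role": "other"},
    5: {"kind": "mem", "role": "msix"},
}
relocated_to_bar0 = {
    0: {"kind": "mem", "role": "msix"},
    1: {"kind": "mem", "role": "registers"},
    3: {"kind": "mem", "role": "other"},
}

for name, layout in (("bar5", relocated_to_bar5), ("bar0", relocated_to_bar0)):
    slot = first_mem_bar(layout)
    print(f"x-msix-relocation={name}: driver maps BAR{slot} "
          f"({layout[slot]['role']})")
```

With MSI-X in BAR5 the first memory BAR is still BAR1 with the real
registers; with MSI-X in BAR0 the model picks the new MSI-X BAR instead,
which would explain the failed probe.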



-- 
Alexey
