* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-27 15:40 ` Pierre Morel
@ 2020-07-27 16:37 ` Matthew Rosato
2020-07-27 16:47 ` Alex Williamson
2020-07-28 8:59 ` Niklas Schnelle
2 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2020-07-27 16:37 UTC (permalink / raw)
To: Pierre Morel, Alex Williamson
Cc: schnelle, david, cohuck, qemu-devel, pasic, borntraeger, qemu-s390x, rth
On 7/27/20 11:40 AM, Pierre Morel wrote:
>
>
> On 2020-07-23 18:29, Alex Williamson wrote:
>> On Thu, 23 Jul 2020 11:13:55 -0400
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>> fails spectacularly, with errors in qemu like:
>>>
>>> qemu-system-s390x: vfio_region_read(0001:00:00.0:region0+0x0, 4)
>>> failed: Input/output error
>>>
>>> From read to bar 0 originating out of
>>> hw/s390x/s390-pci-inst.c:zpci_read_bar().
>>>
>>> So, I'm trying to figure out how to get vfio-pci happy again on
>>> s390x. From
>>> a bit of tracing, we seem to be triggering the new trap in
>>> __vfio_pci_memory_enabled(). Sure enough, if I just force this
>>> function to
>>> return 'true' as a test case, things work again.
>>> The included patch attempts to enforce the setting, which restores
>>> everything
>>> to working order but also triggers vfio_bar_restore() in the
>>> process.... So
>>> this isn't the right answer, more of a proof-of-concept.
>>>
>>> @Alex: Any guidance on what needs to happen to make qemu-s390x happy
>>> with this
>>> recent kernel change?
>>
>> Bummer! I won't claim to understand s390 PCI, but if we have a VF
>> exposed to the "host" (ie. the first level where vfio-pci is being
>> used), but we can't tell that it's a VF, how do we know whether the
>> memory bit in the command register is unimplemented because it's a VF
>> or unimplemented because the device doesn't support MMIO? How are the
>> device ID, vendor ID, and BAR registers virtualized to the host? Could
>> the memory enable bit also be emulated by that virtualization, much
>> like vfio-pci does for userspace? If the other registers are
>> virtualized, but these command register bits are left unimplemented, do
>> we need code to deduce that we have a VF based on the existence of MMIO
>> BARs, but lack of memory enable bit? Thanks,
>>
>> Alex
>
> Alex, Matt,
>
> in s390 we have the possibility to assign a virtual function to a
> logical partition as function 0.
> In this case it can not be treated as a virtual function but must be
> treated as a physical function.
> This is currently working very well.
> However, these functions do not set PCI_COMMAND_MEMORY as we need.
>
Thanks for the explanation.
> Shouldn't we fix this inside the kernel, to keep older QMEU working?
Yes, ideally.
>
> Then would it be OK to add a new bit/boolean inside the
> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
> enumeration and test inside __vfio_pci_memory_enabled() to return true?
>
> In the enumeration we have the possibility to know if the function is a
> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>
> It seems an easy fix without side effects.
>
> What do you think?
>
From my perspective at least, this would resolve the issue and is
in-line with the sort of suggestion Alex floated the other day of "do
we need code to deduce that we have a VF based on the existence of MMIO
BARs, but lack of memory enable bit?" -- it just sounds like a different
way of coming to the conclusion. Effectively we need a way to say: 'for
the purposes of memory_enable detection, treat this thing like a virtual
function because it is one, even though it doesn't always act like one'
>
>>
>>> @Nilkas/@Pierre: I wonder if this might be related to host device
>>> is_virtfn?
>>> I note that my host device lspci output looks like:
>>>
>>> 0000:00:00.0 Ethernet controller: Mellanox Technologies MT27710
>>> Family [ConnectX-4 Lx Virtual Function]
>>>
>>> But the device is not marked as is_virtfn.. Otherwise, Alex's fix
>>> from htps://lkml.org/lkml/2020/6/25/628 should cover the case.
>>>
>
> I do not think we can fix this problem by forcing the is_virtfn bit.
> AFAIU, our HW virtual function works a lot like a physical function.
>
>>>
>>>
>>> Matthew Rosato (1):
>>> s390x/pci: Enforce PCI_COMMAND_MEMORY for vfio-pci
>>>
>>> hw/s390x/s390-pci-inst.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>
>>
>
> Regards,
> Pierre
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-27 15:40 ` Pierre Morel
2020-07-27 16:37 ` Matthew Rosato
@ 2020-07-27 16:47 ` Alex Williamson
2020-07-28 9:33 ` Niklas Schnelle
2020-07-28 8:59 ` Niklas Schnelle
2 siblings, 1 reply; 14+ messages in thread
From: Alex Williamson @ 2020-07-27 16:47 UTC (permalink / raw)
To: Pierre Morel
Cc: schnelle, Matthew Rosato, david, cohuck, qemu-devel, pasic,
borntraeger, qemu-s390x, rth
On Mon, 27 Jul 2020 17:40:39 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:
> On 2020-07-23 18:29, Alex Williamson wrote:
> > On Thu, 23 Jul 2020 11:13:55 -0400
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >
> >> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
> >> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
> >> fails spectacularly, with errors in qemu like:
> >>
> >> qemu-system-s390x: vfio_region_read(0001:00:00.0:region0+0x0, 4) failed: Input/output error
> >>
> >> From read to bar 0 originating out of hw/s390x/s390-pci-inst.c:zpci_read_bar().
> >>
> >> So, I'm trying to figure out how to get vfio-pci happy again on s390x. From
> >> a bit of tracing, we seem to be triggering the new trap in
> >> __vfio_pci_memory_enabled(). Sure enough, if I just force this function to
> >> return 'true' as a test case, things work again.
> >> The included patch attempts to enforce the setting, which restores everything
> >> to working order but also triggers vfio_bar_restore() in the process.... So
> >> this isn't the right answer, more of a proof-of-concept.
> >>
> >> @Alex: Any guidance on what needs to happen to make qemu-s390x happy with this
> >> recent kernel change?
> >
> > Bummer! I won't claim to understand s390 PCI, but if we have a VF
> > exposed to the "host" (ie. the first level where vfio-pci is being
> > used), but we can't tell that it's a VF, how do we know whether the
> > memory bit in the command register is unimplemented because it's a VF
> > or unimplemented because the device doesn't support MMIO? How are the
> > device ID, vendor ID, and BAR registers virtualized to the host? Could
> > the memory enable bit also be emulated by that virtualization, much
> > like vfio-pci does for userspace? If the other registers are
> > virtualized, but these command register bits are left unimplemented, do
> > we need code to deduce that we have a VF based on the existence of MMIO
> > BARs, but lack of memory enable bit? Thanks,
> >
> > Alex
>
> Alex, Matt,
>
> in s390 we have the possibility to assign a virtual function to a
> logical partition as function 0.
> In this case it can not be treated as a virtual function but must be
> treated as a physical function.
> This is currently working very well.
> However, these functions do not set PCI_COMMAND_MEMORY as we need.
Where is the vendor and device ID virtualization done for these
devices, we can't have a PF with IDs 0000:0000.
> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>
> Then would it be OK to add a new bit/boolean inside the
> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
> enumeration and test inside __vfio_pci_memory_enabled() to return true?
Probably each instance of is_virtfn in vfio-pci should be looked at to
see if it applies to s390. If we're going to recognize this as a VF,
I'd rather we complete the emulation that the lower level hypervisor
has missed. If we can enable all the is_virtfn code on s390, then we
should probably cache is_virtfn on the vfio_pci_device object and allow
s390 a place to set it once at probe or enable time.
> In the enumeration we have the possibility to know if the function is a
> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>
> It seems an easy fix without side effects.
>
> What do you think?
It sure seems preferable to recognize that it is a VF in the kernel
than to require userspace to have arch specific hacks. Thanks,
Alex
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-27 16:47 ` Alex Williamson
@ 2020-07-28 9:33 ` Niklas Schnelle
2020-07-28 12:52 ` Alex Williamson
0 siblings, 1 reply; 14+ messages in thread
From: Niklas Schnelle @ 2020-07-28 9:33 UTC (permalink / raw)
To: Alex Williamson, Pierre Morel
Cc: Matthew Rosato, david, cohuck, qemu-devel, pasic, borntraeger,
qemu-s390x, rth
On 7/27/20 6:47 PM, Alex Williamson wrote:
> On Mon, 27 Jul 2020 17:40:39 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> On 2020-07-23 18:29, Alex Williamson wrote:
>>> On Thu, 23 Jul 2020 11:13:55 -0400
>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>
>>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>>> fails spectacularly, with errors in qemu like:
... snip ...
>>
>> Alex, Matt,
>>
>> in s390 we have the possibility to assign a virtual function to a
>> logical partition as function 0.
>> In this case it can not be treated as a virtual function but must be
>> treated as a physical function.
>> This is currently working very well.
>> However, these functions do not set PCI_COMMAND_MEMORY as we need.
>
> Where is the vendor and device ID virtualization done for these
> devices, we can't have a PF with IDs 0000:0000.
Pierre doesn't mean the Device/Vendor IDs he means it has devfn == 0
so it is the mandatory function zero on it's PCI bus, where until recently
we always had only one function per bus but with the recent multi-function
support it can act more like on other platforms with several PCI functions
sharing the same Bus e.g. a PF and the VFs created through sriov_numvfs.
That's why I'm saying that having devfn == 0 should not be very special for a VF
passed to a VM and I really don't see where it would not act like a VF passed
from any other Hypervisor.
The only really tricky part in my opinion is where during the "probing"
we do set is_virtfn so it happens both for VFs passed-through from z/VM
or LPAR and VFs created through sriov_numvfs which unlike on other platforms
are also scanned by Firmware (pdev->no_vf_scan disables the Linux side scanning).
With the fix I'm currently testing I had to do this in pcibios_enable_device()
because I also create sysfs links between VFs and their parent PFs and those
need the sysfs entries to be already created, which makes the more apropriately
sound pcibios_bus_add_device() too early.
>
>> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>>
>> Then would it be OK to add a new bit/boolean inside the
>> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
>> enumeration and test inside __vfio_pci_memory_enabled() to return true?
>
> Probably each instance of is_virtfn in vfio-pci should be looked at to
> see if it applies to s390. If we're going to recognize this as a VF,
> I'd rather we complete the emulation that the lower level hypervisor
> has missed. If we can enable all the is_virtfn code on s390, then we
> should probably cache is_virtfn on the vfio_pci_device object and allow
> s390 a place to set it once at probe or enable time.
>
>> In the enumeration we have the possibility to know if the function is a
>> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>>
>> It seems an easy fix without side effects.
>>
>> What do you think?
>
> It sure seems preferable to recognize that it is a VF in the kernel
> than to require userspace to have arch specific hacks. Thanks,
>
> Alex
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-28 9:33 ` Niklas Schnelle
@ 2020-07-28 12:52 ` Alex Williamson
2020-07-28 14:13 ` Niklas Schnelle
0 siblings, 1 reply; 14+ messages in thread
From: Alex Williamson @ 2020-07-28 12:52 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Pierre Morel, Matthew Rosato, david, cohuck, qemu-devel, pasic,
borntraeger, qemu-s390x, rth
On Tue, 28 Jul 2020 11:33:55 +0200
Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> On 7/27/20 6:47 PM, Alex Williamson wrote:
> > On Mon, 27 Jul 2020 17:40:39 +0200
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >
> >> On 2020-07-23 18:29, Alex Williamson wrote:
> >>> On Thu, 23 Jul 2020 11:13:55 -0400
> >>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >>>
> >>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
> >>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
> >>>> fails spectacularly, with errors in qemu like:
> ... snip ...
> >>
> >> Alex, Matt,
> >>
> >> in s390 we have the possibility to assign a virtual function to a
> >> logical partition as function 0.
> >> In this case it can not be treated as a virtual function but must be
> >> treated as a physical function.
> >> This is currently working very well.
> >> However, these functions do not set PCI_COMMAND_MEMORY as we need.
> >
> > Where is the vendor and device ID virtualization done for these
> > devices, we can't have a PF with IDs 0000:0000.
> Pierre doesn't mean the Device/Vendor IDs he means it has devfn == 0
> so it is the mandatory function zero on it's PCI bus, where until recently
> we always had only one function per bus but with the recent multi-function
> support it can act more like on other platforms with several PCI functions
> sharing the same Bus e.g. a PF and the VFs created through sriov_numvfs.
> That's why I'm saying that having devfn == 0 should not be very special for a VF
> passed to a VM and I really don't see where it would not act like a VF passed
> from any other Hypervisor.
My question is relative to other registers on VFs that are not
implemented in hardware, not the device address. In addition to the
memory bit of the command register, SR-IOV VFs do not implement the
vendor and device IDs, these are to be virtualized from the values
provided in the PF SR-IOV capability. It wouldn't make much sense to
expose a device with no PCI vendor or device ID, so I assume the z/VM
hypervisor is virtualizing these, but not the memory bit.
> The only really tricky part in my opinion is where during the "probing"
> we do set is_virtfn so it happens both for VFs passed-through from z/VM
> or LPAR and VFs created through sriov_numvfs which unlike on other platforms
> are also scanned by Firmware (pdev->no_vf_scan disables the Linux side scanning).
> With the fix I'm currently testing I had to do this in pcibios_enable_device()
> because I also create sysfs links between VFs and their parent PFs and those
> need the sysfs entries to be already created, which makes the more apropriately
> sound pcibios_bus_add_device() too early.
I don't think it would be wise to set is_virtfn without a physfn being
present in the OS view. I believe pci_dev.is_virtfn implies
pci_dev.physfn points to the PF device. Thanks,
Alex
> >> Shouldn't we fix this inside the kernel, to keep older QMEU working?
> >>
> >> Then would it be OK to add a new bit/boolean inside the
> >> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
> >> enumeration and test inside __vfio_pci_memory_enabled() to return true?
> >
> > Probably each instance of is_virtfn in vfio-pci should be looked at to
> > see if it applies to s390. If we're going to recognize this as a VF,
> > I'd rather we complete the emulation that the lower level hypervisor
> > has missed. If we can enable all the is_virtfn code on s390, then we
> > should probably cache is_virtfn on the vfio_pci_device object and allow
> > s390 a place to set it once at probe or enable time.
> >
> >> In the enumeration we have the possibility to know if the function is a
> >> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
> >>
> >> It seems an easy fix without side effects.
> >>
> >> What do you think?
> >
> > It sure seems preferable to recognize that it is a VF in the kernel
> > than to require userspace to have arch specific hacks. Thanks,
> >
> > Alex
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-28 12:52 ` Alex Williamson
@ 2020-07-28 14:13 ` Niklas Schnelle
0 siblings, 0 replies; 14+ messages in thread
From: Niklas Schnelle @ 2020-07-28 14:13 UTC (permalink / raw)
To: Alex Williamson
Cc: Pierre Morel, Matthew Rosato, david, cohuck, qemu-devel, pasic,
borntraeger, qemu-s390x, rth
On 7/28/20 2:52 PM, Alex Williamson wrote:
> On Tue, 28 Jul 2020 11:33:55 +0200
> Niklas Schnelle <schnelle@linux.ibm.com> wrote:
>
>> On 7/27/20 6:47 PM, Alex Williamson wrote:
>>> On Mon, 27 Jul 2020 17:40:39 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> On 2020-07-23 18:29, Alex Williamson wrote:
>>>>> On Thu, 23 Jul 2020 11:13:55 -0400
>>>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>>>
>>>>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>>>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>>>>> fails spectacularly, with errors in qemu like:
>> ... snip ...
>>>>
>>>> Alex, Matt,
>>>>
>>>> in s390 we have the possibility to assign a virtual function to a
>>>> logical partition as function 0.
>>>> In this case it can not be treated as a virtual function but must be
>>>> treated as a physical function.
>>>> This is currently working very well.
>>>> However, these functions do not set PCI_COMMAND_MEMORY as we need.
>>>
>>> Where is the vendor and device ID virtualization done for these
>>> devices, we can't have a PF with IDs 0000:0000.
>> Pierre doesn't mean the Device/Vendor IDs he means it has devfn == 0
>> so it is the mandatory function zero on it's PCI bus, where until recently
>> we always had only one function per bus but with the recent multi-function
>> support it can act more like on other platforms with several PCI functions
>> sharing the same Bus e.g. a PF and the VFs created through sriov_numvfs.
>> That's why I'm saying that having devfn == 0 should not be very special for a VF
>> passed to a VM and I really don't see where it would not act like a VF passed
>> from any other Hypervisor.
>
> My question is relative to other registers on VFs that are not
> implemented in hardware, not the device address. In addition to the
> memory bit of the command register, SR-IOV VFs do not implement the
> vendor and device IDs, these are to be virtualized from the values
> provided in the PF SR-IOV capability. It wouldn't make much sense to
> expose a device with no PCI vendor or device ID, so I assume the z/VM
> hypervisor is virtualizing these, but not the memory bit.
Ahh, yes I see. On Z these are actually already virtualized at the LPAR
level as part of the firmware based scanning I mentioned that is the
reason for pdev->no_vf_scan. This is true even for VFs created
through sriov_numvfs in the host which is why I did not realize these
don't come from hardware, even though looking at drivers/pci/iov.c it's
pretty obvious and I did touch that code before. Sorry for the confusion.
>
>> The only really tricky part in my opinion is where during the "probing"
>> we do set is_virtfn so it happens both for VFs passed-through from z/VM
>> or LPAR and VFs created through sriov_numvfs which unlike on other platforms
>> are also scanned by Firmware (pdev->no_vf_scan disables the Linux side scanning).
>> With the fix I'm currently testing I had to do this in pcibios_enable_device()
>> because I also create sysfs links between VFs and their parent PFs and those
>> need the sysfs entries to be already created, which makes the more apropriately
>> sound pcibios_bus_add_device() too early.
>
>
> I don't think it would be wise to set is_virtfn without a physfn being
> present in the OS view. I believe pci_dev.is_virtfn implies
> pci_dev.physfn points to the PF device. Thanks>
> Alex
Thank you a lot for that hint, you're absolutely right, while the
drivers do work with is_virtfn == 1 && physffn == NULL
vfio-pci gets very confused. And sorry Pierre for doubting
your is_virtfn_detached idea, this thing really is confusing.
>
>>>> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>>>>
>>>> Then would it be OK to add a new bit/boolean inside the
>>>> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
>>>> enumeration and test inside __vfio_pci_memory_enabled() to return true?
>>>
>>> Probably each instance of is_virtfn in vfio-pci should be looked at to
>>> see if it applies to s390. If we're going to recognize this as a VF,
>>> I'd rather we complete the emulation that the lower level hypervisor
>>> has missed. If we can enable all the is_virtfn code on s390, then we
>>> should probably cache is_virtfn on the vfio_pci_device object and allow
>>> s390 a place to set it once at probe or enable time.
>>>
>>>> In the enumeration we have the possibility to know if the function is a
>>>> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>>>>
>>>> It seems an easy fix without side effects.
>>>>
>>>> What do you think?
>>>
>>> It sure seems preferable to recognize that it is a VF in the kernel
>>> than to require userspace to have arch specific hacks. Thanks,
>>>
>>> Alex
>>>
>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
2020-07-27 15:40 ` Pierre Morel
2020-07-27 16:37 ` Matthew Rosato
2020-07-27 16:47 ` Alex Williamson
@ 2020-07-28 8:59 ` Niklas Schnelle
2 siblings, 0 replies; 14+ messages in thread
From: Niklas Schnelle @ 2020-07-28 8:59 UTC (permalink / raw)
To: Pierre Morel, Alex Williamson, Matthew Rosato
Cc: david, cohuck, qemu-devel, pasic, borntraeger, qemu-s390x, rth
On 7/27/20 5:40 PM, Pierre Morel wrote:
>
>
> On 2020-07-23 18:29, Alex Williamson wrote:
>> On Thu, 23 Jul 2020 11:13:55 -0400
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>> fails spectacularly, with errors in qemu like:
>>>
>>> qemu-system-s390x: vfio_region_read(0001:00:00.0:region0+0x0, 4) failed: Input/output error
>>>
>>> From read to bar 0 originating out of hw/s390x/s390-pci-inst.c:zpci_read_bar().
>>>
>>> So, I'm trying to figure out how to get vfio-pci happy again on s390x. From
>>> a bit of tracing, we seem to be triggering the new trap in
>>> __vfio_pci_memory_enabled(). Sure enough, if I just force this function to
>>> return 'true' as a test case, things work again.
>>> The included patch attempts to enforce the setting, which restores everything
>>> to working order but also triggers vfio_bar_restore() in the process.... So
>>> this isn't the right answer, more of a proof-of-concept.
>>>
>>> @Alex: Any guidance on what needs to happen to make qemu-s390x happy with this
>>> recent kernel change?
>>
>> Bummer! I won't claim to understand s390 PCI, but if we have a VF
>> exposed to the "host" (ie. the first level where vfio-pci is being
>> used), but we can't tell that it's a VF, how do we know whether the
>> memory bit in the command register is unimplemented because it's a VF
>> or unimplemented because the device doesn't support MMIO? How are the
>> device ID, vendor ID, and BAR registers virtualized to the host? Could
>> the memory enable bit also be emulated by that virtualization, much
>> like vfio-pci does for userspace? If the other registers are
>> virtualized, but these command register bits are left unimplemented, do
>> we need code to deduce that we have a VF based on the existence of MMIO
>> BARs, but lack of memory enable bit? Thanks,
>>
>> Alex
>
> Alex, Matt,
>
> in s390 we have the possibility to assign a virtual function to a logical partition as function 0.
> In this case it can not be treated as a virtual function but must be treated as a physical function.
> This is currently working very well.
Can you explain why it must be treated as a physical function and must not have is_virtfn set?
I'm currently reworking my fix for PF/VF linking not happening for all ways to attach a
VF and in that I intend to set is_virtfn = 1 also for VFs that are not linked with a PF
including those attached to an LPAR.
So far I really can not see a reason why that should not work since I was wrong before
and Firmware does tell us that these are indeed VFs (zdev->is_physfn == 0).
AFAIK on nearly all platforms guests will often have a VF as function zero on a bus
because that is what I expect to happen if you pass it through as a PCI function.
So unless I'm missing something, that just makes LPAR look more like a QEMU guest on
another platform which is very likely much more well tested than treating a VF as
a PF as we have been doing.
> However, these functions do not set PCI_COMMAND_MEMORY as we need.
>
> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>
> Then would it be OK to add a new bit/boolean inside the pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during enumeration and test inside __vfio_pci_memory_enabled() to return true?
This does not make sense to me, as I wrote above it's totally normal for VMs to see VFs detached
from the PF as they are passed-through to a QEMU guest so IMHO that's already covered by the meaning
of is_virtfn.
>
> In the enumeration we have the possibility to know if the function is a HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>
^ permalink raw reply [flat|nested] 14+ messages in thread