* [Qemu-devel] [PATCH 0/4] four zpci patches @ 2017-08-28 8:04 Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao ` (3 more replies) 0 siblings, 4 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-28 8:04 UTC (permalink / raw) To: qemu-devel Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin This patch set contains four small zpci patches to fixup different issues. 1) fixup calculation of msix boundary 2) remove zpci idx from msix message, instead we could use PCIDevice's id to find zpci device in kvm_arch_fixup_msi_route() 3) fixup ind_offset calculation for adapter interrupt routing entry 4) introduce our own iommu_replay callback Yi Min Zhao (4): s390x/pci: fixup trap_msix() s390x/pci: remove idx from msix msg data s390x/pci: fixup ind_offset of msix routing entry s390x/pci: add iommu replay callback hw/s390x/s390-pci-bus.c | 24 +++++++++++++----------- hw/s390x/s390-pci-bus.h | 2 ++ hw/s390x/s390-pci-inst.c | 28 ++-------------------------- target/s390x/kvm.c | 11 ++++++----- 4 files changed, 23 insertions(+), 42 deletions(-) -- 2.11.0 (Apple Git-81) ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-28 8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao @ 2017-08-28 8:04 ` Yi Min Zhao 2017-08-28 14:51 ` Cornelia Huck 2017-08-30 9:47 ` Cornelia Huck 2017-08-28 8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao ` (2 subsequent siblings) 3 siblings, 2 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-28 8:04 UTC (permalink / raw) To: qemu-devel Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin The function trap_msix() is to check if pcistg instruction would access msix table entries. The correct boundary condition should be [table_offset, table_offset+entries*entry_size). But the current condition calculated misses the last entry. So let's fixup it. Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> --- hw/s390x/s390-pci-inst.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index b7beb8c36a..eba9ffb5f2 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) { if (pbdev->msix.available && pbdev->msix.table_bar == pcias && offset >= pbdev->msix.table_offset && - offset <= pbdev->msix.table_offset + - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { + offset < (pbdev->msix.table_offset + + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { return 1; } else { return 0; -- 2.11.0 (Apple Git-81) ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao @ 2017-08-28 14:51 ` Cornelia Huck 2017-08-29 4:32 ` Yi Min Zhao 2017-08-30 9:47 ` Cornelia Huck 1 sibling, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-28 14:51 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Mon, 28 Aug 2017 10:04:44 +0200 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > The function trap_msix() is to check if pcistg instruction would access > msix table entries. The correct boundary condition should be > [table_offset, table_offset+entries*entry_size). But the current > condition calculated misses the last entry. So let's fixup it. > > Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> > Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > --- > hw/s390x/s390-pci-inst.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index b7beb8c36a..eba9ffb5f2 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) > { > if (pbdev->msix.available && pbdev->msix.table_bar == pcias && > offset >= pbdev->msix.table_offset && > - offset <= pbdev->msix.table_offset + > - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { > + offset < (pbdev->msix.table_offset + > + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { > return 1; > } else { > return 0; What happened before due to the miscalculation? Write to wrong memory region? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-28 14:51 ` Cornelia Huck @ 2017-08-29 4:32 ` Yi Min Zhao 2017-08-29 8:00 ` Cornelia Huck 0 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 4:32 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/28 下午10:51, Cornelia Huck 写道: > On Mon, 28 Aug 2017 10:04:44 +0200 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> The function trap_msix() is to check if pcistg instruction would access >> msix table entries. The correct boundary condition should be >> [table_offset, table_offset+entries*entry_size). But the current >> condition calculated misses the last entry. So let's fixup it. >> >> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >> --- >> hw/s390x/s390-pci-inst.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >> index b7beb8c36a..eba9ffb5f2 100644 >> --- a/hw/s390x/s390-pci-inst.c >> +++ b/hw/s390x/s390-pci-inst.c >> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) >> { >> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && >> offset >= pbdev->msix.table_offset && >> - offset <= pbdev->msix.table_offset + >> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { >> + offset < (pbdev->msix.table_offset + >> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { >> return 1; >> } else { >> return 0; > What happened before due to the miscalculation? Write to wrong memory > region? > > We tried to plug virtio-net pci device but failed. After inspected, we found that the device uses two msix entries but the last one was missed. Then we cannot register interrupt successfully because we should call trap_msixi() in order to save some useful and arch information into msix message. But what about wrong memory region didn't happen. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 4:32 ` Yi Min Zhao @ 2017-08-29 8:00 ` Cornelia Huck 2017-08-29 8:05 ` Yi Min Zhao 2017-08-29 8:12 ` Yi Min Zhao 0 siblings, 2 replies; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 8:00 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 12:32:17 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/28 下午10:51, Cornelia Huck 写道: > > On Mon, 28 Aug 2017 10:04:44 +0200 > > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > > > >> The function trap_msix() is to check if pcistg instruction would access > >> msix table entries. The correct boundary condition should be > >> [table_offset, table_offset+entries*entry_size). But the current > >> condition calculated misses the last entry. So let's fixup it. > >> > >> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> > >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > >> --- > >> hw/s390x/s390-pci-inst.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > >> index b7beb8c36a..eba9ffb5f2 100644 > >> --- a/hw/s390x/s390-pci-inst.c > >> +++ b/hw/s390x/s390-pci-inst.c > >> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) > >> { > >> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && > >> offset >= pbdev->msix.table_offset && > >> - offset <= pbdev->msix.table_offset + > >> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { > >> + offset < (pbdev->msix.table_offset + > >> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { > >> return 1; > >> } else { > >> return 0; > > What happened before due to the miscalculation? Write to wrong memory > > region? > > > > > We tried to plug virtio-net pci device but failed. After inspected, we > found that the device uses two msix entries but the last one was > missed. Then we cannot register interrupt successfully because we > should call trap_msixi() in order to save some useful and arch > information into msix message. But what about wrong memory region > didn't happen. So, the guest just was not able to use the second msix entry, but did not get any exception? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 8:00 ` Cornelia Huck @ 2017-08-29 8:05 ` Yi Min Zhao 2017-08-29 8:12 ` Yi Min Zhao 1 sibling, 0 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 8:05 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午4:00, Cornelia Huck 写道: > On Tue, 29 Aug 2017 12:32:17 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/28 下午10:51, Cornelia Huck 写道: >>> On Mon, 28 Aug 2017 10:04:44 +0200 >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>> >>>> The function trap_msix() is to check if pcistg instruction would access >>>> msix table entries. The correct boundary condition should be >>>> [table_offset, table_offset+entries*entry_size). But the current >>>> condition calculated misses the last entry. So let's fixup it. >>>> >>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >>>> --- >>>> hw/s390x/s390-pci-inst.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >>>> index b7beb8c36a..eba9ffb5f2 100644 >>>> --- a/hw/s390x/s390-pci-inst.c >>>> +++ b/hw/s390x/s390-pci-inst.c >>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) >>>> { >>>> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && >>>> offset >= pbdev->msix.table_offset && >>>> - offset <= pbdev->msix.table_offset + >>>> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { >>>> + offset < (pbdev->msix.table_offset + >>>> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { >>>> return 1; >>>> } else { >>>> return 0; >>> What happened before due to the miscalculation? Write to wrong memory >>> region? >>> >>> >> We tried to plug virtio-net pci device but failed. After inspected, we >> found that the device uses two msix entries but the last one was >> missed. Then we cannot register interrupt successfully because we >> should call trap_msixi() in order to save some useful and arch >> information into msix message. But what about wrong memory region >> didn't happen. > So, the guest just was not able to use the second msix entry, but did > not get any exception? > > Yes, didn't get any exception. The guest just kept waiting for something (I guess that might be the response for interrupt register) and then the system had no response. What I can do is only destroy the guest. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 8:00 ` Cornelia Huck 2017-08-29 8:05 ` Yi Min Zhao @ 2017-08-29 8:12 ` Yi Min Zhao 2017-08-29 8:22 ` Cornelia Huck 1 sibling, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 8:12 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午4:00, Cornelia Huck 写道: > On Tue, 29 Aug 2017 12:32:17 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/28 下午10:51, Cornelia Huck 写道: >>> On Mon, 28 Aug 2017 10:04:44 +0200 >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>> >>>> The function trap_msix() is to check if pcistg instruction would access >>>> msix table entries. The correct boundary condition should be >>>> [table_offset, table_offset+entries*entry_size). But the current >>>> condition calculated misses the last entry. So let's fixup it. >>>> >>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >>>> --- >>>> hw/s390x/s390-pci-inst.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >>>> index b7beb8c36a..eba9ffb5f2 100644 >>>> --- a/hw/s390x/s390-pci-inst.c >>>> +++ b/hw/s390x/s390-pci-inst.c >>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) >>>> { >>>> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && >>>> offset >= pbdev->msix.table_offset && >>>> - offset <= pbdev->msix.table_offset + >>>> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { >>>> + offset < (pbdev->msix.table_offset + >>>> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { >>>> return 1; >>>> } else { >>>> return 0; >>> What happened before due to the miscalculation? Write to wrong memory >>> region? >>> >>> >> We tried to plug virtio-net pci device but failed. After inspected, we >> found that the device uses two msix entries but the last one was >> missed. Then we cannot register interrupt successfully because we >> should call trap_msixi() in order to save some useful and arch >> information into msix message. But what about wrong memory region >> didn't happen. > So, the guest just was not able to use the second msix entry, but did > not get any exception? > > Forget one thing. The zpci idx is saved in msix message. The second msix entry has not been trapped so that no idx has been saved, on the other hand idx 0 is saved. So kvm_arch_fixup_msi_route() will update irq route according to the zpci device whose idx is 0. So the wrong logic in trap_msix() will result that flic mixes different pci devices' adapter interrupts. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 8:12 ` Yi Min Zhao @ 2017-08-29 8:22 ` Cornelia Huck 2017-08-29 8:33 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 8:22 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 16:12:26 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/29 下午4:00, Cornelia Huck 写道: > > On Tue, 29 Aug 2017 12:32:17 +0800 > > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > > > >> 在 2017/8/28 下午10:51, Cornelia Huck 写道: > >>> On Mon, 28 Aug 2017 10:04:44 +0200 > >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >>> > >>>> The function trap_msix() is to check if pcistg instruction would access > >>>> msix table entries. The correct boundary condition should be > >>>> [table_offset, table_offset+entries*entry_size). But the current > >>>> condition calculated misses the last entry. So let's fixup it. > >>>> > >>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> > >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > >>>> --- > >>>> hw/s390x/s390-pci-inst.c | 4 ++-- > >>>> 1 file changed, 2 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > >>>> index b7beb8c36a..eba9ffb5f2 100644 > >>>> --- a/hw/s390x/s390-pci-inst.c > >>>> +++ b/hw/s390x/s390-pci-inst.c > >>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) > >>>> { > >>>> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && > >>>> offset >= pbdev->msix.table_offset && > >>>> - offset <= pbdev->msix.table_offset + > >>>> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { > >>>> + offset < (pbdev->msix.table_offset + > >>>> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { > >>>> return 1; > >>>> } else { > >>>> return 0; > >>> What happened before due to the miscalculation? Write to wrong memory > >>> region? > >>> > >>> > >> We tried to plug virtio-net pci device but failed. After inspected, we > >> found that the device uses two msix entries but the last one was > >> missed. Then we cannot register interrupt successfully because we > >> should call trap_msixi() in order to save some useful and arch > >> information into msix message. But what about wrong memory region > >> didn't happen. > > So, the guest just was not able to use the second msix entry, but did > > not get any exception? > > > > > Forget one thing. The zpci idx is saved in msix message. The second msix > entry has not been > trapped so that no idx has been saved, on the other hand idx 0 is saved. So > kvm_arch_fixup_msi_route() will update irq route according to the zpci > device whose idx is 0. > So the wrong logic in trap_msix() will result that flic mixes different > pci devices' adapter interrupts. Ouch. So this only ever worked for the small subset of pci devices we can passthrough (assuming none of them use more than one msix entry)? I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is the first version with usable zpci support). ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 8:22 ` Cornelia Huck @ 2017-08-29 8:33 ` Yi Min Zhao 2017-08-29 8:58 ` Cornelia Huck 0 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 8:33 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午4:22, Cornelia Huck 写道: > On Tue, 29 Aug 2017 16:12:26 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/29 下午4:00, Cornelia Huck 写道: >>> On Tue, 29 Aug 2017 12:32:17 +0800 >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>> >>>> 在 2017/8/28 下午10:51, Cornelia Huck 写道: >>>>> On Mon, 28 Aug 2017 10:04:44 +0200 >>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>>>> >>>>>> The function trap_msix() is to check if pcistg instruction would access >>>>>> msix table entries. The correct boundary condition should be >>>>>> [table_offset, table_offset+entries*entry_size). But the current >>>>>> condition calculated misses the last entry. So let's fixup it. >>>>>> >>>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> >>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >>>>>> --- >>>>>> hw/s390x/s390-pci-inst.c | 4 ++-- >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >>>>>> index b7beb8c36a..eba9ffb5f2 100644 >>>>>> --- a/hw/s390x/s390-pci-inst.c >>>>>> +++ b/hw/s390x/s390-pci-inst.c >>>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) >>>>>> { >>>>>> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && >>>>>> offset >= pbdev->msix.table_offset && >>>>>> - offset <= pbdev->msix.table_offset + >>>>>> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { >>>>>> + offset < (pbdev->msix.table_offset + >>>>>> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { >>>>>> return 1; >>>>>> } else { >>>>>> return 0; >>>>> What happened before due to the miscalculation? Write to wrong memory >>>>> region? >>>>> >>>>> >>>> We tried to plug virtio-net pci device but failed. After inspected, we >>>> found that the device uses two msix entries but the last one was >>>> missed. Then we cannot register interrupt successfully because we >>>> should call trap_msixi() in order to save some useful and arch >>>> information into msix message. But what about wrong memory region >>>> didn't happen. >>> So, the guest just was not able to use the second msix entry, but did >>> not get any exception? >>> >>> >> Forget one thing. The zpci idx is saved in msix message. The second msix >> entry has not been >> trapped so that no idx has been saved, on the other hand idx 0 is saved. So >> kvm_arch_fixup_msi_route() will update irq route according to the zpci >> device whose idx is 0. >> So the wrong logic in trap_msix() will result that flic mixes different >> pci devices' adapter interrupts. > Ouch. So this only ever worked for the small subset of pci devices we > can passthrough (assuming none of them use more than one msix entry)? Because any passthroughed pci devices which I tested has more than 2 msix entries. And not all entries will be used. I find that the last entry is never touched. But virtio pci is much fancy and only uses two entries. So I encountered the issue when I tested virtio-pci device. > > I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is > the first version with usable zpci support). > > Thanks! ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-29 8:33 ` Yi Min Zhao @ 2017-08-29 8:58 ` Cornelia Huck 0 siblings, 0 replies; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 8:58 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 16:33:52 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/29 下午4:22, Cornelia Huck 写道: > > On Tue, 29 Aug 2017 16:12:26 +0800 > > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > > > >> 在 2017/8/29 下午4:00, Cornelia Huck 写道: > >>> On Tue, 29 Aug 2017 12:32:17 +0800 > >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >>> > >>>> 在 2017/8/28 下午10:51, Cornelia Huck 写道: > >>>>> On Mon, 28 Aug 2017 10:04:44 +0200 > >>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >>>>> > >>>>>> The function trap_msix() is to check if pcistg instruction would access > >>>>>> msix table entries. The correct boundary condition should be > >>>>>> [table_offset, table_offset+entries*entry_size). But the current > >>>>>> condition calculated misses the last entry. So let's fixup it. > >>>>>> > >>>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> > >>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > >>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > >>>>>> --- > >>>>>> hw/s390x/s390-pci-inst.c | 4 ++-- > >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) > >>>>>> > >>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > >>>>>> index b7beb8c36a..eba9ffb5f2 100644 > >>>>>> --- a/hw/s390x/s390-pci-inst.c > >>>>>> +++ b/hw/s390x/s390-pci-inst.c > >>>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) > >>>>>> { > >>>>>> if (pbdev->msix.available && pbdev->msix.table_bar == pcias && > >>>>>> offset >= pbdev->msix.table_offset && > >>>>>> - offset <= pbdev->msix.table_offset + > >>>>>> - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { > >>>>>> + offset < (pbdev->msix.table_offset + > >>>>>> + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { > >>>>>> return 1; > >>>>>> } else { > >>>>>> return 0; > >>>>> What happened before due to the miscalculation? Write to wrong memory > >>>>> region? > >>>>> > >>>>> > >>>> We tried to plug virtio-net pci device but failed. After inspected, we > >>>> found that the device uses two msix entries but the last one was > >>>> missed. Then we cannot register interrupt successfully because we > >>>> should call trap_msixi() in order to save some useful and arch > >>>> information into msix message. But what about wrong memory region > >>>> didn't happen. > >>> So, the guest just was not able to use the second msix entry, but did > >>> not get any exception? > >>> > >>> > >> Forget one thing. The zpci idx is saved in msix message. The second msix > >> entry has not been > >> trapped so that no idx has been saved, on the other hand idx 0 is saved. So > >> kvm_arch_fixup_msi_route() will update irq route according to the zpci > >> device whose idx is 0. > >> So the wrong logic in trap_msix() will result that flic mixes different > >> pci devices' adapter interrupts. > > Ouch. So this only ever worked for the small subset of pci devices we > > can passthrough (assuming none of them use more than one msix entry)? > Because any passthroughed pci devices which I tested has more than 2 > msix entries. And not all > entries will be used. I find that the last entry is never touched. But > virtio pci is much fancy and only > uses two entries. So I encountered the issue when I tested virtio-pci > device. So that really sounds to me like "we've been lucky"... > > > > I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is > > the first version with usable zpci support). ...and I'll add cc:stable, as we don't really have any control from qemu which kind of devices are handled by vfio. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao 2017-08-28 14:51 ` Cornelia Huck @ 2017-08-30 9:47 ` Cornelia Huck 1 sibling, 0 replies; 28+ messages in thread From: Cornelia Huck @ 2017-08-30 9:47 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Mon, 28 Aug 2017 10:04:44 +0200 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > The function trap_msix() is to check if pcistg instruction would access > msix table entries. The correct boundary condition should be > [table_offset, table_offset+entries*entry_size). But the current > condition calculated misses the last entry. So let's fixup it. > > Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> > Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > --- > hw/s390x/s390-pci-inst.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index b7beb8c36a..eba9ffb5f2 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) > { > if (pbdev->msix.available && pbdev->msix.table_bar == pcias && > offset >= pbdev->msix.table_offset && > - offset <= pbdev->msix.table_offset + > - (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) { > + offset < (pbdev->msix.table_offset + > + pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { > return 1; > } else { > return 0; Added cc:stable and applied to s390-next. ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data 2017-08-28 8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao @ 2017-08-28 8:04 ` Yi Min Zhao 2017-08-28 15:04 ` Cornelia Huck 2017-08-28 8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao 3 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-28 8:04 UTC (permalink / raw) To: qemu-devel Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route(). So we don't need to store zpci idx in msix message data to find out the specific zpci device. Instead, we could use pci device id to find its corresponding zpci device. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> --- hw/s390x/s390-pci-bus.c | 16 +++++----------- hw/s390x/s390-pci-bus.h | 2 ++ hw/s390x/s390-pci-inst.c | 24 ------------------------ target/s390x/kvm.c | 7 +++++-- 4 files changed, 12 insertions(+), 37 deletions(-) diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index 61cfd2138f..9e1f7ff5c5 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -209,8 +209,8 @@ static S390PCIBusDevice *s390_pci_find_dev_by_uid(S390pciState *s, uint16_t uid) return NULL; } -static S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s, - const char *target) +S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s, + const char *target) { S390PCIBusDevice *pbdev; @@ -475,19 +475,13 @@ static void s390_msi_ctrl_write(void *opaque, hwaddr addr, uint64_t data, unsigned int size) { S390PCIBusDevice *pbdev = opaque; - uint32_t idx = data >> ZPCI_MSI_VEC_BITS; uint32_t vec = data & ZPCI_MSI_VEC_MASK; uint64_t ind_bit; uint32_t sum_bit; - uint32_t e = 0; - DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data, idx, vec); - - if (!pbdev) { - e |= (vec << ERR_EVENT_MVN_OFFSET); - s390_pci_generate_error_event(ERR_EVENT_NOMSI, idx, 0, addr, e); - return; - } + assert(pbdev); + DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data, + pbdev->idx, vec); if (pbdev->state != ZPCI_FS_ENABLED) { return; diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h index 67af2c12ff..820c7fa52b 100644 --- a/hw/s390x/s390-pci-bus.h +++ b/hw/s390x/s390-pci-bus.h @@ -330,6 +330,8 @@ void s390_pci_generate_error_event(uint16_t pec, uint32_t fh, uint32_t fid, S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx); S390PCIBusDevice *s390_pci_find_dev_by_fh(S390pciState *s, uint32_t fh); S390PCIBusDevice *s390_pci_find_dev_by_fid(S390pciState *s, uint32_t fid); +S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s, + const char *target); S390PCIBusDevice *s390_pci_find_next_avail_dev(S390pciState *s, S390PCIBusDevice *pbdev); diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index eba9ffb5f2..8e088f3dc9 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -413,29 +413,6 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) return 0; } -static void update_msix_table_msg_data(S390PCIBusDevice *pbdev, uint64_t offset, - uint64_t *data, uint8_t len) -{ - uint32_t val; - uint8_t *msg_data; - - if (offset % PCI_MSIX_ENTRY_SIZE != 8) { - return; - } - - if (len != 4) { - DPRINTF("access msix table msg data but len is %d\n", len); - return; - } - - msg_data = (uint8_t *)data - offset % PCI_MSIX_ENTRY_SIZE + - PCI_MSIX_ENTRY_VECTOR_CTRL; - val = pci_get_long(msg_data) | - ((pbdev->fh & FH_MASK_INDEX) << ZPCI_MSI_VEC_BITS); - pci_set_long(msg_data, val); - DPRINTF("update msix msg_data to 0x%" PRIx64 "\n", *data); -} - static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) { if (pbdev->msix.available && pbdev->msix.table_bar == pcias && @@ -508,7 +485,6 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) if (trap_msix(pbdev, offset, pcias)) { offset = offset - pbdev->msix.table_offset; mr = &pbdev->pdev->msix_table_mmio; - update_msix_table_msg_data(pbdev, offset, &data, len); } else { mr = pbdev->pdev->io_regions[pcias].memory; } diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c index 1c68c36663..e348bfb7cc 100644 --- a/target/s390x/kvm.c +++ b/target/s390x/kvm.c @@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, uint64_t address, uint32_t data, PCIDevice *dev) { S390PCIBusDevice *pbdev; - uint32_t idx = data >> ZPCI_MSI_VEC_BITS; uint32_t vec = data & ZPCI_MSI_VEC_MASK; - pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx); + if (!dev) { + return -ENODEV; + } + + pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id); if (!pbdev) { DPRINTF("add_msi_route no dev\n"); return -ENODEV; -- 2.11.0 (Apple Git-81) ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data 2017-08-28 8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao @ 2017-08-28 15:04 ` Cornelia Huck 2017-08-29 4:33 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-28 15:04 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Mon, 28 Aug 2017 10:04:45 +0200 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route(). s/PCIDevcie/PCIDevice/ > So we don't need to store zpci idx in msix message data to find out the > specific zpci device. Instead, we could use pci device id to find its > corresponding zpci device. > > Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > --- > hw/s390x/s390-pci-bus.c | 16 +++++----------- > hw/s390x/s390-pci-bus.h | 2 ++ > hw/s390x/s390-pci-inst.c | 24 ------------------------ > target/s390x/kvm.c | 7 +++++-- > 4 files changed, 12 insertions(+), 37 deletions(-) > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c > index 1c68c36663..e348bfb7cc 100644 > --- a/target/s390x/kvm.c > +++ b/target/s390x/kvm.c > @@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, > uint64_t address, uint32_t data, PCIDevice *dev) > { > S390PCIBusDevice *pbdev; > - uint32_t idx = data >> ZPCI_MSI_VEC_BITS; > uint32_t vec = data & ZPCI_MSI_VEC_MASK; > > - pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx); > + if (!dev) { > + return -ENODEV; > + } > + > + pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id); You need to replace the stub for s390_pci_find_dev_by_idx() with a stub for s390_pci_find_dev_by_target() in s390-pci-stubs.c (on my s390-next branch), so that it compiles without CONFIG_PCI. > if (!pbdev) { > DPRINTF("add_msi_route no dev\n"); > return -ENODEV; ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data 2017-08-28 15:04 ` Cornelia Huck @ 2017-08-29 4:33 ` Yi Min Zhao 0 siblings, 0 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 4:33 UTC (permalink / raw) To: qemu-devel 在 2017/8/28 下午11:04, Cornelia Huck 写道: > On Mon, 28 Aug 2017 10:04:45 +0200 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route(). > s/PCIDevcie/PCIDevice Thanks! > >> So we don't need to store zpci idx in msix message data to find out the >> specific zpci device. Instead, we could use pci device id to find its >> corresponding zpci device. >> >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >> --- >> hw/s390x/s390-pci-bus.c | 16 +++++----------- >> hw/s390x/s390-pci-bus.h | 2 ++ >> hw/s390x/s390-pci-inst.c | 24 ------------------------ >> target/s390x/kvm.c | 7 +++++-- >> 4 files changed, 12 insertions(+), 37 deletions(-) >> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c >> index 1c68c36663..e348bfb7cc 100644 >> --- a/target/s390x/kvm.c >> +++ b/target/s390x/kvm.c >> @@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, >> uint64_t address, uint32_t data, PCIDevice *dev) >> { >> S390PCIBusDevice *pbdev; >> - uint32_t idx = data >> ZPCI_MSI_VEC_BITS; >> uint32_t vec = data & ZPCI_MSI_VEC_MASK; >> >> - pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx); >> + if (!dev) { >> + return -ENODEV; >> + } >> + >> + pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id); > You need to replace the stub for s390_pci_find_dev_by_idx() with a stub > for s390_pci_find_dev_by_target() in s390-pci-stubs.c (on my s390-next > branch), so that it compiles without CONFIG_PCI. OK. Got it. > >> if (!pbdev) { >> DPRINTF("add_msi_route no dev\n"); >> return -ENODEV; > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry 2017-08-28 8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao @ 2017-08-28 8:04 ` Yi Min Zhao 2017-08-28 15:33 ` Cornelia Huck 2017-08-28 8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao 3 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-28 8:04 UTC (permalink / raw) To: qemu-devel Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin The aibvo of zpci device should be constant after issued mpcifc registering irqs instruction. Each msix vector should offset from the aibvo. But for flic adapter interrupt, we should use the absolute offset within the aibv. So let's use the aibvo+vector to fixup msix routing entry. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> --- target/s390x/kvm.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c index e348bfb7cc..c08b7757e7 100644 --- a/target/s390x/kvm.c +++ b/target/s390x/kvm.c @@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, return -ENODEV; } - pbdev->routes.adapter.ind_offset = vec; - route->type = KVM_IRQ_ROUTING_S390_ADAPTER; route->flags = 0; route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr; route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr; route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset; - route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset; + route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec; route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id; return 0; } -- 2.11.0 (Apple Git-81) ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry 2017-08-28 8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao @ 2017-08-28 15:33 ` Cornelia Huck 2017-08-29 4:39 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-28 15:33 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Mon, 28 Aug 2017 10:04:46 +0200 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > The aibvo of zpci device should be constant after issued mpcifc > registering irqs instruction. Each msix vector should offset from the > aibvo. But for flic adapter interrupt, we should use the absolute > offset within the aibv. So let's use the aibvo+vector to fixup msix > routing entry. This makes sense, but I would tweak the description a bit. "The guest uses the mpcifc instruction to register the aibvo of a zpci device, which is the starting offset of indicators in the indicator area and thus remains constant. Each msix vector is an offset from the aibvo. When we map a msix route to an adapter route, we should not modify the starting offset, but instead add the vector to the starting offset to get the absolute offset in the specific route." I'm wondering how this was ever supposed to work? > > Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > --- > target/s390x/kvm.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c > index e348bfb7cc..c08b7757e7 100644 > --- a/target/s390x/kvm.c > +++ b/target/s390x/kvm.c > @@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, > return -ENODEV; > } > > - pbdev->routes.adapter.ind_offset = vec; > - > route->type = KVM_IRQ_ROUTING_S390_ADAPTER; > route->flags = 0; > route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr; > route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr; > route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset; > - route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset; > + route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec; > route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id; > return 0; > } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry 2017-08-28 15:33 ` Cornelia Huck @ 2017-08-29 4:39 ` Yi Min Zhao 0 siblings, 0 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 4:39 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/28 下午11:33, Cornelia Huck 写道: > On Mon, 28 Aug 2017 10:04:46 +0200 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> The aibvo of zpci device should be constant after issued mpcifc >> registering irqs instruction. Each msix vector should offset from the >> aibvo. But for flic adapter interrupt, we should use the absolute >> offset within the aibv. So let's use the aibvo+vector to fixup msix >> routing entry. > This makes sense, but I would tweak the description a bit. > > "The guest uses the mpcifc instruction to register the aibvo of a zpci > device, which is the starting offset of indicators in the indicator > area and thus remains constant. Each msix vector is an offset from the > aibvo. When we map a msix route to an adapter route, we should not > modify the starting offset, but instead add the vector to the starting > offset to get the absolute offset in the specific route." Much better. Thanks! > > I'm wondering how this was ever supposed to work? I investigated this. Linux kernel always uses 0 as starting offset for aibvo. And each msix entry is only registered one time. So we didn't encounter any problem. But the logic here is not right obviously. It is just a coincidence. > >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >> --- >> target/s390x/kvm.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) >> >> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c >> index e348bfb7cc..c08b7757e7 100644 >> --- a/target/s390x/kvm.c >> +++ b/target/s390x/kvm.c >> @@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, >> return -ENODEV; >> } >> >> - pbdev->routes.adapter.ind_offset = vec; >> - >> route->type = KVM_IRQ_ROUTING_S390_ADAPTER; >> route->flags = 0; >> route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr; >> route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr; >> route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset; >> - route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset; >> + route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec; >> route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id; >> return 0; >> } > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-28 8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao ` (2 preceding siblings ...) 2017-08-28 8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao @ 2017-08-28 8:04 ` Yi Min Zhao 2017-08-28 15:57 ` Cornelia Huck 3 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-28 8:04 UTC (permalink / raw) To: qemu-devel Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin Let's introduce iommu replay callback for s390 pci iommu memory region. Currently we don't need any dma mapping replay. So let it return directly. This implementation will avoid meaningless loops calling translation callback. Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> --- hw/s390x/s390-pci-bus.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index 9e1f7ff5c5..359509ccea 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, return ret; } +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, + IOMMUNotifier *notifier) +{ + /* we don't need iommu replay currently */ + return; +} + static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus, int devfn) { @@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data) IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass); imrc->translate = s390_translate_iommu; + imrc->replay = s390_pci_iommu_replay; } static const TypeInfo s390_iommu_memory_region_info = { -- 2.11.0 (Apple Git-81) ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-28 8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao @ 2017-08-28 15:57 ` Cornelia Huck 2017-08-29 4:46 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-28 15:57 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Mon, 28 Aug 2017 10:04:47 +0200 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > Let's introduce iommu replay callback for s390 pci iommu memory region. > Currently we don't need any dma mapping replay. So let it return > directly. This implementation will avoid meaningless loops calling > translation callback. > > Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> > Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > --- > hw/s390x/s390-pci-bus.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c > index 9e1f7ff5c5..359509ccea 100644 > --- a/hw/s390x/s390-pci-bus.c > +++ b/hw/s390x/s390-pci-bus.c > @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, > return ret; > } > > +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, > + IOMMUNotifier *notifier) > +{ > + /* we don't need iommu replay currently */ This really needs to be heavier on the _why_. My guess is that anything which would require replay goes through the rpcit instruction? > + return; > +} > + > static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus, > int devfn) > { > @@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data) > IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass); > > imrc->translate = s390_translate_iommu; > + imrc->replay = s390_pci_iommu_replay; > } > > static const TypeInfo s390_iommu_memory_region_info = { ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-28 15:57 ` Cornelia Huck @ 2017-08-29 4:46 ` Yi Min Zhao 2017-08-29 8:07 ` Cornelia Huck 0 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 4:46 UTC (permalink / raw) To: qemu-devel 在 2017/8/28 下午11:57, Cornelia Huck 写道: > On Mon, 28 Aug 2017 10:04:47 +0200 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> Let's introduce iommu replay callback for s390 pci iommu memory region. >> Currently we don't need any dma mapping replay. So let it return >> directly. This implementation will avoid meaningless loops calling >> translation callback. >> >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >> --- >> hw/s390x/s390-pci-bus.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c >> index 9e1f7ff5c5..359509ccea 100644 >> --- a/hw/s390x/s390-pci-bus.c >> +++ b/hw/s390x/s390-pci-bus.c >> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, >> return ret; >> } >> >> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, >> + IOMMUNotifier *notifier) >> +{ >> + /* we don't need iommu replay currently */ > This really needs to be heavier on the _why_. My guess is that anything > which would require replay goes through the rpcit instruction? My understanding is: Our arch is different from others. Each pci device has its own iommu, not like other archs' implementation. So currently there must be no existing mappings belonging to any newly plugged pci device whose iommu doesn't have any mapping at the time. In addition, it's also due to vfio blocking migration. If vfio-pci supports migration in future, we could implement iommu replay by that time. > >> + return; >> +} >> + >> static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus, >> int devfn) >> { >> @@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data) >> IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass); >> >> imrc->translate = s390_translate_iommu; >> + imrc->replay = s390_pci_iommu_replay; >> } >> >> static const TypeInfo s390_iommu_memory_region_info = { > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 4:46 ` Yi Min Zhao @ 2017-08-29 8:07 ` Cornelia Huck 2017-08-29 8:26 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 8:07 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson [Restored cc:s. Please remember to do reply-all.] On Tue, 29 Aug 2017 12:46:51 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/28 下午11:57, Cornelia Huck 写道: > > On Mon, 28 Aug 2017 10:04:47 +0200 > > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > > > >> Let's introduce iommu replay callback for s390 pci iommu memory region. > >> Currently we don't need any dma mapping replay. So let it return > >> directly. This implementation will avoid meaningless loops calling > >> translation callback. > >> > >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > >> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> > >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > >> --- > >> hw/s390x/s390-pci-bus.c | 8 ++++++++ > >> 1 file changed, 8 insertions(+) > >> > >> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c > >> index 9e1f7ff5c5..359509ccea 100644 > >> --- a/hw/s390x/s390-pci-bus.c > >> +++ b/hw/s390x/s390-pci-bus.c > >> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, > >> return ret; > >> } > >> > >> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, > >> + IOMMUNotifier *notifier) > >> +{ > >> + /* we don't need iommu replay currently */ > > This really needs to be heavier on the _why_. My guess is that anything > > which would require replay goes through the rpcit instruction? > My understanding is: > Our arch is different from others. Each pci device has its own iommu, not > like other archs' implementation. So currently there must be no existing > mappings belonging to any newly plugged pci device whose iommu doesn't > have any mapping at the time. So please put that explanation into the function. (Also, "currently"? Are we expecting it to change?) > In addition, it's also due to vfio blocking migration. If vfio-pci supports > migration in future, we could implement iommu replay by that time. That's not an argument: This is the base zpci support, it should not depend on the supported devices and what they do. (What's the status of virtio-pci, btw? Does it work with your patches applied, or is there still more to be done?) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 8:07 ` Cornelia Huck @ 2017-08-29 8:26 ` Yi Min Zhao 2017-08-29 9:33 ` Cornelia Huck 0 siblings, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 8:26 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午4:07, Cornelia Huck 写道: > [Restored cc:s. Please remember to do reply-all.] > > On Tue, 29 Aug 2017 12:46:51 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/28 下午11:57, Cornelia Huck 写道: >>> On Mon, 28 Aug 2017 10:04:47 +0200 >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>> >>>> Let's introduce iommu replay callback for s390 pci iommu memory region. >>>> Currently we don't need any dma mapping replay. So let it return >>>> directly. This implementation will avoid meaningless loops calling >>>> translation callback. >>>> >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >>>> --- >>>> hw/s390x/s390-pci-bus.c | 8 ++++++++ >>>> 1 file changed, 8 insertions(+) >>>> >>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c >>>> index 9e1f7ff5c5..359509ccea 100644 >>>> --- a/hw/s390x/s390-pci-bus.c >>>> +++ b/hw/s390x/s390-pci-bus.c >>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, >>>> return ret; >>>> } >>>> >>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, >>>> + IOMMUNotifier *notifier) >>>> +{ >>>> + /* we don't need iommu replay currently */ >>> This really needs to be heavier on the _why_. My guess is that anything >>> which would require replay goes through the rpcit instruction? >> My understanding is: >> Our arch is different from others. Each pci device has its own iommu, not >> like other archs' implementation. So currently there must be no existing >> mappings belonging to any newly plugged pci device whose iommu doesn't >> have any mapping at the time. > So please put that explanation into the function. (Also, "currently"? > Are we expecting it to change?) The iommu replay function is originally introduced for vfio. I think they want to re-build the existing mappings because vfio has a copy of mappings in kernel. For our case, the mappings would be cleanup when a pci device unplugged, and new mappings would be created when a pci device plugged. I think replay only can happen during vfio-pci device migration. > >> In addition, it's also due to vfio blocking migration. If vfio-pci supports >> migration in future, we could implement iommu replay by that time. > That's not an argument: This is the base zpci support, it should not > depend on the supported devices and what they do. (What's the status of > virtio-pci, btw? Does it work with your patches applied, or is there > still more to be done?) > > My understanding is virtio-pci doesn't need replay. All mappings of any pci device are existing in guest memory. Once the mappings is built, all of them could be migrated along with the guest system. But I might misunderstand it. Please correct me. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 8:26 ` Yi Min Zhao @ 2017-08-29 9:33 ` Cornelia Huck 2017-08-29 9:49 ` Cornelia Huck 2017-08-29 9:51 ` Yi Min Zhao 0 siblings, 2 replies; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 9:33 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 16:26:10 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/29 下午4:07, Cornelia Huck 写道: > > [Restored cc:s. Please remember to do reply-all.] > > > > On Tue, 29 Aug 2017 12:46:51 +0800 > > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > > > >> 在 2017/8/28 下午11:57, Cornelia Huck 写道: > >>> On Mon, 28 Aug 2017 10:04:47 +0200 > >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >>> > >>>> Let's introduce iommu replay callback for s390 pci iommu memory region. > >>>> Currently we don't need any dma mapping replay. So let it return > >>>> directly. This implementation will avoid meaningless loops calling > >>>> translation callback. > >>>> > >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> > >>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> > >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> > >>>> --- > >>>> hw/s390x/s390-pci-bus.c | 8 ++++++++ > >>>> 1 file changed, 8 insertions(+) > >>>> > >>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c > >>>> index 9e1f7ff5c5..359509ccea 100644 > >>>> --- a/hw/s390x/s390-pci-bus.c > >>>> +++ b/hw/s390x/s390-pci-bus.c > >>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, > >>>> return ret; > >>>> } > >>>> > >>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, > >>>> + IOMMUNotifier *notifier) > >>>> +{ > >>>> + /* we don't need iommu replay currently */ > >>> This really needs to be heavier on the _why_. My guess is that anything > >>> which would require replay goes through the rpcit instruction? > >> My understanding is: > >> Our arch is different from others. Each pci device has its own iommu, not > >> like other archs' implementation. So currently there must be no existing > >> mappings belonging to any newly plugged pci device whose iommu doesn't > >> have any mapping at the time. > > So please put that explanation into the function. (Also, "currently"? > > Are we expecting it to change?) > The iommu replay function is originally introduced for vfio. I think > they want to re-build > the existing mappings because vfio has a copy of mappings in kernel. For > our case, > the mappings would be cleanup when a pci device unplugged, and new mappings > would be created when a pci device plugged. I think replay only can > happen during > vfio-pci device migration. So, the base reason is that it is impossible to plug a pci device on s390x that already has iommu mappings which need to be replayed, which is due to the "one iommu per zpci device" construct (and independent of which devices need replay on other architectures)? Then this should go into the explanation above. (And I'd still like to know what "currently" refers to. I don't like to rely on some kind of implicit assumptions - are we expecting this to break?) > > > >> In addition, it's also due to vfio blocking migration. If vfio-pci supports > >> migration in future, we could implement iommu replay by that time. > > That's not an argument: This is the base zpci support, it should not > > depend on the supported devices and what they do. (What's the status of > > virtio-pci, btw? Does it work with your patches applied, or is there > > still more to be done?) > > > > > My understanding is virtio-pci doesn't need replay. All mappings of any > pci device are existing in > guest memory. Once the mappings is built, all of them could be migrated > along with the guest > system. But I might misunderstand it. Please correct me. My question was whether virtio-pci works with your patches on top at all - last time I checked on master, virtio-pci devices failed to realize with a "msi-x is mandatory" message. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 9:33 ` Cornelia Huck @ 2017-08-29 9:49 ` Cornelia Huck 2017-08-29 9:53 ` Yi Min Zhao 2017-08-29 9:51 ` Yi Min Zhao 1 sibling, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 9:49 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 11:33:53 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > My question was whether virtio-pci works with your patches on top at > all - last time I checked on master, virtio-pci devices failed to > realize with a "msi-x is mandatory" message. Just checked again, I still get qemu-system-s390x: -device virtio-rng-pci: MSI-X support is mandatory in the S390 architecture Any clue to what is missing? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 9:49 ` Cornelia Huck @ 2017-08-29 9:53 ` Yi Min Zhao 0 siblings, 0 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 9:53 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午5:49, Cornelia Huck 写道: > On Tue, 29 Aug 2017 11:33:53 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > >> My question was whether virtio-pci works with your patches on top at >> all - last time I checked on master, virtio-pci devices failed to >> realize with a "msi-x is mandatory" message. > Just checked again, I still get > > qemu-system-s390x: -device virtio-rng-pci: MSI-X support is mandatory in the S390 architecture > > Any clue to what is missing? > > virtio-rng-pci doesn't support msix. But msix is required on s390 arch. So we avoid plugging any pci device without msix support. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 9:33 ` Cornelia Huck 2017-08-29 9:49 ` Cornelia Huck @ 2017-08-29 9:51 ` Yi Min Zhao 2017-08-29 9:57 ` Cornelia Huck 1 sibling, 1 reply; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 9:51 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午5:33, Cornelia Huck 写道: > On Tue, 29 Aug 2017 16:26:10 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/29 下午4:07, Cornelia Huck 写道: >>> [Restored cc:s. Please remember to do reply-all.] >>> >>> On Tue, 29 Aug 2017 12:46:51 +0800 >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>> >>>> 在 2017/8/28 下午11:57, Cornelia Huck 写道: >>>>> On Mon, 28 Aug 2017 10:04:47 +0200 >>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: >>>>> >>>>>> Let's introduce iommu replay callback for s390 pci iommu memory region. >>>>>> Currently we don't need any dma mapping replay. So let it return >>>>>> directly. This implementation will avoid meaningless loops calling >>>>>> translation callback. >>>>>> >>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> >>>>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> >>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> >>>>>> --- >>>>>> hw/s390x/s390-pci-bus.c | 8 ++++++++ >>>>>> 1 file changed, 8 insertions(+) >>>>>> >>>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c >>>>>> index 9e1f7ff5c5..359509ccea 100644 >>>>>> --- a/hw/s390x/s390-pci-bus.c >>>>>> +++ b/hw/s390x/s390-pci-bus.c >>>>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr, >>>>>> return ret; >>>>>> } >>>>>> >>>>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu, >>>>>> + IOMMUNotifier *notifier) >>>>>> +{ >>>>>> + /* we don't need iommu replay currently */ >>>>> This really needs to be heavier on the _why_. My guess is that anything >>>>> which would require replay goes through the rpcit instruction? >>>> My understanding is: >>>> Our arch is different from others. Each pci device has its own iommu, not >>>> like other archs' implementation. So currently there must be no existing >>>> mappings belonging to any newly plugged pci device whose iommu doesn't >>>> have any mapping at the time. >>> So please put that explanation into the function. (Also, "currently"? >>> Are we expecting it to change?) >> The iommu replay function is originally introduced for vfio. I think >> they want to re-build >> the existing mappings because vfio has a copy of mappings in kernel. For >> our case, >> the mappings would be cleanup when a pci device unplugged, and new mappings >> would be created when a pci device plugged. I think replay only can >> happen during >> vfio-pci device migration. > So, the base reason is that it is impossible to plug a pci device on > s390x that already has iommu mappings which need to be replayed, which > is due to the "one iommu per zpci device" construct (and independent of > which devices need replay on other architectures)? Yes. > > Then this should go into the explanation above. (And I'd still like to > know what "currently" refers to. I don't like to rely on some kind of > implicit assumptions - are we expecting this to break?) As our discussion during internal review, we don't need to replay currently because vfio-pci device doesn't support migration. 'currently' refers to the time of vfio-pci device migration block. Only when vfio-pci supports migration in future, we should fill the replaying code. > >>> >>>> In addition, it's also due to vfio blocking migration. If vfio-pci supports >>>> migration in future, we could implement iommu replay by that time. >>> That's not an argument: This is the base zpci support, it should not >>> depend on the supported devices and what they do. (What's the status of >>> virtio-pci, btw? Does it work with your patches applied, or is there >>> still more to be done?) >>> >>> >> My understanding is virtio-pci doesn't need replay. All mappings of any >> pci device are existing in >> guest memory. Once the mappings is built, all of them could be migrated >> along with the guest >> system. But I might misunderstand it. Please correct me. > My question was whether virtio-pci works with your patches on top at > all - last time I checked on master, virtio-pci devices failed to > realize with a "msi-x is mandatory" message. > > I tested. virtio-pci works fine. The message "msix is mandatory" means we only support msix interrupt. If virtio-pci device (like virtio-rng) doesn't support msix, we don't allow it to plug. I thinik this is not related to iommu replay. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 9:51 ` Yi Min Zhao @ 2017-08-29 9:57 ` Cornelia Huck 2017-08-29 10:00 ` Yi Min Zhao 0 siblings, 1 reply; 28+ messages in thread From: Cornelia Huck @ 2017-08-29 9:57 UTC (permalink / raw) To: Yi Min Zhao Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson On Tue, 29 Aug 2017 17:51:43 +0800 Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > 在 2017/8/29 下午5:33, Cornelia Huck 写道: > > My question was whether virtio-pci works with your patches on top at > > all - last time I checked on master, virtio-pci devices failed to > > realize with a "msi-x is mandatory" message. > > > > > I tested. virtio-pci works fine. The message "msix is mandatory" means > we only support > msix interrupt. If virtio-pci device (like virtio-rng) doesn't support > msix, we don't allow it > to plug. Ah, that's it (it's a bit strange that not all virtio-pci devices support msi-x). I can add a virtio-net-pci device just fine. (Maybe we can enhance the message so that it is clear that it refers to that particular device?) > I thinik this is not related to iommu replay. This question was unrelated to this particular patch, more to the whole series :) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback 2017-08-29 9:57 ` Cornelia Huck @ 2017-08-29 10:00 ` Yi Min Zhao 0 siblings, 0 replies; 28+ messages in thread From: Yi Min Zhao @ 2017-08-29 10:00 UTC (permalink / raw) To: Cornelia Huck Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson 在 2017/8/29 下午5:57, Cornelia Huck 写道: > On Tue, 29 Aug 2017 17:51:43 +0800 > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote: > >> 在 2017/8/29 下午5:33, Cornelia Huck 写道: >>> My question was whether virtio-pci works with your patches on top at >>> all - last time I checked on master, virtio-pci devices failed to >>> realize with a "msi-x is mandatory" message. >>> >>> >> I tested. virtio-pci works fine. The message "msix is mandatory" means >> we only support >> msix interrupt. If virtio-pci device (like virtio-rng) doesn't support >> msix, we don't allow it >> to plug. > Ah, that's it (it's a bit strange that not all virtio-pci devices > support msi-x). I can add a virtio-net-pci device just fine. > > (Maybe we can enhance the message so that it is clear that it refers to > that particular device?) Hum, I think so. But it's not urgent. I could post another patch for message enhancement after this series review. Do you agree? > >> I thinik this is not related to iommu replay. > This question was unrelated to this particular patch, more to the whole > series :) > > Yup. ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2017-08-30 9:48 UTC | newest] Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-28 8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao 2017-08-28 14:51 ` Cornelia Huck 2017-08-29 4:32 ` Yi Min Zhao 2017-08-29 8:00 ` Cornelia Huck 2017-08-29 8:05 ` Yi Min Zhao 2017-08-29 8:12 ` Yi Min Zhao 2017-08-29 8:22 ` Cornelia Huck 2017-08-29 8:33 ` Yi Min Zhao 2017-08-29 8:58 ` Cornelia Huck 2017-08-30 9:47 ` Cornelia Huck 2017-08-28 8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao 2017-08-28 15:04 ` Cornelia Huck 2017-08-29 4:33 ` Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao 2017-08-28 15:33 ` Cornelia Huck 2017-08-29 4:39 ` Yi Min Zhao 2017-08-28 8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao 2017-08-28 15:57 ` Cornelia Huck 2017-08-29 4:46 ` Yi Min Zhao 2017-08-29 8:07 ` Cornelia Huck 2017-08-29 8:26 ` Yi Min Zhao 2017-08-29 9:33 ` Cornelia Huck 2017-08-29 9:49 ` Cornelia Huck 2017-08-29 9:53 ` Yi Min Zhao 2017-08-29 9:51 ` Yi Min Zhao 2017-08-29 9:57 ` Cornelia Huck 2017-08-29 10:00 ` Yi Min Zhao
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.