* [PATCH 0/4] Fix broken PCIe device after migration @ 2022-02-24 17:44 Igor Mammedov 2022-02-24 17:44 ` [PATCH 1/4] pci: expose TYPE_XIO3130_DOWNSTREAM name Igor Mammedov ` (5 more replies) 0 siblings, 6 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-24 17:44 UTC (permalink / raw) To: qemu-devel; +Cc: kraxel, mst Currently ACPI PCI hotplug is enabled by default for Q35 machine type and overrides native PCIe hotplug. It works as expected when a PCIe device is hotplugged into slot, however the device becomes in-operational after migration. Which is caused by BARs being disabled on target due to powered off status of the slot. Proposed fix disables power control on PCIe slot when ACPI pcihp takes over hotplug control and makes PCIe slot check if power control is enabled before trying to change slot's power. Which leaves slot always powered on and that makes PCI core keep BARs enabled. PS: it's still hacky approach as all ACPI PCI hotplug is, but it's the least intrusive one. Alternative, I've considered, could be chaining hotplug callbacks and making pcihp ones call overriden native callbacks while inhibiting hotplug event in native callbacks somehow. But that were a bit more intrusive and spills over to SHPC if implemented systematically, so I ditched that for now. It could be resurrected later on if current approach turns out to be insufficient. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 CC: mst@redhat.com CC: kraxel@redhat.com Igor Mammedov (4): pci: expose TYPE_XIO3130_DOWNSTREAM name pcie: update slot power status only is power control is enabled acpi: pcihp: disable power control on PCIe slot q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types include/hw/acpi/pcihp.h | 4 +++- include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- hw/acpi/ich9.c | 21 ++++++++++++++++++++- hw/acpi/pcihp.c | 16 +++++++++++++++- hw/acpi/piix4.c | 3 ++- hw/core/machine.c | 4 +++- hw/pci-bridge/xio3130_downstream.c | 3 ++- hw/pci/pcie.c | 5 ++--- 9 files changed, 64 insertions(+), 10 deletions(-) create mode 100644 include/hw/pci-bridge/xio3130_downstream.h -- 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/4] pci: expose TYPE_XIO3130_DOWNSTREAM name 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov @ 2022-02-24 17:44 ` Igor Mammedov 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov ` (4 subsequent siblings) 5 siblings, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-24 17:44 UTC (permalink / raw) To: qemu-devel; +Cc: kraxel, mst Type name will be used in followup patch for cast check in pcihp code. Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ hw/pci-bridge/xio3130_downstream.c | 3 ++- 2 files changed, 17 insertions(+), 1 deletion(-) create mode 100644 include/hw/pci-bridge/xio3130_downstream.h diff --git a/include/hw/pci-bridge/xio3130_downstream.h b/include/hw/pci-bridge/xio3130_downstream.h new file mode 100644 index 0000000000..1d10139aea --- /dev/null +++ b/include/hw/pci-bridge/xio3130_downstream.h @@ -0,0 +1,15 @@ +/* + * TI X3130 pci express downstream port switch + * + * Copyright (C) 2022 Igor Mammedov <imammedo@redhat.com> + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef HW_PCI_BRIDGE_XIO3130_DOWNSTREAM_H +#define HW_PCI_BRIDGE_XIO3130_DOWNSTREAM_H + +#define TYPE_XIO3130_DOWNSTREAM "xio3130-downstream" + +#endif + diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c index 04aae72cd6..b17cafd359 100644 --- a/hw/pci-bridge/xio3130_downstream.c +++ b/hw/pci-bridge/xio3130_downstream.c @@ -28,6 +28,7 @@ #include "migration/vmstate.h" #include "qapi/error.h" #include "qemu/module.h" +#include "hw/pci-bridge/xio3130_downstream.h" #define PCI_DEVICE_ID_TI_XIO3130D 0x8233 /* downstream port */ #define XIO3130_REVISION 0x1 @@ -173,7 +174,7 @@ static void xio3130_downstream_class_init(ObjectClass *klass, void *data) } static const TypeInfo xio3130_downstream_info = { - .name = "xio3130-downstream", + .name = TYPE_XIO3130_DOWNSTREAM, .parent = TYPE_PCIE_SLOT, .class_init = xio3130_downstream_class_init, .interfaces = (InterfaceInfo[]) { -- 2.31.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov 2022-02-24 17:44 ` [PATCH 1/4] pci: expose TYPE_XIO3130_DOWNSTREAM name Igor Mammedov @ 2022-02-24 17:44 ` Igor Mammedov 2022-02-24 18:05 ` Michael S. Tsirkin ` (2 more replies) 2022-02-24 17:44 ` [PATCH 3/4] acpi: pcihp: disable power control on PCIe slot Igor Mammedov ` (3 subsequent siblings) 5 siblings, 3 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-24 17:44 UTC (permalink / raw) To: qemu-devel; +Cc: kraxel, mst on creation a PCIDevice has power turned on at the end of pci_qdev_realize() however later on if PCIe slot isn't populated with any children it's power is turned off. It's fine if native hotplug is used as plug callback will power slot on among other things. However when ACPI hotplug is enabled it replaces native PCIe plug callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and as result slot stays powered off. It works fine as ACPI hotplug on guest side takes care of enumerating/initializing hotplugged device. But when later guest is migrated, call chain introduced by [1] pcie_cap_slot_post_load() -> pcie_cap_update_power() -> pcie_set_power_device() -> pci_set_power() -> pci_update_mappings() will disable earlier initialized BARs for the hotplugged device in powered off slot due to commit [2] which disables BARs if power is off. As result guest OS after migration will be very much confused [3], still thinking that it has working device, which isn't true anymore due to disabled BARs. Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status only if capability is enabled. Follow up patch will disable PCI_EXP_SLTCAP_PCP overriding COMPAT_PROP_PCP property when PCIe slot is under ACPI PCI hotplug control. See [3] for reproducer. 1) Fixes: commit d5daff7d312 (pcie: implement slot power control for pcie root ports) 2) commit 23786d13441 (pci: implement power state) 3) Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- hw/pci/pcie.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index d7d73a31e4..2339729a7c 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) if (sltcap & PCI_EXP_SLTCAP_PCP) { power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), + pcie_set_power_device, &power); } - - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), - pcie_set_power_device, &power); } /* -- 2.31.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov @ 2022-02-24 18:05 ` Michael S. Tsirkin 2022-02-25 8:18 ` Igor Mammedov 2022-02-25 10:05 ` Michael S. Tsirkin 2022-02-25 10:12 ` Gerd Hoffmann 2 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-24 18:05 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Thu, Feb 24, 2022 at 12:44:09PM -0500, Igor Mammedov wrote: > on creation a PCIDevice has power turned on at the end of pci_qdev_realize() > however later on if PCIe slot isn't populated with any children > it's power is turned off. It's fine if native hotplug is used > as plug callback will power slot on among other things. > However when ACPI hotplug is enabled it replaces native PCIe plug > callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and > as result slot stays powered off. It works fine as ACPI hotplug > on guest side takes care of enumerating/initializing hotplugged > device. But when later guest is migrated, call chain introduced by [1] > > pcie_cap_slot_post_load() > -> pcie_cap_update_power() > -> pcie_set_power_device() > -> pci_set_power() > -> pci_update_mappings() > > will disable earlier initialized BARs for the hotplugged device > in powered off slot due to commit [2] which disables BARs if > power is off. As result guest OS after migration will be very > much confused [3], still thinking that it has working device, > which isn't true anymore due to disabled BARs. > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > only if capability is enabled. Follow up patch will disable > PCI_EXP_SLTCAP_PCP overriding COMPAT_PROP_PCP property when > PCIe slot is under ACPI PCI hotplug control. > > See [3] for reproducer. > > 1) > Fixes: commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > 2) > commit 23786d13441 (pci: implement power state) > 3) > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > Correct format for the last paragraph: Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") Fixes: 23786d13441 ("pci: implement power state") Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > hw/pci/pcie.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > index d7d73a31e4..2339729a7c 100644 > --- a/hw/pci/pcie.c > +++ b/hw/pci/pcie.c > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > + pcie_set_power_device, &power); > } > - > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > - pcie_set_power_device, &power); I think this is correct. However, I wonder whether for 6.2 compatiblity as a hack we should sometimes skip the power update even when PCI_EXP_SLTCAP_PCP exists. Will that not work around the issue for these machine types? And assuming we want bug for bug compat anyway, why not just put it here? It seems easier to reason about frankly ... > } > > /* > -- > 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-24 18:05 ` Michael S. Tsirkin @ 2022-02-25 8:18 ` Igor Mammedov 2022-02-25 9:51 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 8:18 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Thu, 24 Feb 2022 13:05:07 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 24, 2022 at 12:44:09PM -0500, Igor Mammedov wrote: > > on creation a PCIDevice has power turned on at the end of pci_qdev_realize() > > however later on if PCIe slot isn't populated with any children > > it's power is turned off. It's fine if native hotplug is used > > as plug callback will power slot on among other things. > > However when ACPI hotplug is enabled it replaces native PCIe plug > > callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and > > as result slot stays powered off. It works fine as ACPI hotplug > > on guest side takes care of enumerating/initializing hotplugged > > device. But when later guest is migrated, call chain introduced by [1] > > > > pcie_cap_slot_post_load() > > -> pcie_cap_update_power() > > -> pcie_set_power_device() > > -> pci_set_power() > > -> pci_update_mappings() > > > > will disable earlier initialized BARs for the hotplugged device > > in powered off slot due to commit [2] which disables BARs if > > power is off. As result guest OS after migration will be very > > much confused [3], still thinking that it has working device, > > which isn't true anymore due to disabled BARs. > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > only if capability is enabled. Follow up patch will disable > > PCI_EXP_SLTCAP_PCP overriding COMPAT_PROP_PCP property when > > PCIe slot is under ACPI PCI hotplug control. > > > > See [3] for reproducer. > > > > 1) > > Fixes: commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > > 2) > > commit 23786d13441 (pci: implement power state) > > 3) > > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > > > Correct format for the last paragraph: > > > Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") > Fixes: 23786d13441 ("pci: implement power state") > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 ok, will fix it up on respin like this to have references: 1) Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") 2) Fixes: 23786d13441 ("pci: implement power state") Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > --- > > hw/pci/pcie.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > index d7d73a31e4..2339729a7c 100644 > > --- a/hw/pci/pcie.c > > +++ b/hw/pci/pcie.c > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > + pcie_set_power_device, &power); > > } > > - > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > - pcie_set_power_device, &power); > > I think this is correct. However, I wonder whether for 6.2 compatiblity > as a hack we should sometimes skip the power update even when > PCI_EXP_SLTCAP_PCP exists. Will that not work around the issue for > these machine types? pc-q35-6.2 is broken utterly. With pc-q35-6.1, it's a mess. Here is a ping-pong migration matrix for it v6.1 | v6.2 | Fix v6.1 ok | broken | ok (#1) v6.2 | broken | broken (#2) [1] has PCI_EXP_SLTCAP_PCP due to x-pcihp-enable-pcie-pcp-cap=on i.e. pci_config is exactly the same as in qemu-v6.1 [2] PCI_EXP_SLTCAP_PCP is enabled + empty slot is powered off (+ state is migrated) there are some invariants that might work in one direction, but it won't survive ping-pong migration. And more importantly for upstream we care mostly care for old -> new working, and it's direction that is broken in v6.2. > And assuming we want bug for bug compat anyway, why not just put > it here? It seems easier to reason about frankly ... It should be possible hack PCI core to fixup broken power state on incoming migration at (at postload time), but that would just create more confusion, where in some cases migration would work and in some would not (depending on used qemu versions). Lets just declare v6.2 qemu broken, with upgrade/downgrade to (7.0/6.1) as suggested solution. PS: I'd very much prefer avoid adding hacks for ACPI pcihp sake to PCI core, and let PCI code behave as it's supposed to per spec. It's already bad enough with pcihp layered on top of PCI, making PCI code depend on pcihp will just make it more fragile. > > } > > > > /* > > -- > > 2.31.1 > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 8:18 ` Igor Mammedov @ 2022-02-25 9:51 ` Michael S. Tsirkin 0 siblings, 0 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 9:51 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Fri, Feb 25, 2022 at 09:18:30AM +0100, Igor Mammedov wrote: > On Thu, 24 Feb 2022 13:05:07 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Thu, Feb 24, 2022 at 12:44:09PM -0500, Igor Mammedov wrote: > > > on creation a PCIDevice has power turned on at the end of pci_qdev_realize() > > > however later on if PCIe slot isn't populated with any children > > > it's power is turned off. It's fine if native hotplug is used > > > as plug callback will power slot on among other things. > > > However when ACPI hotplug is enabled it replaces native PCIe plug > > > callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and > > > as result slot stays powered off. It works fine as ACPI hotplug > > > on guest side takes care of enumerating/initializing hotplugged > > > device. But when later guest is migrated, call chain introduced by [1] > > > > > > pcie_cap_slot_post_load() > > > -> pcie_cap_update_power() > > > -> pcie_set_power_device() > > > -> pci_set_power() > > > -> pci_update_mappings() > > > > > > will disable earlier initialized BARs for the hotplugged device > > > in powered off slot due to commit [2] which disables BARs if > > > power is off. As result guest OS after migration will be very > > > much confused [3], still thinking that it has working device, > > > which isn't true anymore due to disabled BARs. > > > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > > only if capability is enabled. Follow up patch will disable > > > PCI_EXP_SLTCAP_PCP overriding COMPAT_PROP_PCP property when > > > PCIe slot is under ACPI PCI hotplug control. > > > > > > See [3] for reproducer. > > > > > > 1) > > > Fixes: commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > > > 2) > > > commit 23786d13441 (pci: implement power state) > > > 3) > > > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > > > > > > > Correct format for the last paragraph: > > > > > > Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") > > Fixes: 23786d13441 ("pci: implement power state") > > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > ok, will fix it up on respin like this to have references: > > 1) > Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") > 2) > Fixes: 23786d13441 ("pci: implement power state") > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Just drop references, a bit of duplication is not a problem. E.g. in powered off slot due to commit 23786d13441 ("pci: implement power state") which disables BARs if Trailer tags belong in a group at the end with no interruptions, not all tools handle them otherwise. > > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > > --- > > > hw/pci/pcie.c | 5 ++--- > > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > > index d7d73a31e4..2339729a7c 100644 > > > --- a/hw/pci/pcie.c > > > +++ b/hw/pci/pcie.c > > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > + pcie_set_power_device, &power); > > > } > > > - > > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > - pcie_set_power_device, &power); > > > > I think this is correct. However, I wonder whether for 6.2 compatiblity > > as a hack we should sometimes skip the power update even when > > PCI_EXP_SLTCAP_PCP exists. Will that not work around the issue for > > these machine types? > > pc-q35-6.2 is broken utterly. > With pc-q35-6.1, it's a mess. Here is a ping-pong migration matrix for it > > v6.1 | v6.2 | Fix > v6.1 ok | broken | ok (#1) > v6.2 | broken | broken (#2) > > [1] has PCI_EXP_SLTCAP_PCP due to x-pcihp-enable-pcie-pcp-cap=on > i.e. pci_config is exactly the same as in qemu-v6.1 > [2] PCI_EXP_SLTCAP_PCP is enabled + empty slot is powered off > (+ state is migrated) > > there are some invariants that might work in one direction, > but it won't survive ping-pong migration. And more importantly > for upstream we care mostly care for old -> new working, > and it's direction that is broken in v6.2. > > > And assuming we want bug for bug compat anyway, why not just put > > it here? It seems easier to reason about frankly ... > > It should be possible hack PCI core to fixup broken power state > on incoming migration at (at postload time), but that would just > create more confusion, where in some cases migration would work > and in some would not (depending on used qemu versions). > > Lets just declare v6.2 qemu broken, with upgrade/downgrade to > (7.0/6.1) as suggested solution. > > PS: > I'd very much prefer avoid adding hacks for ACPI pcihp sake to > PCI core, and let PCI code behave as it's supposed to per spec. > It's already bad enough with pcihp layered on top of PCI, > making PCI code depend on pcihp will just make it more fragile. > > > > } > > > > > > /* > > > -- > > > 2.31.1 > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov 2022-02-24 18:05 ` Michael S. Tsirkin @ 2022-02-25 10:05 ` Michael S. Tsirkin 2022-02-25 10:12 ` Gerd Hoffmann 2 siblings, 0 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 10:05 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Thu, Feb 24, 2022 at 12:44:09PM -0500, Igor Mammedov wrote: > on creation a PCIDevice has power turned on at the end of pci_qdev_realize() > however later on if PCIe slot isn't populated with any children > it's power is turned off. It's fine if native hotplug is used > as plug callback will power slot on among other things. > However when ACPI hotplug is enabled it replaces native PCIe plug > callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and > as result slot stays powered off. It works fine as ACPI hotplug > on guest side takes care of enumerating/initializing hotplugged > device. But when later guest is migrated, call chain introduced by [1] > > pcie_cap_slot_post_load() > -> pcie_cap_update_power() > -> pcie_set_power_device() > -> pci_set_power() > -> pci_update_mappings() > > will disable earlier initialized BARs for the hotplugged device > in powered off slot due to commit [2] which disables BARs if > power is off. As result guest OS after migration will be very > much confused [3], still thinking that it has working device, > which isn't true anymore due to disabled BARs. > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > only if capability is enabled. > Follow up patch will disable > PCI_EXP_SLTCAP_PCP overriding COMPAT_PROP_PCP property when > PCIe slot is under ACPI PCI hotplug control. > > See [3] for reproducer. > > 1) > Fixes: commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > 2) > commit 23786d13441 (pci: implement power state) > 3) > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > hw/pci/pcie.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > index d7d73a31e4..2339729a7c 100644 > --- a/hw/pci/pcie.c > +++ b/hw/pci/pcie.c > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > + pcie_set_power_device, &power); > } > - > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > - pcie_set_power_device, &power); Hmm I wrote I like it but now I am not sure I understand how does this patch help fix things. here is the full function: static void pcie_cap_update_power(PCIDevice *hotplug_dev) { uint8_t *exp_cap = hotplug_dev->config + hotplug_dev->exp.exp_cap; PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(hotplug_dev)); uint32_t sltcap = pci_get_long(exp_cap + PCI_EXP_SLTCAP); uint16_t sltctl = pci_get_word(exp_cap + PCI_EXP_SLTCTL); bool power = true; if (sltcap & PCI_EXP_SLTCAP_PCP) { power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; } pci_for_each_device(sec_bus, pci_bus_num(sec_bus), pcie_set_power_device, &power); } I understand the follow up patch, but it looks to me that without the capability, power is always on. Why does skipping that help fix anything? Thanks, > } > > /* > -- > 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov 2022-02-24 18:05 ` Michael S. Tsirkin 2022-02-25 10:05 ` Michael S. Tsirkin @ 2022-02-25 10:12 ` Gerd Hoffmann 2022-02-25 10:35 ` Michael S. Tsirkin 2022-02-25 13:02 ` Igor Mammedov 2 siblings, 2 replies; 29+ messages in thread From: Gerd Hoffmann @ 2022-02-25 10:12 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, mst Hi, > pcie_cap_slot_post_load() > -> pcie_cap_update_power() > -> pcie_set_power_device() > -> pci_set_power() > -> pci_update_mappings() > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > only if capability is enabled. > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > index d7d73a31e4..2339729a7c 100644 > --- a/hw/pci/pcie.c > +++ b/hw/pci/pcie.c > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > + pcie_set_power_device, &power); > } > - > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > - pcie_set_power_device, &power); > } The change makes sense, although I don't see how that changes qemu behavior. 'power' defaults to true, so when SLTCAP_PCP is off it should never ever try to power off the devices. And pci_set_power() should figure the state didn't change and instantly return without touching the device. take care, Gerd ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 10:12 ` Gerd Hoffmann @ 2022-02-25 10:35 ` Michael S. Tsirkin 2022-02-25 13:02 ` Igor Mammedov 1 sibling, 0 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 10:35 UTC (permalink / raw) To: Gerd Hoffmann; +Cc: Igor Mammedov, qemu-devel On Fri, Feb 25, 2022 at 11:12:59AM +0100, Gerd Hoffmann wrote: > Hi, > > > pcie_cap_slot_post_load() > > -> pcie_cap_update_power() > > -> pcie_set_power_device() > > -> pci_set_power() > > -> pci_update_mappings() > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > only if capability is enabled. > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > index d7d73a31e4..2339729a7c 100644 > > --- a/hw/pci/pcie.c > > +++ b/hw/pci/pcie.c > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > + pcie_set_power_device, &power); > > } > > - > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > - pcie_set_power_device, &power); > > } > > The change makes sense, although I don't see how that changes qemu > behavior. > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > ever try to power off the devices. And pci_set_power() should figure > the state didn't change and instantly return without touching the > device. > > take care, > Gerd And making sure power is actually up might be a bit cleaner just in case down the road we start plugging devices in a powered off state. -- MST ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 10:12 ` Gerd Hoffmann 2022-02-25 10:35 ` Michael S. Tsirkin @ 2022-02-25 13:02 ` Igor Mammedov 2022-02-25 13:08 ` Michael S. Tsirkin 1 sibling, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 13:02 UTC (permalink / raw) To: Gerd Hoffmann; +Cc: qemu-devel, mst On Fri, 25 Feb 2022 11:12:59 +0100 Gerd Hoffmann <kraxel@redhat.com> wrote: > Hi, > > > pcie_cap_slot_post_load() > > -> pcie_cap_update_power() > > -> pcie_set_power_device() > > -> pci_set_power() > > -> pci_update_mappings() > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > only if capability is enabled. > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > index d7d73a31e4..2339729a7c 100644 > > --- a/hw/pci/pcie.c > > +++ b/hw/pci/pcie.c > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > + pcie_set_power_device, &power); > > } > > - > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > - pcie_set_power_device, &power); > > } > > The change makes sense, although I don't see how that changes qemu > behavior. looks like I need to fix commit message > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > ever try to power off the devices. And pci_set_power() should figure > the state didn't change and instantly return without touching the > device. SLTCAP_PCP is on by default as well, so empty slot ends up with power disabled PCC state [1]: sltctl & SLTCTL_PCC == 0x400 by the time machine is initialized. Then ACPI pcihp callbacks override native hotplug ones so PCC remains stuck in this state since all power control is out of picture in case of ACPI based hotplug. Guest OS doesn't use/or ignore native PCC. After device hotplug, the device stays in has_power=on default state as pci_qdev_realize() initialized it. [2] However when migrated SLTCTL_PCC is also migrated, and kick in as described in commit message and turns power off [3] > > take care, > Gerd > 1) pci_qdev_realize pci_set_power: extra_root0, d->has_power: 0, new power state: 1 pci_set_power: extra_root0, set has_power to: 1 pcie_cap_slot_reset pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 pcie_cap_update_power(extra_root0): updated power: 0 <== has no effect on children since there is none 2) pci_qdev_realize pci_set_power: nic, d->has_power: 0, new power state: 1 pci_set_power: nic, set has_power to: 1 3) pci_qdev_realize pci_set_power: extra_root0, d->has_power: 0, new power state: 1 pci_set_power: extra_root0, set has_power to: 1 pci_qdev_realize pci_set_power: nic, d->has_power: 0, new power state: 1 pci_set_power: nic, set has_power to: 1 pcie_cap_slot_reset pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 0 pcie_cap_update_power(extra_root0): updated power: 1 pci_set_power: nic, d->has_power: 1, new power state: 1 pcie_cap_slot_post_load(extra_root0) pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 pcie_cap_update_power(extra_root0): updated power: 0 pci_set_power: nic, d->has_power: 1, new power state: 0 pci_set_power: nic, set has_power to: 0 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 13:02 ` Igor Mammedov @ 2022-02-25 13:08 ` Michael S. Tsirkin 2022-02-25 13:35 ` Igor Mammedov 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 13:08 UTC (permalink / raw) To: Igor Mammedov; +Cc: Gerd Hoffmann, qemu-devel On Fri, Feb 25, 2022 at 02:02:31PM +0100, Igor Mammedov wrote: > On Fri, 25 Feb 2022 11:12:59 +0100 > Gerd Hoffmann <kraxel@redhat.com> wrote: > > > Hi, > > > > > pcie_cap_slot_post_load() > > > -> pcie_cap_update_power() > > > -> pcie_set_power_device() > > > -> pci_set_power() > > > -> pci_update_mappings() > > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > > only if capability is enabled. > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > > index d7d73a31e4..2339729a7c 100644 > > > --- a/hw/pci/pcie.c > > > +++ b/hw/pci/pcie.c > > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > + pcie_set_power_device, &power); > > > } > > > - > > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > - pcie_set_power_device, &power); > > > } > > > > The change makes sense, although I don't see how that changes qemu > > behavior. > > looks like I need to fix commit message > > > > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > > ever try to power off the devices. And pci_set_power() should figure > > the state didn't change and instantly return without touching the > > device. > > > SLTCAP_PCP is on by default as well, so empty slot ends up with > power disabled PCC state [1]: > > sltctl & SLTCTL_PCC == 0x400 > > by the time machine is initialized. > > Then ACPI pcihp callbacks override native hotplug ones > so PCC remains stuck in this state since all power control > is out of picture in case of ACPI based hotplug. Guest OS > doesn't use/or ignore native PCC. So how about when ACPI pcihp overrides native with its callbacks we also set PCC power to on? > > After device hotplug, the device stays in has_power=on default > state as pci_qdev_realize() initialized it. [2] > > However when migrated SLTCTL_PCC is also migrated, and kick in > as described in commit message and turns power off [3] > > > > > take care, > > Gerd > > > > 1) > pci_qdev_realize > pci_set_power: extra_root0, d->has_power: 0, new power state: 1 > pci_set_power: extra_root0, set has_power to: 1 > pcie_cap_slot_reset > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 > pcie_cap_update_power(extra_root0): updated power: 0 <== has no effect on children since there is none > > 2) > pci_qdev_realize > pci_set_power: nic, d->has_power: 0, new power state: 1 > pci_set_power: nic, set has_power to: 1 > > > 3) > pci_qdev_realize > pci_set_power: extra_root0, d->has_power: 0, new power state: 1 > pci_set_power: extra_root0, set has_power to: 1 > pci_qdev_realize > pci_set_power: nic, d->has_power: 0, new power state: 1 > pci_set_power: nic, set has_power to: 1 > pcie_cap_slot_reset > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 0 > pcie_cap_update_power(extra_root0): updated power: 1 > pci_set_power: nic, d->has_power: 1, new power state: 1 > pcie_cap_slot_post_load(extra_root0) > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 > pcie_cap_update_power(extra_root0): updated power: 0 > pci_set_power: nic, d->has_power: 1, new power state: 0 > pci_set_power: nic, set has_power to: 0 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 13:08 ` Michael S. Tsirkin @ 2022-02-25 13:35 ` Igor Mammedov 2022-02-25 13:48 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 13:35 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Gerd Hoffmann, qemu-devel On Fri, 25 Feb 2022 08:08:57 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Fri, Feb 25, 2022 at 02:02:31PM +0100, Igor Mammedov wrote: > > On Fri, 25 Feb 2022 11:12:59 +0100 > > Gerd Hoffmann <kraxel@redhat.com> wrote: > > > > > Hi, > > > > > > > pcie_cap_slot_post_load() > > > > -> pcie_cap_update_power() > > > > -> pcie_set_power_device() > > > > -> pci_set_power() > > > > -> pci_update_mappings() > > > > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > > > only if capability is enabled. > > > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > > > index d7d73a31e4..2339729a7c 100644 > > > > --- a/hw/pci/pcie.c > > > > +++ b/hw/pci/pcie.c > > > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > + pcie_set_power_device, &power); > > > > } > > > > - > > > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > - pcie_set_power_device, &power); > > > > } > > > > > > The change makes sense, although I don't see how that changes qemu > > > behavior. > > > > looks like I need to fix commit message > > > > > > > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > > > ever try to power off the devices. And pci_set_power() should figure > > > the state didn't change and instantly return without touching the > > > device. > > > > > > SLTCAP_PCP is on by default as well, so empty slot ends up with > > power disabled PCC state [1]: > > > > sltctl & SLTCTL_PCC == 0x400 > > > > by the time machine is initialized. > > > > Then ACPI pcihp callbacks override native hotplug ones > > so PCC remains stuck in this state since all power control > > is out of picture in case of ACPI based hotplug. Guest OS > > doesn't use/or ignore native PCC. > > So how about when ACPI pcihp overrides native with its callbacks we also > set PCC power to on? with some reworks it should work (i.e. adding an extra knob that will tell PCI core not to power off when it should, looks fragile and very hacky). It has the same migration implications as this patch, so I'd rather go after disabling whole SLTCAP_PCP thing to be correct and keeping PCI code free from ACPI hacks. > > > > > After device hotplug, the device stays in has_power=on default > > state as pci_qdev_realize() initialized it. [2] > > > > However when migrated SLTCTL_PCC is also migrated, and kick in > > as described in commit message and turns power off [3] > > > > > > > > take care, > > > Gerd > > > > > > > 1) > > pci_qdev_realize > > pci_set_power: extra_root0, d->has_power: 0, new power state: 1 > > pci_set_power: extra_root0, set has_power to: 1 > > pcie_cap_slot_reset > > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 > > pcie_cap_update_power(extra_root0): updated power: 0 <== has no effect on children since there is none > > > > 2) > > pci_qdev_realize > > pci_set_power: nic, d->has_power: 0, new power state: 1 > > pci_set_power: nic, set has_power to: 1 > > > > > > 3) > > pci_qdev_realize > > pci_set_power: extra_root0, d->has_power: 0, new power state: 1 > > pci_set_power: extra_root0, set has_power to: 1 > > pci_qdev_realize > > pci_set_power: nic, d->has_power: 0, new power state: 1 > > pci_set_power: nic, set has_power to: 1 > > pcie_cap_slot_reset > > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 0 > > pcie_cap_update_power(extra_root0): updated power: 1 > > pci_set_power: nic, d->has_power: 1, new power state: 1 > > pcie_cap_slot_post_load(extra_root0) > > pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 > > pcie_cap_update_power(extra_root0): updated power: 0 > > pci_set_power: nic, d->has_power: 1, new power state: 0 > > pci_set_power: nic, set has_power to: 0 > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 13:35 ` Igor Mammedov @ 2022-02-25 13:48 ` Michael S. Tsirkin 2022-02-25 15:39 ` Igor Mammedov 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 13:48 UTC (permalink / raw) To: Igor Mammedov; +Cc: Gerd Hoffmann, qemu-devel On Fri, Feb 25, 2022 at 02:35:28PM +0100, Igor Mammedov wrote: > On Fri, 25 Feb 2022 08:08:57 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Fri, Feb 25, 2022 at 02:02:31PM +0100, Igor Mammedov wrote: > > > On Fri, 25 Feb 2022 11:12:59 +0100 > > > Gerd Hoffmann <kraxel@redhat.com> wrote: > > > > > > > Hi, > > > > > > > > > pcie_cap_slot_post_load() > > > > > -> pcie_cap_update_power() > > > > > -> pcie_set_power_device() > > > > > -> pci_set_power() > > > > > -> pci_update_mappings() > > > > > > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > > > > only if capability is enabled. > > > > > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > > > > index d7d73a31e4..2339729a7c 100644 > > > > > --- a/hw/pci/pcie.c > > > > > +++ b/hw/pci/pcie.c > > > > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > > > > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > > > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > > > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > > + pcie_set_power_device, &power); > > > > > } > > > > > - > > > > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > > - pcie_set_power_device, &power); > > > > > } > > > > > > > > The change makes sense, although I don't see how that changes qemu > > > > behavior. > > > > > > looks like I need to fix commit message > > > > > > > > > > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > > > > ever try to power off the devices. And pci_set_power() should figure > > > > the state didn't change and instantly return without touching the > > > > device. > > > > > > > > > SLTCAP_PCP is on by default as well, so empty slot ends up with > > > power disabled PCC state [1]: > > > > > > sltctl & SLTCTL_PCC == 0x400 > > > > > > by the time machine is initialized. > > > > > > Then ACPI pcihp callbacks override native hotplug ones > > > so PCC remains stuck in this state since all power control > > > is out of picture in case of ACPI based hotplug. Guest OS > > > doesn't use/or ignore native PCC. > > > > So how about when ACPI pcihp overrides native with its callbacks we also > > set PCC power to on? > > with some reworks it should work (i.e. adding an extra knob that will tell > PCI core not to power off when it should, looks fragile and very hacky). > It has the same migration implications as this patch, so I'd rather go > after disabling whole SLTCAP_PCP thing to be correct and keeping PCI > code free from ACPI hacks. Hmm I don't get it. I literally mean this: diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index d7d73a31e4..72de72ce7a 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -389,6 +389,17 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) pcie_set_power_device, &power); } +void pcie_cap_enable_power(PCIDevice *hotplug_dev) +{ + uint8_t *exp_cap = hotplug_dev->config + hotplug_dev->exp.exp_cap; + uint32_t sltcap = pci_get_long(exp_cap + PCI_EXP_SLTCAP); + + if (sltcap & PCI_EXP_SLTCAP_PCP) { + pci_set_word_by_mask(exp_cap + PCI_EXP_SLTCTL, + PCI_EXP_SLTCTL_PCC, PCI_EXP_SLTCTL_PWR_ON); + } +} + /* * A PCI Express Hot-Plug Event has occurred, so update slot status register * and notify OS of the event if necessary. Then call this from ACPI. How would this have any migration implications at all? And why do we need a knob not to power off then? Power will just stay on since there's nothing turning it off. -- MST ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 13:48 ` Michael S. Tsirkin @ 2022-02-25 15:39 ` Igor Mammedov 2022-02-28 7:39 ` Gerd Hoffmann 0 siblings, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 15:39 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Gerd Hoffmann, qemu-devel On Fri, 25 Feb 2022 08:48:13 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Fri, Feb 25, 2022 at 02:35:28PM +0100, Igor Mammedov wrote: > > On Fri, 25 Feb 2022 08:08:57 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Fri, Feb 25, 2022 at 02:02:31PM +0100, Igor Mammedov wrote: > > > > On Fri, 25 Feb 2022 11:12:59 +0100 > > > > Gerd Hoffmann <kraxel@redhat.com> wrote: > > > > > > > > > Hi, > > > > > > > > > > > pcie_cap_slot_post_load() > > > > > > -> pcie_cap_update_power() > > > > > > -> pcie_set_power_device() > > > > > > -> pci_set_power() > > > > > > -> pci_update_mappings() > > > > > > > > > > > Fix it by honoring PCI_EXP_SLTCAP_PCP and updating power status > > > > > > only if capability is enabled. > > > > > > > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > > > > > > index d7d73a31e4..2339729a7c 100644 > > > > > > --- a/hw/pci/pcie.c > > > > > > +++ b/hw/pci/pcie.c > > > > > > @@ -383,10 +383,9 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > > > > > > > > > > > > if (sltcap & PCI_EXP_SLTCAP_PCP) { > > > > > > power = (sltctl & PCI_EXP_SLTCTL_PCC) == PCI_EXP_SLTCTL_PWR_ON; > > > > > > + pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > > > + pcie_set_power_device, &power); > > > > > > } > > > > > > - > > > > > > - pci_for_each_device(sec_bus, pci_bus_num(sec_bus), > > > > > > - pcie_set_power_device, &power); > > > > > > } > > > > > > > > > > The change makes sense, although I don't see how that changes qemu > > > > > behavior. > > > > > > > > looks like I need to fix commit message > > > > > > > > > > > > > > 'power' defaults to true, so when SLTCAP_PCP is off it should never > > > > > ever try to power off the devices. And pci_set_power() should figure > > > > > the state didn't change and instantly return without touching the > > > > > device. > > > > > > > > > > > > SLTCAP_PCP is on by default as well, so empty slot ends up with > > > > power disabled PCC state [1]: > > > > > > > > sltctl & SLTCTL_PCC == 0x400 > > > > > > > > by the time machine is initialized. > > > > > > > > Then ACPI pcihp callbacks override native hotplug ones > > > > so PCC remains stuck in this state since all power control > > > > is out of picture in case of ACPI based hotplug. Guest OS > > > > doesn't use/or ignore native PCC. > > > > > > So how about when ACPI pcihp overrides native with its callbacks we also > > > set PCC power to on? > > > > with some reworks it should work (i.e. adding an extra knob that will tell > > PCI core not to power off when it should, looks fragile and very hacky). > > It has the same migration implications as this patch, so I'd rather go > > after disabling whole SLTCAP_PCP thing to be correct and keeping PCI > > code free from ACPI hacks. > > Hmm I don't get it. I literally mean this: I was thinking about the time when we do override native callbacks. which happens for every root port at its realize time. Start up sequence on src: acpi_pcihp_device_pre_plug_cb(dev: extra_root0) pci_qdev_realize(extra_root0) pci_set_power: extra_root0, d->has_power: 0, new power state: 1 pci_set_power: extra_root0, set has_power to: 1 acpi_pcihp_device_plug_cb(dev: extra_root0) <== lets assume we call pcie_cap_enable_power(dev) here it's all good wrt layering as we are rewiring being initialized root port to another hp controller from context of its parent device pcie_cap_slot_reset <== then here we hit "if (populated) {} else {}" which kills whatever above has done since slot is not populated and a knob would be needed to prevent reset (i.e. don't touch power state as it's 'managed' by ACPI) pcie_cap_update_power(extra_root0): sltcap & PCI_EXP_SLTCAP_PCP: 2, sltctl & SLTCTL_PCC: 400 pcie_cap_update_power(extra_root0): updated power: 0 Though I haven't thought about end-device hotplug time: (qemu) device_add e1000e,bus=extra_root0,id=nic acpi_pcihp_device_pre_plug_cb(dev: nic) pci_qdev_realize(nic) pci_set_power: nic, d->has_power: 0, new power state: 1 pci_set_power: nic, set has_power to: 1 acpi_pcihp_device_plug_cb(dev: nic) <== here we have a chance to power on no longer empty slot pcie_cap_enable_power(hotplug_dev) then when target loads state it will see SLTCTL_PCC: 0 and keep slot powered on. pci_set_power: nic, d->has_power: 1, new power state: 1 This where I wasn't comfortable with idea of calling random PCIe code chunks and thought about chaining callbacks so that pcie_cap_slot_[pre_]plug_cb() would do necessary PCIe steps and acpi_pcihp_device_[pre_]plug_cb() do ACPI specific things not intruding on each other, but that requires telling PCIe code that it should not issue native hotplug event to guest. > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > index d7d73a31e4..72de72ce7a 100644 > --- a/hw/pci/pcie.c > +++ b/hw/pci/pcie.c > @@ -389,6 +389,17 @@ static void pcie_cap_update_power(PCIDevice *hotplug_dev) > pcie_set_power_device, &power); > } > > +void pcie_cap_enable_power(PCIDevice *hotplug_dev) > +{ > + uint8_t *exp_cap = hotplug_dev->config + hotplug_dev->exp.exp_cap; > + uint32_t sltcap = pci_get_long(exp_cap + PCI_EXP_SLTCAP); > + > + if (sltcap & PCI_EXP_SLTCAP_PCP) { > + pci_set_word_by_mask(exp_cap + PCI_EXP_SLTCTL, > + PCI_EXP_SLTCTL_PCC, PCI_EXP_SLTCTL_PWR_ON); > + } > +} > + > /* > * A PCI Express Hot-Plug Event has occurred, so update slot status register > * and notify OS of the event if necessary. > > Then call this from ACPI. How would this have any migration > implications at all? And why do we need a knob not to power off then? > Power will just stay on since there's nothing turning it off. It still changes pci_config, the similar to disabling SLTCAP_PCP, so I think we still need migration compat knob to have the same device state in cross version migration case. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-25 15:39 ` Igor Mammedov @ 2022-02-28 7:39 ` Gerd Hoffmann 2022-02-28 8:55 ` Igor Mammedov 0 siblings, 1 reply; 29+ messages in thread From: Gerd Hoffmann @ 2022-02-28 7:39 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, Michael S. Tsirkin Hi, > This where I wasn't comfortable with idea of calling random PCIe code > chunks and thought about chaining callbacks so that > pcie_cap_slot_[pre_]plug_cb() would do necessary PCIe steps > and acpi_pcihp_device_[pre_]plug_cb() do ACPI specific things not > intruding on each other, but that requires telling PCIe code that > it should not issue native hotplug event to guest. I think with both acpi and pcie hotplug being active it surely makes sense that both are in sync when it comes to device state. So acpihp updating pcie slot state (power control, maybe also device presence) looks sane to me. Not sure whenever it would be better to call into pcie code or just update the pci config space bits directly to make sure pcie doesn't take unwanted actions like sending out events. take care, Gerd ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] pcie: update slot power status only is power control is enabled 2022-02-28 7:39 ` Gerd Hoffmann @ 2022-02-28 8:55 ` Igor Mammedov 0 siblings, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-28 8:55 UTC (permalink / raw) To: Gerd Hoffmann; +Cc: qemu-devel, Michael S. Tsirkin On Mon, 28 Feb 2022 08:39:57 +0100 Gerd Hoffmann <kraxel@redhat.com> wrote: > Hi, > > > This where I wasn't comfortable with idea of calling random PCIe code > > chunks and thought about chaining callbacks so that > > pcie_cap_slot_[pre_]plug_cb() would do necessary PCIe steps > > and acpi_pcihp_device_[pre_]plug_cb() do ACPI specific things not > > intruding on each other, but that requires telling PCIe code that > > it should not issue native hotplug event to guest. > > I think with both acpi and pcie hotplug being active it surely makes > sense that both are in sync when it comes to device state. So acpihp > updating pcie slot state (power control, maybe also device presence) > looks sane to me. > > Not sure whenever it would be better to call into pcie code or just > update the pci config space bits directly to make sure pcie doesn't > take unwanted actions like sending out events. If changing power state is preferred over disabling power control, I can respin series with what Michael has suggested earlier and drop 2/4 as not necessary. I'll wait for a day to see if there would more ideas/suggestions > take care, > Gerd > ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 3/4] acpi: pcihp: disable power control on PCIe slot 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov 2022-02-24 17:44 ` [PATCH 1/4] pci: expose TYPE_XIO3130_DOWNSTREAM name Igor Mammedov 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov @ 2022-02-24 17:44 ` Igor Mammedov 2022-02-24 17:44 ` [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types Igor Mammedov ` (2 subsequent siblings) 5 siblings, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-24 17:44 UTC (permalink / raw) To: qemu-devel; +Cc: kraxel, mst Previous patch [1] fixed unconditional power handling on a PCIe slot, and make it honor PCI_EXP_SLTCAP_PCP capability. Use COMPAT_PROP_PCP to disable power control (PCI_EXP_SLTCAP_PCP) on PCIe slot when its plug callbacks are wired to ACPI pcihp, which effectively leaves stop always powered. PS: See [1] for detailed description of the issue [2] and how it's being addressed. 1) "pcie: update slot power status only is power control is enabled" 2) Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- hw/acpi/pcihp.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c index 6befd23e16..bc47d1bf64 100644 --- a/hw/acpi/pcihp.c +++ b/hw/acpi/pcihp.c @@ -32,6 +32,7 @@ #include "hw/pci/pci_bridge.h" #include "hw/pci/pci_host.h" #include "hw/pci/pcie_port.h" +#include "hw/pci-bridge/xio3130_downstream.h" #include "hw/i386/acpi-build.h" #include "hw/acpi/acpi.h" #include "hw/pci/pci_bus.h" @@ -329,6 +330,15 @@ void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, GINT_TO_POINTER(pdev->acpi_index), g_cmp_uint32, NULL); } + + /* + * since acpi_pcihp manages hotplug, disable PCI-E power control on slot + */ + if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || + object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { + object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, + &error_abort); + } } void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, -- 2.31.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov ` (2 preceding siblings ...) 2022-02-24 17:44 ` [PATCH 3/4] acpi: pcihp: disable power control on PCIe slot Igor Mammedov @ 2022-02-24 17:44 ` Igor Mammedov 2022-02-24 18:11 ` Michael S. Tsirkin 2022-02-24 18:08 ` [PATCH 0/4] Fix broken PCIe device after migration Michael S. Tsirkin 2022-02-25 9:58 ` Michael S. Tsirkin 5 siblings, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-24 17:44 UTC (permalink / raw) To: qemu-devel; +Cc: kraxel, mst Q35 switched to ACPI PCI hotplug by default in since 6.1 machine type and migration worked as expected (with BARs on target being the same as on source) However native PCIe fixes [1] merged in 6.2 time, regressed migration part, resulting in disabled BARs on target. The issue affects pc-q35-6.2 and pc-q35-6.1 machine types (and older if qemu-6.2 binary is used on source with manually enabled ACPI PCI hotplug). Introduce x-pcihp-enable-pcie-pcp-cap compat property to keep 6.2 and older machine types in broken state when ACPI PCI hotplug is enabled to make sure that guest does see the same PCIe device and slot on rc & dst (i.e. with or without power control present/configured). 1) commit d5daff7d312 (pcie: implement slot power control for pcie root ports) Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- include/hw/acpi/pcihp.h | 4 +++- hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- hw/acpi/ich9.c | 21 ++++++++++++++++++++- hw/acpi/pcihp.c | 20 ++++++++++++-------- hw/acpi/piix4.c | 3 ++- hw/core/machine.c | 4 +++- 6 files changed, 42 insertions(+), 13 deletions(-) diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h index af1a169fc3..06466dd2a8 100644 --- a/include/hw/acpi/pcihp.h +++ b/include/hw/acpi/pcihp.h @@ -52,6 +52,7 @@ typedef struct AcpiPciHpState { bool legacy_piix; uint16_t io_base; uint16_t io_len; + bool enable_pcie_pcp_cap; } AcpiPciHpState; void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, @@ -59,7 +60,8 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, uint16_t io_base); void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp); + AcpiPciHpState *s, DeviceState *dev, + Error **errp); void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, DeviceState *dev, Error **errp); void acpi_pcihp_device_unplug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, diff --git a/hw/acpi/acpi-pci-hotplug-stub.c b/hw/acpi/acpi-pci-hotplug-stub.c index 734e4c5986..6904b772b3 100644 --- a/hw/acpi/acpi-pci-hotplug-stub.c +++ b/hw/acpi/acpi-pci-hotplug-stub.c @@ -18,7 +18,8 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, } void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) + AcpiPciHpState *s, DeviceState *dev, + Error **errp) { return; } diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index bd9bbade70..928d5c101c 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -430,6 +430,21 @@ static void ich9_pm_set_keep_pci_slot_hpc(Object *obj, bool value, Error **errp) s->pm.keep_pci_slot_hpc = value; } +static bool ich9_pm_get_enable_pcie_pcp_cap(Object *obj, Error **errp) +{ + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + + return s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap; +} + +static void ich9_pm_set_enable_pcie_pcp_cap(Object *obj, bool value, + Error **errp) +{ + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + + s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap = value; +} + void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) { static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; @@ -469,6 +484,9 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) object_property_add_bool(obj, "x-keep-pci-slot-hpc", ich9_pm_get_keep_pci_slot_hpc, ich9_pm_set_keep_pci_slot_hpc); + object_property_add_bool(obj, "x-pcihp-enable-pcie-pcp-cap", + ich9_pm_get_enable_pcie_pcp_cap, + ich9_pm_set_enable_pcie_pcp_cap); } void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, @@ -477,7 +495,8 @@ void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, ICH9LPCState *lpc = ICH9_LPC_DEVICE(hotplug_dev); if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &lpc->pm.acpi_pci_hotplug, + dev, errp); return; } diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c index bc47d1bf64..4c1fdb2211 100644 --- a/hw/acpi/pcihp.c +++ b/hw/acpi/pcihp.c @@ -291,7 +291,8 @@ void acpi_pcihp_reset(AcpiPciHpState *s, bool acpihp_root_off) #define ONBOARD_INDEX_MAX (16 * 1024 - 1) void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) + AcpiPciHpState *s, DeviceState *dev, + Error **errp) { PCIDevice *pdev = PCI_DEVICE(dev); @@ -331,13 +332,16 @@ void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, g_cmp_uint32, NULL); } - /* - * since acpi_pcihp manages hotplug, disable PCI-E power control on slot - */ - if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || - object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { - object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, - &error_abort); + /* compat knob to preserve pci_config as in 6.2 & older when pcihp in use */ + if (s->enable_pcie_pcp_cap == false) { + /* + * since acpi_pcihp manages hotplug, disable PCI-E power control on slot + */ + if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || + object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { + object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, + &error_abort); + } } } diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index cc37fa3416..0d25f75112 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -350,7 +350,8 @@ static void piix4_device_pre_plug_cb(HotplugHandler *hotplug_dev, PIIX4PMState *s = PIIX4_PM(hotplug_dev); if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, + errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { if (!s->acpi_memory_hotplug.is_enabled) { error_setg(errp, diff --git a/hw/core/machine.c b/hw/core/machine.c index d856485cb4..48ffc85f8d 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -37,7 +37,9 @@ #include "hw/virtio/virtio.h" #include "hw/virtio/virtio-pci.h" -GlobalProperty hw_compat_6_2[] = {}; +GlobalProperty hw_compat_6_2[] = { + { "ICH9-LPC", "x-pcihp-enable-pcie-pcp-cap", "on" }, +}; const size_t hw_compat_6_2_len = G_N_ELEMENTS(hw_compat_6_2); GlobalProperty hw_compat_6_1[] = { -- 2.31.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types 2022-02-24 17:44 ` [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types Igor Mammedov @ 2022-02-24 18:11 ` Michael S. Tsirkin 2022-02-25 8:25 ` Igor Mammedov 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-24 18:11 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Thu, Feb 24, 2022 at 12:44:11PM -0500, Igor Mammedov wrote: > Q35 switched to ACPI PCI hotplug by default in since 6.1 > machine type and migration worked as expected (with BARs > on target being the same as on source) > > However native PCIe fixes [1] merged in 6.2 time, regressed > migration part, resulting in disabled BARs on target. The > issue affects pc-q35-6.2 and pc-q35-6.1 machine types (and > older if qemu-6.2 binary is used on source with manually > enabled ACPI PCI hotplug). > > Introduce x-pcihp-enable-pcie-pcp-cap compat property to > keep 6.2 and older machine types in broken state when ACPI > PCI hotplug is enabled to make sure that guest does see the > same PCIe device and slot on rc & dst (i.e. with or without > power control present/configured). > > 1) commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > include/hw/acpi/pcihp.h | 4 +++- > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > hw/acpi/pcihp.c | 20 ++++++++++++-------- > hw/acpi/piix4.c | 3 ++- > hw/core/machine.c | 4 +++- > 6 files changed, 42 insertions(+), 13 deletions(-) > > diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h > index af1a169fc3..06466dd2a8 100644 > --- a/include/hw/acpi/pcihp.h > +++ b/include/hw/acpi/pcihp.h > @@ -52,6 +52,7 @@ typedef struct AcpiPciHpState { > bool legacy_piix; > uint16_t io_base; > uint16_t io_len; > + bool enable_pcie_pcp_cap; > } AcpiPciHpState; > > void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, > @@ -59,7 +60,8 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, > uint16_t io_base); > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > - DeviceState *dev, Error **errp); > + AcpiPciHpState *s, DeviceState *dev, > + Error **errp); > void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > DeviceState *dev, Error **errp); > void acpi_pcihp_device_unplug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > diff --git a/hw/acpi/acpi-pci-hotplug-stub.c b/hw/acpi/acpi-pci-hotplug-stub.c > index 734e4c5986..6904b772b3 100644 > --- a/hw/acpi/acpi-pci-hotplug-stub.c > +++ b/hw/acpi/acpi-pci-hotplug-stub.c > @@ -18,7 +18,8 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > } > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > - DeviceState *dev, Error **errp) > + AcpiPciHpState *s, DeviceState *dev, > + Error **errp) > { > return; > } > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c > index bd9bbade70..928d5c101c 100644 > --- a/hw/acpi/ich9.c > +++ b/hw/acpi/ich9.c > @@ -430,6 +430,21 @@ static void ich9_pm_set_keep_pci_slot_hpc(Object *obj, bool value, Error **errp) > s->pm.keep_pci_slot_hpc = value; > } > > +static bool ich9_pm_get_enable_pcie_pcp_cap(Object *obj, Error **errp) > +{ > + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); > + > + return s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap; > +} > + > +static void ich9_pm_set_enable_pcie_pcp_cap(Object *obj, bool value, > + Error **errp) > +{ > + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); > + > + s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap = value; > +} > + > void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) > { > static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; > @@ -469,6 +484,9 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) > object_property_add_bool(obj, "x-keep-pci-slot-hpc", > ich9_pm_get_keep_pci_slot_hpc, > ich9_pm_set_keep_pci_slot_hpc); > + object_property_add_bool(obj, "x-pcihp-enable-pcie-pcp-cap", > + ich9_pm_get_enable_pcie_pcp_cap, > + ich9_pm_set_enable_pcie_pcp_cap); > } > > void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, > @@ -477,7 +495,8 @@ void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, > ICH9LPCState *lpc = ICH9_LPC_DEVICE(hotplug_dev); > > if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { > - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); > + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &lpc->pm.acpi_pci_hotplug, > + dev, errp); > return; > } > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > index bc47d1bf64..4c1fdb2211 100644 > --- a/hw/acpi/pcihp.c > +++ b/hw/acpi/pcihp.c > @@ -291,7 +291,8 @@ void acpi_pcihp_reset(AcpiPciHpState *s, bool acpihp_root_off) > #define ONBOARD_INDEX_MAX (16 * 1024 - 1) > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > - DeviceState *dev, Error **errp) > + AcpiPciHpState *s, DeviceState *dev, > + Error **errp) > { > PCIDevice *pdev = PCI_DEVICE(dev); > > @@ -331,13 +332,16 @@ void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > g_cmp_uint32, NULL); > } > > - /* > - * since acpi_pcihp manages hotplug, disable PCI-E power control on slot > - */ > - if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || > - object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { > - object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, > - &error_abort); > + /* compat knob to preserve pci_config as in 6.2 & older when pcihp in use */ This comment probably belongs where the field is defined, not where it's used. > + if (s->enable_pcie_pcp_cap == false) { > + /* > + * since acpi_pcihp manages hotplug, disable PCI-E power control on slot Do we then need to check ACPI_PM_PROP_ACPI_PCIHP_BRIDGE to make sure acpi hotplug is actually enabled? > + */ > + if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || > + object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { > + object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, > + &error_abort); > + } > } > } > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c > index cc37fa3416..0d25f75112 100644 > --- a/hw/acpi/piix4.c > +++ b/hw/acpi/piix4.c > @@ -350,7 +350,8 @@ static void piix4_device_pre_plug_cb(HotplugHandler *hotplug_dev, > PIIX4PMState *s = PIIX4_PM(hotplug_dev); > > if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { > - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); > + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, > + errp); > } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { > if (!s->acpi_memory_hotplug.is_enabled) { > error_setg(errp, > diff --git a/hw/core/machine.c b/hw/core/machine.c > index d856485cb4..48ffc85f8d 100644 > --- a/hw/core/machine.c > +++ b/hw/core/machine.c > @@ -37,7 +37,9 @@ > #include "hw/virtio/virtio.h" > #include "hw/virtio/virtio-pci.h" > > -GlobalProperty hw_compat_6_2[] = {}; > +GlobalProperty hw_compat_6_2[] = { > + { "ICH9-LPC", "x-pcihp-enable-pcie-pcp-cap", "on" }, > +}; > const size_t hw_compat_6_2_len = G_N_ELEMENTS(hw_compat_6_2); > > GlobalProperty hw_compat_6_1[] = { > -- > 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types 2022-02-24 18:11 ` Michael S. Tsirkin @ 2022-02-25 8:25 ` Igor Mammedov 0 siblings, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 8:25 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Thu, 24 Feb 2022 13:11:53 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 24, 2022 at 12:44:11PM -0500, Igor Mammedov wrote: > > Q35 switched to ACPI PCI hotplug by default in since 6.1 > > machine type and migration worked as expected (with BARs > > on target being the same as on source) > > > > However native PCIe fixes [1] merged in 6.2 time, regressed > > migration part, resulting in disabled BARs on target. The > > issue affects pc-q35-6.2 and pc-q35-6.1 machine types (and > > older if qemu-6.2 binary is used on source with manually > > enabled ACPI PCI hotplug). > > > > Introduce x-pcihp-enable-pcie-pcp-cap compat property to > > keep 6.2 and older machine types in broken state when ACPI > > PCI hotplug is enabled to make sure that guest does see the > > same PCIe device and slot on rc & dst (i.e. with or without > > power control present/configured). > > > > 1) commit d5daff7d312 (pcie: implement slot power control for pcie root ports) > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > > --- > > include/hw/acpi/pcihp.h | 4 +++- > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > hw/acpi/pcihp.c | 20 ++++++++++++-------- > > hw/acpi/piix4.c | 3 ++- > > hw/core/machine.c | 4 +++- > > 6 files changed, 42 insertions(+), 13 deletions(-) > > > > diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h > > index af1a169fc3..06466dd2a8 100644 > > --- a/include/hw/acpi/pcihp.h > > +++ b/include/hw/acpi/pcihp.h > > @@ -52,6 +52,7 @@ typedef struct AcpiPciHpState { > > bool legacy_piix; > > uint16_t io_base; > > uint16_t io_len; > > + bool enable_pcie_pcp_cap; > > } AcpiPciHpState; > > > > void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, > > @@ -59,7 +60,8 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *, PCIBus *root, > > uint16_t io_base); > > > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > > - DeviceState *dev, Error **errp); > > + AcpiPciHpState *s, DeviceState *dev, > > + Error **errp); > > void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > > DeviceState *dev, Error **errp); > > void acpi_pcihp_device_unplug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > > diff --git a/hw/acpi/acpi-pci-hotplug-stub.c b/hw/acpi/acpi-pci-hotplug-stub.c > > index 734e4c5986..6904b772b3 100644 > > --- a/hw/acpi/acpi-pci-hotplug-stub.c > > +++ b/hw/acpi/acpi-pci-hotplug-stub.c > > @@ -18,7 +18,8 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s, > > } > > > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > > - DeviceState *dev, Error **errp) > > + AcpiPciHpState *s, DeviceState *dev, > > + Error **errp) > > { > > return; > > } > > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c > > index bd9bbade70..928d5c101c 100644 > > --- a/hw/acpi/ich9.c > > +++ b/hw/acpi/ich9.c > > @@ -430,6 +430,21 @@ static void ich9_pm_set_keep_pci_slot_hpc(Object *obj, bool value, Error **errp) > > s->pm.keep_pci_slot_hpc = value; > > } > > > > +static bool ich9_pm_get_enable_pcie_pcp_cap(Object *obj, Error **errp) > > +{ > > + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); > > + > > + return s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap; > > +} > > + > > +static void ich9_pm_set_enable_pcie_pcp_cap(Object *obj, bool value, > > + Error **errp) > > +{ > > + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); > > + > > + s->pm.acpi_pci_hotplug.enable_pcie_pcp_cap = value; > > +} > > + > > void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) > > { > > static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; > > @@ -469,6 +484,9 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm) > > object_property_add_bool(obj, "x-keep-pci-slot-hpc", > > ich9_pm_get_keep_pci_slot_hpc, > > ich9_pm_set_keep_pci_slot_hpc); > > + object_property_add_bool(obj, "x-pcihp-enable-pcie-pcp-cap", > > + ich9_pm_get_enable_pcie_pcp_cap, > > + ich9_pm_set_enable_pcie_pcp_cap); > > } > > > > void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, > > @@ -477,7 +495,8 @@ void ich9_pm_device_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, > > ICH9LPCState *lpc = ICH9_LPC_DEVICE(hotplug_dev); > > > > if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { > > - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); > > + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &lpc->pm.acpi_pci_hotplug, > > + dev, errp); > > return; > > } > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > > index bc47d1bf64..4c1fdb2211 100644 > > --- a/hw/acpi/pcihp.c > > +++ b/hw/acpi/pcihp.c > > @@ -291,7 +291,8 @@ void acpi_pcihp_reset(AcpiPciHpState *s, bool acpihp_root_off) > > #define ONBOARD_INDEX_MAX (16 * 1024 - 1) > > > > void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > > - DeviceState *dev, Error **errp) > > + AcpiPciHpState *s, DeviceState *dev, > > + Error **errp) > > { > > PCIDevice *pdev = PCI_DEVICE(dev); > > > > @@ -331,13 +332,16 @@ void acpi_pcihp_device_pre_plug_cb(HotplugHandler *hotplug_dev, > > g_cmp_uint32, NULL); > > } > > > > - /* > > - * since acpi_pcihp manages hotplug, disable PCI-E power control on slot > > - */ > > - if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || > > - object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { > > - object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, > > - &error_abort); > > + /* compat knob to preserve pci_config as in 6.2 & older when pcihp in use */ > > This comment probably belongs where the field is defined, not where > it's used. ok > > > + if (s->enable_pcie_pcp_cap == false) { > > + /* > > + * since acpi_pcihp manages hotplug, disable PCI-E power control on slot > > Do we then need to check ACPI_PM_PROP_ACPI_PCIHP_BRIDGE to make sure > acpi hotplug is actually enabled? no need for it, this code path is called only when acpi hotplug is enabled. In case it's disabled these callbacks are not wired to affected PCI buses, and PCI[e] uses default native hotplug callbacks. > > + */ > > + if (object_dynamic_cast(OBJECT(dev), TYPE_PCIE_ROOT_PORT) || > > + object_dynamic_cast(OBJECT(dev), TYPE_XIO3130_DOWNSTREAM)) { > > + object_property_set_bool(OBJECT(dev), COMPAT_PROP_PCP, false, > > + &error_abort); > > + } > > } > > } > > > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c > > index cc37fa3416..0d25f75112 100644 > > --- a/hw/acpi/piix4.c > > +++ b/hw/acpi/piix4.c > > @@ -350,7 +350,8 @@ static void piix4_device_pre_plug_cb(HotplugHandler *hotplug_dev, > > PIIX4PMState *s = PIIX4_PM(hotplug_dev); > > > > if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { > > - acpi_pcihp_device_pre_plug_cb(hotplug_dev, dev, errp); > > + acpi_pcihp_device_pre_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, > > + errp); > > } else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { > > if (!s->acpi_memory_hotplug.is_enabled) { > > error_setg(errp, > > diff --git a/hw/core/machine.c b/hw/core/machine.c > > index d856485cb4..48ffc85f8d 100644 > > --- a/hw/core/machine.c > > +++ b/hw/core/machine.c > > @@ -37,7 +37,9 @@ > > #include "hw/virtio/virtio.h" > > #include "hw/virtio/virtio-pci.h" > > > > -GlobalProperty hw_compat_6_2[] = {}; > > +GlobalProperty hw_compat_6_2[] = { > > + { "ICH9-LPC", "x-pcihp-enable-pcie-pcp-cap", "on" }, > > +}; > > const size_t hw_compat_6_2_len = G_N_ELEMENTS(hw_compat_6_2); > > > > GlobalProperty hw_compat_6_1[] = { > > -- > > 2.31.1 > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov ` (3 preceding siblings ...) 2022-02-24 17:44 ` [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types Igor Mammedov @ 2022-02-24 18:08 ` Michael S. Tsirkin 2022-02-25 9:01 ` Igor Mammedov 2022-02-25 9:58 ` Michael S. Tsirkin 5 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-24 18:08 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > Currently ACPI PCI hotplug is enabled by default for Q35 machine > type and overrides native PCIe hotplug. It works as expected when > a PCIe device is hotplugged into slot, however the device becomes > in-operational after migration. Which is caused by BARs being > disabled on target due to powered off status of the slot. > > Proposed fix disables power control on PCIe slot when ACPI pcihp > takes over hotplug control and makes PCIe slot check if power > control is enabled before trying to change slot's power. Which > leaves slot always powered on and that makes PCI core keep BARs > enabled. > > PS: > it's still hacky approach as all ACPI PCI hotplug is, but it's > the least intrusive one. Alternative, I've considered, could be > chaining hotplug callbacks and making pcihp ones call overriden > native callbacks while inhibiting hotplug event in native callbacks > somehow. But that were a bit more intrusive and spills over to SHPC > if implemented systematically, so I ditched that for now. It could > be resurrected later on if current approach turns out to be > insufficient. Yes, I am wondering about this myself. Replace callbacks with some kind of notifier, so instead of overrides we add things? I will ponder this over the weekend. > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > CC: mst@redhat.com > CC: kraxel@redhat.com > > Igor Mammedov (4): > pci: expose TYPE_XIO3130_DOWNSTREAM name > pcie: update slot power status only is power control is enabled > acpi: pcihp: disable power control on PCIe slot > q35: compat: keep hotplugged PCIe device broken after migration for > 6.2-older machine types > > include/hw/acpi/pcihp.h | 4 +++- > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > hw/acpi/pcihp.c | 16 +++++++++++++++- > hw/acpi/piix4.c | 3 ++- > hw/core/machine.c | 4 +++- > hw/pci-bridge/xio3130_downstream.c | 3 ++- > hw/pci/pcie.c | 5 ++--- > 9 files changed, 64 insertions(+), 10 deletions(-) > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > -- > 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-24 18:08 ` [PATCH 0/4] Fix broken PCIe device after migration Michael S. Tsirkin @ 2022-02-25 9:01 ` Igor Mammedov 0 siblings, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 9:01 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Thu, 24 Feb 2022 13:08:11 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > type and overrides native PCIe hotplug. It works as expected when > > a PCIe device is hotplugged into slot, however the device becomes > > in-operational after migration. Which is caused by BARs being > > disabled on target due to powered off status of the slot. > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > takes over hotplug control and makes PCIe slot check if power > > control is enabled before trying to change slot's power. Which > > leaves slot always powered on and that makes PCI core keep BARs > > enabled. > > > > PS: > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > the least intrusive one. Alternative, I've considered, could be > > chaining hotplug callbacks and making pcihp ones call overriden > > native callbacks while inhibiting hotplug event in native callbacks > > somehow. But that were a bit more intrusive and spills over to SHPC > > if implemented systematically, so I ditched that for now. It could > > be resurrected later on if current approach turns out to be > > insufficient. > > Yes, I am wondering about this myself. Replace callbacks with > some kind of notifier, so instead of overrides we add things? > I will ponder this over the weekend. Callback overrides with optional chaining worked fine so far, Chaining is just a bit more complicated as often one need to refactor old code on pre and plug stages and think about how to partition job between involved components. But they are very explicit about what's supposed to call what and in what order and graceful error handling. And I dislike notifiers exactly for the lack of those properties and more difficult from review pov. I think [ATM] for this issue chaining callbacks is overkill, but maybe in future it can be done if an idea of having PCI slots described in ACPI + native hotplug proves to be a viable. > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > CC: mst@redhat.com > > CC: kraxel@redhat.com > > > > Igor Mammedov (4): > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > pcie: update slot power status only is power control is enabled > > acpi: pcihp: disable power control on PCIe slot > > q35: compat: keep hotplugged PCIe device broken after migration for > > 6.2-older machine types > > > > include/hw/acpi/pcihp.h | 4 +++- > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > hw/acpi/piix4.c | 3 ++- > > hw/core/machine.c | 4 +++- > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > hw/pci/pcie.c | 5 ++--- > > 9 files changed, 64 insertions(+), 10 deletions(-) > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > -- > > 2.31.1 > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov ` (4 preceding siblings ...) 2022-02-24 18:08 ` [PATCH 0/4] Fix broken PCIe device after migration Michael S. Tsirkin @ 2022-02-25 9:58 ` Michael S. Tsirkin 2022-02-25 13:18 ` Igor Mammedov 5 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 9:58 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > Currently ACPI PCI hotplug is enabled by default for Q35 machine > type and overrides native PCIe hotplug. It works as expected when > a PCIe device is hotplugged into slot, however the device becomes > in-operational after migration. Which is caused by BARs being > disabled on target due to powered off status of the slot. > > Proposed fix disables power control on PCIe slot when ACPI pcihp > takes over hotplug control and makes PCIe slot check if power > control is enabled before trying to change slot's power. Which > leaves slot always powered on and that makes PCI core keep BARs > enabled. I thought some more about this. One of the reasons we did not remove the hotplug capability is really so it's easier to layer acpi on top of pcihp if we want to do it down the road. And it would be quite annoying if we had to add more hack to go back to having capability. How about instead of patch 3 we call pci_set_power(dev, true) for all devices where ACPI is managing hotplug immediately on plug? This will fix the bug avoiding headache with migration. Patch 2 does seem like a good idea. > PS: > it's still hacky approach as all ACPI PCI hotplug is, but it's > the least intrusive one. Alternative, I've considered, could be > chaining hotplug callbacks and making pcihp ones call overriden > native callbacks while inhibiting hotplug event in native callbacks > somehow. But that were a bit more intrusive and spills over to SHPC > if implemented systematically, so I ditched that for now. It could > be resurrected later on if current approach turns out to be > insufficient. > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > CC: mst@redhat.com > CC: kraxel@redhat.com > > Igor Mammedov (4): > pci: expose TYPE_XIO3130_DOWNSTREAM name > pcie: update slot power status only is power control is enabled > acpi: pcihp: disable power control on PCIe slot > q35: compat: keep hotplugged PCIe device broken after migration for > 6.2-older machine types > > include/hw/acpi/pcihp.h | 4 +++- > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > hw/acpi/pcihp.c | 16 +++++++++++++++- > hw/acpi/piix4.c | 3 ++- > hw/core/machine.c | 4 +++- > hw/pci-bridge/xio3130_downstream.c | 3 ++- > hw/pci/pcie.c | 5 ++--- > 9 files changed, 64 insertions(+), 10 deletions(-) > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > -- > 2.31.1 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 9:58 ` Michael S. Tsirkin @ 2022-02-25 13:18 ` Igor Mammedov 2022-02-25 13:50 ` Michael S. Tsirkin 2022-02-25 14:32 ` Igor Mammedov 0 siblings, 2 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 13:18 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Fri, 25 Feb 2022 04:58:46 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > type and overrides native PCIe hotplug. It works as expected when > > a PCIe device is hotplugged into slot, however the device becomes > > in-operational after migration. Which is caused by BARs being > > disabled on target due to powered off status of the slot. > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > takes over hotplug control and makes PCIe slot check if power > > control is enabled before trying to change slot's power. Which > > leaves slot always powered on and that makes PCI core keep BARs > > enabled. > > > I thought some more about this. One of the reasons we > did not remove the hotplug capability is really so > it's easier to layer acpi on top of pcihp if we > want to do it down the road. And it would be quite annoying > if we had to add more hack to go back to having capability. > > How about instead of patch 3 we call pci_set_power(dev, true) for all > devices where ACPI is managing hotplug immediately on plug? This will > fix the bug avoiding headache with migration. true it would be more migration friendly (v6.2 still broken but that can't be helped), since we won't alter pci_config at all. Although it's still more hackish compared to disabling SLTCAP_PCP (though it seems guest OSes have no issues with SLTCAP_PCP being present but not really operational, at least for ~6months the thing was released (6.1-6.2-now)). Let me play with this idea and see if it works and at what cost, though I still prefer cleaner SLTCAP_PCP disabling to make sure guest OS won't get wrong idea about power control being present when it's not actually not. > Patch 2 does seem like a good idea. > > > PS: > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > the least intrusive one. Alternative, I've considered, could be > > chaining hotplug callbacks and making pcihp ones call overriden > > native callbacks while inhibiting hotplug event in native callbacks > > somehow. But that were a bit more intrusive and spills over to SHPC > > if implemented systematically, so I ditched that for now. It could > > be resurrected later on if current approach turns out to be > > insufficient. > > > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > CC: mst@redhat.com > > CC: kraxel@redhat.com > > > > Igor Mammedov (4): > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > pcie: update slot power status only is power control is enabled > > acpi: pcihp: disable power control on PCIe slot > > q35: compat: keep hotplugged PCIe device broken after migration for > > 6.2-older machine types > > > > include/hw/acpi/pcihp.h | 4 +++- > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > hw/acpi/piix4.c | 3 ++- > > hw/core/machine.c | 4 +++- > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > hw/pci/pcie.c | 5 ++--- > > 9 files changed, 64 insertions(+), 10 deletions(-) > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > -- > > 2.31.1 > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 13:18 ` Igor Mammedov @ 2022-02-25 13:50 ` Michael S. Tsirkin 2022-02-25 15:50 ` Igor Mammedov 2022-02-28 7:49 ` Gerd Hoffmann 2022-02-25 14:32 ` Igor Mammedov 1 sibling, 2 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-25 13:50 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Fri, Feb 25, 2022 at 02:18:23PM +0100, Igor Mammedov wrote: > On Fri, 25 Feb 2022 04:58:46 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > > type and overrides native PCIe hotplug. It works as expected when > > > a PCIe device is hotplugged into slot, however the device becomes > > > in-operational after migration. Which is caused by BARs being > > > disabled on target due to powered off status of the slot. > > > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > > takes over hotplug control and makes PCIe slot check if power > > > control is enabled before trying to change slot's power. Which > > > leaves slot always powered on and that makes PCI core keep BARs > > > enabled. > > > > > > I thought some more about this. One of the reasons we > > did not remove the hotplug capability is really so > > it's easier to layer acpi on top of pcihp if we > > want to do it down the road. And it would be quite annoying > > if we had to add more hack to go back to having capability. > > > > How about instead of patch 3 we call pci_set_power(dev, true) for all > > devices where ACPI is managing hotplug immediately on plug? This will > > fix the bug avoiding headache with migration. > > true it would be more migration friendly (v6.2 still broken > but that can't be helped), since we won't alter pci_config at all. > Although it's still more hackish compared to disabling SLTCAP_PCP > (though it seems guest OSes have no issues with SLTCAP_PCP being > present but not really operational, at least for ~6months the thing > was released (6.1-6.2-now)). > > Let me play with this idea and see if it works and at what cost, > though I still prefer cleaner SLTCAP_PCP disabling to make sure > guest OS won't get wrong idea about power control being present > when it's not actually not. Well the control is present, isn't it? Can be used to e.g. reset the device behind the bridge. > > > Patch 2 does seem like a good idea. > > > > > PS: > > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > > the least intrusive one. Alternative, I've considered, could be > > > chaining hotplug callbacks and making pcihp ones call overriden > > > native callbacks while inhibiting hotplug event in native callbacks > > > somehow. But that were a bit more intrusive and spills over to SHPC > > > if implemented systematically, so I ditched that for now. It could > > > be resurrected later on if current approach turns out to be > > > insufficient. > > > > > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > CC: mst@redhat.com > > > CC: kraxel@redhat.com > > > > > > Igor Mammedov (4): > > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > > pcie: update slot power status only is power control is enabled > > > acpi: pcihp: disable power control on PCIe slot > > > q35: compat: keep hotplugged PCIe device broken after migration for > > > 6.2-older machine types > > > > > > include/hw/acpi/pcihp.h | 4 +++- > > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > > hw/acpi/piix4.c | 3 ++- > > > hw/core/machine.c | 4 +++- > > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > > hw/pci/pcie.c | 5 ++--- > > > 9 files changed, 64 insertions(+), 10 deletions(-) > > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > > > -- > > > 2.31.1 > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 13:50 ` Michael S. Tsirkin @ 2022-02-25 15:50 ` Igor Mammedov 2022-02-27 10:22 ` Michael S. Tsirkin 2022-02-28 7:49 ` Gerd Hoffmann 1 sibling, 1 reply; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 15:50 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Fri, 25 Feb 2022 08:50:43 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Fri, Feb 25, 2022 at 02:18:23PM +0100, Igor Mammedov wrote: > > On Fri, 25 Feb 2022 04:58:46 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > > > type and overrides native PCIe hotplug. It works as expected when > > > > a PCIe device is hotplugged into slot, however the device becomes > > > > in-operational after migration. Which is caused by BARs being > > > > disabled on target due to powered off status of the slot. > > > > > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > > > takes over hotplug control and makes PCIe slot check if power > > > > control is enabled before trying to change slot's power. Which > > > > leaves slot always powered on and that makes PCI core keep BARs > > > > enabled. > > > > > > > > > I thought some more about this. One of the reasons we > > > did not remove the hotplug capability is really so > > > it's easier to layer acpi on top of pcihp if we > > > want to do it down the road. And it would be quite annoying > > > if we had to add more hack to go back to having capability. > > > > > > How about instead of patch 3 we call pci_set_power(dev, true) for all > > > devices where ACPI is managing hotplug immediately on plug? This will > > > fix the bug avoiding headache with migration. > > > > true it would be more migration friendly (v6.2 still broken > > but that can't be helped), since we won't alter pci_config at all. > > Although it's still more hackish compared to disabling SLTCAP_PCP > > (though it seems guest OSes have no issues with SLTCAP_PCP being > > present but not really operational, at least for ~6months the thing > > was released (6.1-6.2-now)). > > > > Let me play with this idea and see if it works and at what cost, > > though I still prefer cleaner SLTCAP_PCP disabling to make sure > > guest OS won't get wrong idea about power control being present > > when it's not actually not. > > Well the control is present, isn't it? Can be used to e.g. reset the > device behind the bridge. can you point to how reset is supposed to work? > > > > > > Patch 2 does seem like a good idea. > > > > > > > PS: > > > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > > > the least intrusive one. Alternative, I've considered, could be > > > > chaining hotplug callbacks and making pcihp ones call overriden > > > > native callbacks while inhibiting hotplug event in native callbacks > > > > somehow. But that were a bit more intrusive and spills over to SHPC > > > > if implemented systematically, so I ditched that for now. It could > > > > be resurrected later on if current approach turns out to be > > > > insufficient. > > > > > > > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > > CC: mst@redhat.com > > > > CC: kraxel@redhat.com > > > > > > > > Igor Mammedov (4): > > > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > > > pcie: update slot power status only is power control is enabled > > > > acpi: pcihp: disable power control on PCIe slot > > > > q35: compat: keep hotplugged PCIe device broken after migration for > > > > 6.2-older machine types > > > > > > > > include/hw/acpi/pcihp.h | 4 +++- > > > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > > > hw/acpi/piix4.c | 3 ++- > > > > hw/core/machine.c | 4 +++- > > > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > > > hw/pci/pcie.c | 5 ++--- > > > > 9 files changed, 64 insertions(+), 10 deletions(-) > > > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > > > > > -- > > > > 2.31.1 > > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 15:50 ` Igor Mammedov @ 2022-02-27 10:22 ` Michael S. Tsirkin 0 siblings, 0 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-02-27 10:22 UTC (permalink / raw) To: Igor Mammedov; +Cc: qemu-devel, kraxel On Fri, Feb 25, 2022 at 04:50:54PM +0100, Igor Mammedov wrote: > On Fri, 25 Feb 2022 08:50:43 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Fri, Feb 25, 2022 at 02:18:23PM +0100, Igor Mammedov wrote: > > > On Fri, 25 Feb 2022 04:58:46 -0500 > > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > > > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > > > > type and overrides native PCIe hotplug. It works as expected when > > > > > a PCIe device is hotplugged into slot, however the device becomes > > > > > in-operational after migration. Which is caused by BARs being > > > > > disabled on target due to powered off status of the slot. > > > > > > > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > > > > takes over hotplug control and makes PCIe slot check if power > > > > > control is enabled before trying to change slot's power. Which > > > > > leaves slot always powered on and that makes PCI core keep BARs > > > > > enabled. > > > > > > > > > > > > I thought some more about this. One of the reasons we > > > > did not remove the hotplug capability is really so > > > > it's easier to layer acpi on top of pcihp if we > > > > want to do it down the road. And it would be quite annoying > > > > if we had to add more hack to go back to having capability. > > > > > > > > How about instead of patch 3 we call pci_set_power(dev, true) for all > > > > devices where ACPI is managing hotplug immediately on plug? This will > > > > fix the bug avoiding headache with migration. > > > > > > true it would be more migration friendly (v6.2 still broken > > > but that can't be helped), since we won't alter pci_config at all. > > > Although it's still more hackish compared to disabling SLTCAP_PCP > > > (though it seems guest OSes have no issues with SLTCAP_PCP being > > > present but not really operational, at least for ~6months the thing > > > was released (6.1-6.2-now)). > > > > > > Let me play with this idea and see if it works and at what cost, > > > though I still prefer cleaner SLTCAP_PCP disabling to make sure > > > guest OS won't get wrong idea about power control being present > > > when it's not actually not. > > > > Well the control is present, isn't it? Can be used to e.g. reset the > > device behind the bridge. > > can you point to how reset is supposed to work? Well, I am alluding to this code in linux static const struct pci_reset_fn_method pci_reset_fn_methods[] = { { }, { pci_dev_specific_reset, .name = "device_specific" }, { pci_dev_acpi_reset, .name = "acpi" }, { pcie_reset_flr, .name = "flr" }, { pci_af_flr, .name = "af_flr" }, { pci_pm_reset, .name = "pm" }, { pci_reset_bus_function, .name = "bus" }, }; Thinkably down the road linux could add a method powering the secondary bus off then back on as a way to reset devices behind it. There are plenty of other ways so it's not that I can say why that specific way of doing it is useful. > > > > > > > > > Patch 2 does seem like a good idea. > > > > > > > > > PS: > > > > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > > > > the least intrusive one. Alternative, I've considered, could be > > > > > chaining hotplug callbacks and making pcihp ones call overriden > > > > > native callbacks while inhibiting hotplug event in native callbacks > > > > > somehow. But that were a bit more intrusive and spills over to SHPC > > > > > if implemented systematically, so I ditched that for now. It could > > > > > be resurrected later on if current approach turns out to be > > > > > insufficient. > > > > > > > > > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > > > CC: mst@redhat.com > > > > > CC: kraxel@redhat.com > > > > > > > > > > Igor Mammedov (4): > > > > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > > > > pcie: update slot power status only is power control is enabled > > > > > acpi: pcihp: disable power control on PCIe slot > > > > > q35: compat: keep hotplugged PCIe device broken after migration for > > > > > 6.2-older machine types > > > > > > > > > > include/hw/acpi/pcihp.h | 4 +++- > > > > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > > > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > > > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > > > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > > > > hw/acpi/piix4.c | 3 ++- > > > > > hw/core/machine.c | 4 +++- > > > > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > > > > hw/pci/pcie.c | 5 ++--- > > > > > 9 files changed, 64 insertions(+), 10 deletions(-) > > > > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > > > > > > > -- > > > > > 2.31.1 > > > > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 13:50 ` Michael S. Tsirkin 2022-02-25 15:50 ` Igor Mammedov @ 2022-02-28 7:49 ` Gerd Hoffmann 1 sibling, 0 replies; 29+ messages in thread From: Gerd Hoffmann @ 2022-02-28 7:49 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Igor Mammedov, qemu-devel Hi, > Well the control is present, isn't it? Can be used to e.g. reset the > device behind the bridge. Well, not right now b/c poweroff ejects the device. Would need a patch like this ... --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -755,7 +755,8 @@ void pcie_cap_slot_write_config(PCIDevice *dev, if ((sltsta & PCI_EXP_SLTSTA_PDS) && (val & PCI_EXP_SLTCTL_PCC) && (val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF && (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) || - (old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF)) { + (old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF) && + dev->pending_deleted_event) { pcie_cap_slot_do_unplug(dev); } pcie_cap_update_power(dev); ... so pcie unplugs on poweroff only in case there is a pending unplug request. But with that the guest wouldn't be able to unplug devices on its own. take care, Gerd ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] Fix broken PCIe device after migration 2022-02-25 13:18 ` Igor Mammedov 2022-02-25 13:50 ` Michael S. Tsirkin @ 2022-02-25 14:32 ` Igor Mammedov 1 sibling, 0 replies; 29+ messages in thread From: Igor Mammedov @ 2022-02-25 14:32 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: qemu-devel, kraxel On Fri, 25 Feb 2022 14:18:23 +0100 Igor Mammedov <imammedo@redhat.com> wrote: > On Fri, 25 Feb 2022 04:58:46 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Thu, Feb 24, 2022 at 12:44:07PM -0500, Igor Mammedov wrote: > > > Currently ACPI PCI hotplug is enabled by default for Q35 machine > > > type and overrides native PCIe hotplug. It works as expected when > > > a PCIe device is hotplugged into slot, however the device becomes > > > in-operational after migration. Which is caused by BARs being > > > disabled on target due to powered off status of the slot. > > > > > > Proposed fix disables power control on PCIe slot when ACPI pcihp > > > takes over hotplug control and makes PCIe slot check if power > > > control is enabled before trying to change slot's power. Which > > > leaves slot always powered on and that makes PCI core keep BARs > > > enabled. > > > > > > I thought some more about this. One of the reasons we > > did not remove the hotplug capability is really so > > it's easier to layer acpi on top of pcihp if we > > want to do it down the road. And it would be quite annoying > > if we had to add more hack to go back to having capability. > > > > How about instead of patch 3 we call pci_set_power(dev, true) for all > > devices where ACPI is managing hotplug immediately on plug? This will > > fix the bug avoiding headache with migration. > > true it would be more migration friendly (v6.2 still broken > but that can't be helped), since we won't alter pci_config at all. > Although it's still more hackish compared to disabling SLTCAP_PCP > (though it seems guest OSes have no issues with SLTCAP_PCP being > present but not really operational, at least for ~6months the thing > was released (6.1-6.2-now)). > > Let me play with this idea and see if it works and at what cost, > though I still prefer cleaner SLTCAP_PCP disabling to make sure > guest OS won't get wrong idea about power control being present > when it's not actually not. It's not going to work, plug time is not the problem here. The hot-plugged device already ends up in power on state by default. It's later on target pcie_cap_slot_post_load() updates its power to 'off' due to sltctl & SLTCTL_PCC == 400 options if we go fixup route are 1) from ich9_pm_post_load() scan PCI hierarchy and do fixup it's currently called after root port pcie_cap_slot_reset() in simple test case. It's sketchy and fragile to rely on particular load() order, if it's somehow changed in future it will (silently) break this dependency. 2) do fixup from pcie_cap_slot_post_load(), means polluting generic PCI code with APCI hotplug idiosyncrasies. I thought you would be the first one to shoot that down. I guess, I pretty much convinced myself that instead of fixing up broken SLTCTL_PCC on the fly, it's more correct to fix the root issue (non functional power control) on source by disabling it completely. Quick smoke testing with Linux(FC14) and Windows (2012r2) shows it works as expected. Alternative idea of chaining native pcie callbacks into acpi ones, looks inferior as well since we would have pollute native ones with ACPI logic to inhibit native hotplug event while keeping its power mgmt part working. > > Patch 2 does seem like a good idea. > > > > > PS: > > > it's still hacky approach as all ACPI PCI hotplug is, but it's > > > the least intrusive one. Alternative, I've considered, could be > > > chaining hotplug callbacks and making pcihp ones call overriden > > > native callbacks while inhibiting hotplug event in native callbacks > > > somehow. But that were a bit more intrusive and spills over to SHPC > > > if implemented systematically, so I ditched that for now. It could > > > be resurrected later on if current approach turns out to be > > > insufficient. > > > > > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 > > > CC: mst@redhat.com > > > CC: kraxel@redhat.com > > > > > > Igor Mammedov (4): > > > pci: expose TYPE_XIO3130_DOWNSTREAM name > > > pcie: update slot power status only is power control is enabled > > > acpi: pcihp: disable power control on PCIe slot > > > q35: compat: keep hotplugged PCIe device broken after migration for > > > 6.2-older machine types > > > > > > include/hw/acpi/pcihp.h | 4 +++- > > > include/hw/pci-bridge/xio3130_downstream.h | 15 +++++++++++++++ > > > hw/acpi/acpi-pci-hotplug-stub.c | 3 ++- > > > hw/acpi/ich9.c | 21 ++++++++++++++++++++- > > > hw/acpi/pcihp.c | 16 +++++++++++++++- > > > hw/acpi/piix4.c | 3 ++- > > > hw/core/machine.c | 4 +++- > > > hw/pci-bridge/xio3130_downstream.c | 3 ++- > > > hw/pci/pcie.c | 5 ++--- > > > 9 files changed, 64 insertions(+), 10 deletions(-) > > > create mode 100644 include/hw/pci-bridge/xio3130_downstream.h > > > > > > -- > > > 2.31.1 > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2022-02-28 9:00 UTC | newest] Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-02-24 17:44 [PATCH 0/4] Fix broken PCIe device after migration Igor Mammedov 2022-02-24 17:44 ` [PATCH 1/4] pci: expose TYPE_XIO3130_DOWNSTREAM name Igor Mammedov 2022-02-24 17:44 ` [PATCH 2/4] pcie: update slot power status only is power control is enabled Igor Mammedov 2022-02-24 18:05 ` Michael S. Tsirkin 2022-02-25 8:18 ` Igor Mammedov 2022-02-25 9:51 ` Michael S. Tsirkin 2022-02-25 10:05 ` Michael S. Tsirkin 2022-02-25 10:12 ` Gerd Hoffmann 2022-02-25 10:35 ` Michael S. Tsirkin 2022-02-25 13:02 ` Igor Mammedov 2022-02-25 13:08 ` Michael S. Tsirkin 2022-02-25 13:35 ` Igor Mammedov 2022-02-25 13:48 ` Michael S. Tsirkin 2022-02-25 15:39 ` Igor Mammedov 2022-02-28 7:39 ` Gerd Hoffmann 2022-02-28 8:55 ` Igor Mammedov 2022-02-24 17:44 ` [PATCH 3/4] acpi: pcihp: disable power control on PCIe slot Igor Mammedov 2022-02-24 17:44 ` [PATCH 4/4] q35: compat: keep hotplugged PCIe device broken after migration for 6.2-older machine types Igor Mammedov 2022-02-24 18:11 ` Michael S. Tsirkin 2022-02-25 8:25 ` Igor Mammedov 2022-02-24 18:08 ` [PATCH 0/4] Fix broken PCIe device after migration Michael S. Tsirkin 2022-02-25 9:01 ` Igor Mammedov 2022-02-25 9:58 ` Michael S. Tsirkin 2022-02-25 13:18 ` Igor Mammedov 2022-02-25 13:50 ` Michael S. Tsirkin 2022-02-25 15:50 ` Igor Mammedov 2022-02-27 10:22 ` Michael S. Tsirkin 2022-02-28 7:49 ` Gerd Hoffmann 2022-02-25 14:32 ` Igor Mammedov
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.