All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
@ 2020-07-30 15:58 Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
                   ` (4 more replies)
  0 siblings, 5 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 15:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: vit9696, Eduardo Habkost, Paolo Bonzini, Igor Mammedov,
	Laszlo Ersek, Richard Henderson

macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
while OVMF firmware gets them via an internal channel through QEMU.
Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
different values, and this makes the underlying operating system
unable to report its boot option.

The particular node in question is the primary PciRoot (PCI0 in ACPI),
which for some reason gets assigned 1 in ACPI UID and 0 in the
DevicePath. This is due to the _UID assigned to it by build_dsdt in
hw/i386/acpi-build.c Which does not correspond to the primary PCI
identifier given by pcibus_num in hw/pci/pci.c

Reference with the device paths, OVMF startup logs, and ACPI table
dumps (SysReport):
https://github.com/acidanthera/bugtracker/issues/1050

In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
the paragraph,

    Root PCI bridges will use the plug and play ID of PNP0A03, This will
    be stored in the ACPI Device Path _HID field, or in the Expanded
    ACPI Device Path _CID field to match the ACPI name space. The _UID
    in the ACPI Device Path structure must match the _UID in the ACPI
    name space.

(See especially the last sentence.)

Considering *extra* root bridges / root buses (with bus number > 0),
QEMU's ACPI generator actually does the right thing; since QEMU commit
c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
root buses", 2015-06-11).

However, the _UID values for root bridge zero (on both i440fx and q35)
have always been "wrong" (from UEFI perspective), going back in QEMU to
commit 74523b850189 ("i386: add ACPI table files from seabios",
2013-10-14).

Even in SeaBIOS, these _UID values have always been 1; see commit
a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
for q35.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: vit9696 <vit9696@protonmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b7bcbbbb2a..7a5a8b3521 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         dev = aml_device("PCI0");
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
 
@@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(dev, build_q35_osc_method());
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
-- 
MST



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
@ 2020-07-30 15:58 ` Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
                     ` (3 more replies)
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  4 siblings, 4 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 15:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, vit9696, Eduardo Habkost, Shannon Zhao, qemu-arm,
	Paolo Bonzini, Igor Mammedov, Laszlo Ersek, Richard Henderson

On ARM/virt machine type QEMU currently reports an incorrect _UID in
ACPI.

The particular node in question is the primary PciRoot (PCI0 in ACPI),
which gets assigned PCI0 in ACPI UID and 0 in the
DevicePath. This is due to the _UID assigned to it by build_dsdt in
hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
identifier given by pcibus_num in hw/pci/pci.c

In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
the paragraph,

    Root PCI bridges will use the plug and play ID of PNP0A03, This will
    be stored in the ACPI Device Path _HID field, or in the Expanded
    ACPI Device Path _CID field to match the ACPI name space. The _UID
    in the ACPI Device Path structure must match the _UID in the ACPI
    name space.

(See especially the last sentence.)

A similar bug has been reported on i386, on that architecture it has
been reported to confuse at least macOS which uses ACPI UIDs to build
the DevicePath for NVRAM boot options, while OVMF firmware gets them via
an internal channel through QEMU.  When UEFI firmware and ACPI have
different values, this makes the underlying operating system unable to
report its boot option.

Reported-by: vit9696 <vit9696@protonmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Peter can you either ack or merge this one pls?

 hw/arm/virt-acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 91f0df7b13..0a482ff6f7 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
     aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
     aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
     aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
-    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
+    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
     aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
     aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
 
-- 
MST



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
@ 2020-07-30 16:11 ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Michael S. Tsirkin
  2020-07-30 19:34 ` Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-30 16:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: vit9696, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Laszlo Ersek, Richard Henderson

On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>

Vitaly uses his full name on EDK2 mailing list, so I don't think he'll
have a problem to use it in QEMU too:
Tested-by: Vitaly Cheptsov <vit9696@protonmail.com>

From:
https://wiki.qemu.org/Contribute/SubmitAPatch#Patch_emails_must_include_a_Signed-off-by:_line
"Please use your real name to sign a patch (not an alias or acronym)."

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
@ 2020-07-30 16:12   ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Laszlo Ersek
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-30 16:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Peter Maydell, vit9696, Shannon Zhao, qemu-arm, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>

Ditto:
Reported-by: Vitaly Cheptsov <vit9696@protonmail.com>

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
@ 2020-07-30 19:34 ` Laszlo Ersek
  2020-07-31  9:30 ` Igor Mammedov
  2021-02-27 19:41 ` Thomas Lamprecht
  4 siblings, 0 replies; 30+ messages in thread
From: Laszlo Ersek @ 2020-07-30 19:34 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Eduardo Habkost, Paolo Bonzini, Igor Mammedov, vit9696,
	Richard Henderson

On 07/30/20 17:58, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 

with Phil's feedback included:

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thank you very much for writing up the commit message on this patch,
Michael!

Laszlo



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
@ 2020-07-30 19:35   ` Laszlo Ersek
  2020-07-30 20:33   ` Peter Maydell
  2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 30+ messages in thread
From: Laszlo Ersek @ 2020-07-30 19:35 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Shannon Zhao, qemu-arm,
	Paolo Bonzini, Igor Mammedov, vit9696, Richard Henderson

On 07/30/20 17:58, Michael S. Tsirkin wrote:
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  
> 

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

(Vitaly's full name could be included in the Reported-by here as well,
arguably.)

Thanks!
Laszlo



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
@ 2020-07-30 19:35   ` Michael S. Tsirkin
  2020-07-31  2:55     ` vit9696 via
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 19:35 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, qemu-devel, Igor Mammedov, Paolo Bonzini,
	Richard Henderson, Laszlo Ersek, vit9696

On Thu, Jul 30, 2020 at 06:11:17PM +0200, Philippe Mathieu-Daudé wrote:
> On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > while OVMF firmware gets them via an internal channel through QEMU.
> > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > different values, and this makes the underlying operating system
> > unable to report its boot option.
> > 
> > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > identifier given by pcibus_num in hw/pci/pci.c
> > 
> > Reference with the device paths, OVMF startup logs, and ACPI table
> > dumps (SysReport):
> > https://github.com/acidanthera/bugtracker/issues/1050
> > 
> > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > the paragraph,
> > 
> >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> >     be stored in the ACPI Device Path _HID field, or in the Expanded
> >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> >     in the ACPI Device Path structure must match the _UID in the ACPI
> >     name space.
> > 
> > (See especially the last sentence.)
> > 
> > Considering *extra* root bridges / root buses (with bus number > 0),
> > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > root buses", 2015-06-11).
> > 
> > However, the _UID values for root bridge zero (on both i440fx and q35)
> > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > 2013-10-14).
> > 
> > Even in SeaBIOS, these _UID values have always been 1; see commit
> > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > for q35.
> > 
> > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > Tested-by: vit9696 <vit9696@protonmail.com>
> 
> Vitaly uses his full name on EDK2 mailing list, so I don't think he'll
> have a problem to use it in QEMU too:
> Tested-by: Vitaly Cheptsov <vit9696@protonmail.com>
> 
> From:
> https://wiki.qemu.org/Contribute/SubmitAPatch#Patch_emails_must_include_a_Signed-off-by:_line
> "Please use your real name to sign a patch (not an alias or acronym)."

Right. Tested-by is different though, I don't think we have
a problem with anonymous testing.

Anyway, updated.

> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/acpi-build.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index b7bcbbbb2a..7a5a8b3521 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          dev = aml_device("PCI0");
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >  
> > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(dev, build_q35_osc_method());
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> > 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Laszlo Ersek
@ 2020-07-30 20:33   ` Peter Maydell
  2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 30+ messages in thread
From: Peter Maydell @ 2020-07-30 20:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: vit9696, QEMU Developers, Eduardo Habkost, Shannon Zhao,
	qemu-arm, Paolo Bonzini, Igor Mammedov, Laszlo Ersek,
	Richard Henderson

On Thu, 30 Jul 2020 at 16:58, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
>
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
>
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
>
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
>
> (See especially the last sentence.)
>
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
>
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> Peter can you either ack or merge this one pls?

Since you have the x86 one to do anyway, I'll let you take
this one.
Acked-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 19:35   ` Michael S. Tsirkin
@ 2020-07-31  2:55     ` vit9696 via
  0 siblings, 0 replies; 30+ messages in thread
From: vit9696 via @ 2020-07-31  2:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Philippe Mathieu-Daudé
  Cc: qemu-devel, Eduardo Habkost, Paolo Bonzini, Igor Mammedov,
	Laszlo Ersek, Richard Henderson


[-- Attachment #1.1: Type: text/html, Size: 5320 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 489 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2020-07-30 19:34 ` Laszlo Ersek
@ 2020-07-31  9:30 ` Igor Mammedov
  2021-02-27 19:41 ` Thomas Lamprecht
  4 siblings, 0 replies; 30+ messages in thread
From: Igor Mammedov @ 2020-07-31  9:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, qemu-devel, Paolo Bonzini, vit9696,
	Laszlo Ersek, Richard Henderson

On Thu, 30 Jul 2020 11:58:38 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
                     ` (2 preceding siblings ...)
  2020-07-30 20:33   ` Peter Maydell
@ 2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 30+ messages in thread
From: Igor Mammedov @ 2020-07-31  9:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Maydell, vit9696, qemu-devel, Eduardo Habkost,
	Shannon Zhao, qemu-arm, Paolo Bonzini, Laszlo Ersek,
	Richard Henderson

On Thu, 30 Jul 2020 11:58:41 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2020-07-31  9:30 ` Igor Mammedov
@ 2021-02-27 19:41 ` Thomas Lamprecht
  2021-02-28  9:11   ` vit9696
  2021-02-28 20:43   ` Michael S. Tsirkin
  4 siblings, 2 replies; 30+ messages in thread
From: Thomas Lamprecht @ 2021-02-27 19:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: vit9696, Stefan Reiter, Eduardo Habkost, Igor Mammedov,
	Paolo Bonzini, Laszlo Ersek, Richard Henderson

On 30.07.20 17:58, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 

This "breaks" Windows guests created/installed before this change in the sense
of Windows gets confused and declares that most of the devices changed and thus
it has new entries for them in the device manager where settings of the old one
do not apply anymore.

We were made aware of this by our users when making QEMU 5.2.0 available on
a more used repository of us. Users complained that their static network
configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
to use DHCP (which was not available in their environments) and thus their Windows
VMs had no network connectivity at all anymore.

It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
must be installed with, from reading the patch I have to believe it must be before
that, but we got mixed reports and a colleague could not replicate it from upgrade
of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
people seeing different results and brushing this off.

So here's my personal reproducer, as said, I think that one should be able to just
use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
issue, but YMMV.

Note. I always used the exact same QEMU command (see below) for installation,
reproducing and bisect.

1. Installed Windows 2016 1616 VM using QEMU 3.0.1
   - VirtIO net/scsi driver from VirtIO win 190
2. Setup static network in the VM and shutdown
3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead

Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
Display Adapter, CDROM device, ..., and the Network device.

The first difference I could find was the "Device instance path" one can find in
the "Details" tab of the devices' "Properties" window.

# old, from initial installation on QEMU 3.0.1
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90

# new, from boot with QEMU 5.2
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90

They match until almost the end, not sure how important that is, but it caught my
eye (I'm really no windows guy since a decade so please excuse my terrible
debugging/exploring skills there. The rest of those properties looked pretty
much identical.

I then started a bisect, always just restarting the guest with the new QEMU build
and checking "Device Manager" and network settings to see if good/bad. That worked
pretty well and I came to this commit. See the bisect log attached at the end of
this mail.

So, from reading the commit message I figure that this change is wanted, what are
the implications of just reverting it? (which works out in bringing back the
old state in Windows + working static network config again).

Or any other way/idea to address this in a sane way so that those picky Windows
guests can be handled more graciously?

I guess also that there could be more subtle effects from this patch here, the
network one may have just had quite visible effects to pop up as first issue...

Thanks if you read so far!

cheers,
Thomas


= QEMU Command =

(This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
easier manual running it)

./qemu-system-x86_64 \
  -name win2016 \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
  -mon 'chardev=qmp,mode=control' \
  -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
  -smp '2,sockets=1,cores=2,maxcpus=2' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000' \
  -vnc unix:/var/run/qemu-server/11765.vnc,password \
  -no-hpet \
  -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
  -m 2048 \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
  -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
  -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
  -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
  -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'type=pc' \
  -global 'kvm-pit.lost_tick_policy=discard'


= bisect log =

git bisect start
# bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
# good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
# bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
# bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
# bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
# good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
# good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
git bisect good df82aa7fe10e46b675678977999d49bd586538f8
# good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
# good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
# good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
# good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
# good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
# bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
# good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
# first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-27 19:41 ` Thomas Lamprecht
@ 2021-02-28  9:11   ` vit9696
  2021-02-28 10:43     ` Thomas Lamprecht
  2021-02-28 20:43   ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: vit9696 @ 2021-02-28  9:11 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Michael S. Tsirkin, qemu devel list, Eduardo Habkost,
	Paolo Bonzini, Igor Mammedov, Laszlo Ersek, Richard Henderson,
	Stefan Reiter


[-- Attachment #1.1: Type: text/plain, Size: 12789 bytes --]

Hi Thomas,

For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.

In my opinion, the most logical workaround is to provide in-guest steps to update VM configuration to account for this.

Best regards,
Vitaly

> 27 февр. 2021 г., в 22:41, Thomas Lamprecht <t.lamprecht@proxmox.com> написал(а):
> 
> 
> On 30.07.20 17:58, Michael S. Tsirkin wrote:
>> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
>> while OVMF firmware gets them via an internal channel through QEMU.
>> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
>> different values, and this makes the underlying operating system
>> unable to report its boot option.
>> 
>> The particular node in question is the primary PciRoot (PCI0 in ACPI),
>> which for some reason gets assigned 1 in ACPI UID and 0 in the
>> DevicePath. This is due to the _UID assigned to it by build_dsdt in
>> hw/i386/acpi-build.c Which does not correspond to the primary PCI
>> identifier given by pcibus_num in hw/pci/pci.c
>> 
>> Reference with the device paths, OVMF startup logs, and ACPI table
>> dumps (SysReport):
>> https://github.com/acidanthera/bugtracker/issues/1050
>> 
>> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
>> the paragraph,
>> 
>>    Root PCI bridges will use the plug and play ID of PNP0A03, This will
>>    be stored in the ACPI Device Path _HID field, or in the Expanded
>>    ACPI Device Path _CID field to match the ACPI name space. The _UID
>>    in the ACPI Device Path structure must match the _UID in the ACPI
>>    name space.
>> 
>> (See especially the last sentence.)
>> 
>> Considering *extra* root bridges / root buses (with bus number > 0),
>> QEMU's ACPI generator actually does the right thing; since QEMU commit
>> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
>> root buses", 2015-06-11).
>> 
>> However, the _UID values for root bridge zero (on both i440fx and q35)
>> have always been "wrong" (from UEFI perspective), going back in QEMU to
>> commit 74523b850189 ("i386: add ACPI table files from seabios",
>> 2013-10-14).
>> 
>> Even in SeaBIOS, these _UID values have always been 1; see commit
>> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
>> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
>> for q35.
>> 
>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>> Tested-by: vit9696 <vit9696@protonmail.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> ---
>> hw/i386/acpi-build.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index b7bcbbbb2a..7a5a8b3521 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>>         dev = aml_device("PCI0");
>>         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>>         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
>> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>>         aml_append(sb_scope, dev);
>>         aml_append(dsdt, sb_scope);
>> 
>> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>>         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>>         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>>         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
>> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>>         aml_append(dev, build_q35_osc_method());
>>         aml_append(sb_scope, dev);
>>         aml_append(dsdt, sb_scope);
>> 
> 
> This "breaks" Windows guests created/installed before this change in the sense
> of Windows gets confused and declares that most of the devices changed and thus
> it has new entries for them in the device manager where settings of the old one
> do not apply anymore.
> 
> We were made aware of this by our users when making QEMU 5.2.0 available on
> a more used repository of us. Users complained that their static network
> configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> to use DHCP (which was not available in their environments) and thus their Windows
> VMs had no network connectivity at all anymore.
> 
> It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> must be installed with, from reading the patch I have to believe it must be before
> that, but we got mixed reports and a colleague could not replicate it from upgrade
> of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> people seeing different results and brushing this off.
> 
> So here's my personal reproducer, as said, I think that one should be able to just
> use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> issue, but YMMV.
> 
> Note. I always used the exact same QEMU command (see below) for installation,
> reproducing and bisect.
> 
> 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
>   - VirtIO net/scsi driver from VirtIO win 190
> 2. Setup static network in the VM and shutdown
> 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> 
> Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> Display Adapter, CDROM device, ..., and the Network device.
> 
> The first difference I could find was the "Device instance path" one can find in
> the "Details" tab of the devices' "Properties" window.
> 
> # old, from initial installation on QEMU 3.0.1
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> 
> # new, from boot with QEMU 5.2
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> 
> They match until almost the end, not sure how important that is, but it caught my
> eye (I'm really no windows guy since a decade so please excuse my terrible
> debugging/exploring skills there. The rest of those properties looked pretty
> much identical.
> 
> I then started a bisect, always just restarting the guest with the new QEMU build
> and checking "Device Manager" and network settings to see if good/bad. That worked
> pretty well and I came to this commit. See the bisect log attached at the end of
> this mail.
> 
> So, from reading the commit message I figure that this change is wanted, what are
> the implications of just reverting it? (which works out in bringing back the
> old state in Windows + working static network config again).
> 
> Or any other way/idea to address this in a sane way so that those picky Windows
> guests can be handled more graciously?
> 
> I guess also that there could be more subtle effects from this patch here, the
> network one may have just had quite visible effects to pop up as first issue...
> 
> Thanks if you read so far!
> 
> cheers,
> Thomas
> 
> 
> = QEMU Command =
> 
> (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> easier manual running it)
> 
> ./qemu-system-x86_64 \
>  -name win2016 \
>  -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
>  -mon 'chardev=qmp,mode=control' \
>  -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
>  -smp '2,sockets=1,cores=2,maxcpus=2' \
>  -nodefaults \
>  -boot 'menu=on,strict=on,reboot-timeout=1000' \
>  -vnc unix:/var/run/qemu-server/11765.vnc,password \
>  -no-hpet \
>  -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
>  -m 2048 \
>  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>  -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
>  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
>  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
>  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
>  -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
>  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
>  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
>  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
>  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
>  -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
>  -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
>  -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
>  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
>  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
>  -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
>  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
>  -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>  -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
>  -rtc 'driftfix=slew,base=localtime' \
>  -machine 'type=pc' \
>  -global 'kvm-pit.lost_tick_policy=discard'
> 
> 
> = bisect log =
> 
> git bisect start
> # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths


[-- Attachment #1.2: Type: text/html, Size: 109291 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28  9:11   ` vit9696
@ 2021-02-28 10:43     ` Thomas Lamprecht
  2021-02-28 20:45       ` Michael S. Tsirkin
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
  0 siblings, 2 replies; 30+ messages in thread
From: Thomas Lamprecht @ 2021-02-28 10:43 UTC (permalink / raw)
  To: vit9696
  Cc: Eduardo Habkost, Michael S. Tsirkin, Stefan Reiter,
	qemu devel list, Igor Mammedov, Paolo Bonzini, Laszlo Ersek,
	Richard Henderson

Hi Vitaly,

On 28.02.21 10:11, vit9696 wrote:
> For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.

I think you may have misunderstood me a little bit, I did not ask for this to
be reverted in upstream QEMU, it's quite clear to me that this should be the
new default behaviour and should have been since ever.

Albeit, I must ask what makes macOS special to not be allowed doing things that
Windows and Linux guest can do just fine?

I mainly asked for other drawbacks of such a revert as it is currently the
straight forward stop gap solution for us as downstream. What we probably will
do, is keeping this as default to the new standard behavior and adding a switch
to revert to the old one - our QEMU integration library in Proxmox VE can then
set this for old VMs and use the new standard for new ones on VM start, that
way we keep backward compatible - as only Windows VMs seems to be affected we
can even do this only for those (we have a OS type config property from which
we can derive this).

>
> In my opinion, the most logical workaround is to provide in-guest steps to update VM configuration to account for this.

Often the Hypervisor admin and Guest admin are not the same, so this is only
a small band-aid and for most helping only after the fact.

We also have quite easy to setup clustering so this means that such affected
VMs will seemingly break on migration to an update node for lots of users - for
us an unacceptable situation to expose our users with and honestly, I have a
hard time seeing me and colleagues to wish spending our nerves to direct
hundreds of reports to the documented solution (some will certainly find it on
their own, but whatever one does, lots won't) and dealing with their,
relatable, fit they'll throw and me having to hold back telling them off to
just use Linux instead ;-)

And I think that other integrator will get some reports too, and FWICT there's
no outside way an user can use to revert to the old behavior.
Note that QEMU 5.2 is not yet released in some major distributions, e.g.,
Debian will ship it with Bullseye which release is still months away, latest
Fedora (33) is shipping QEMU 5.1, so RHEL/CentOS are probably using something
even older and Ubuntu will only add it in 21.04, also two months away.

Currently, QEMU 5.2 which introduces this change, is only released in some is
released in faster moving targets, where Windows VMs are more often for
non-server workloads (educated guess) which again correlates with higher
probability to use of DHCP and not static address assignment (again, educated
guess) - which is the most obvious and noticeable thing we and our users saw
break.

Which brings me again to my other point, there may be lots of other things
breaking in a more subtle way, we do not know but can tell there's lots of
device reshuffling going on when checking out the Windows Device Manager I
cannot immagine that the loss of network configuration is the only thing that
breaks is the only thing that breaks.

So why all this fuss and wall of text? Because I think that this will affect
lots of users, most of them in distros which will only ship the problematic
QEMU version later this year. How many affected there will be: no idea, but we
got quite some reports (compared to usual small stuff breakage) with only
rolling this QEMU version out *partially*, to only some parts of our user base.

That's why I personally think it may be worth to think about adding a switch to
QEMU directly to keep the backwards compatible, albeit standard incompatible
behavior either as opt-in or opt-out to new standard-conform behavior. And
while I thought opt-out is the way to go when starting out this message, I now
rather think opt-in to new is, at least if rustling bells of users with
Windows + static IPs is thought to be worth to avoid. As said, if there's quorum
against this, we can live fine with keeping that switch as downstream patch but
I'd like to avoid that and certainly won't just rush forward shipping it but
wait until next week, maybe there are some other opinions or better ideas.

cheers,
Thomas



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-27 19:41 ` Thomas Lamprecht
  2021-02-28  9:11   ` vit9696
@ 2021-02-28 20:43   ` Michael S. Tsirkin
  2021-03-01  7:12     ` Thomas Lamprecht
  2021-03-01 13:28     ` Igor Mammedov
  1 sibling, 2 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 20:43 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> On 30.07.20 17:58, Michael S. Tsirkin wrote:
> > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > while OVMF firmware gets them via an internal channel through QEMU.
> > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > different values, and this makes the underlying operating system
> > unable to report its boot option.
> > 
> > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > identifier given by pcibus_num in hw/pci/pci.c
> > 
> > Reference with the device paths, OVMF startup logs, and ACPI table
> > dumps (SysReport):
> > https://github.com/acidanthera/bugtracker/issues/1050
> > 
> > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > the paragraph,
> > 
> >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> >     be stored in the ACPI Device Path _HID field, or in the Expanded
> >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> >     in the ACPI Device Path structure must match the _UID in the ACPI
> >     name space.
> > 
> > (See especially the last sentence.)
> > 
> > Considering *extra* root bridges / root buses (with bus number > 0),
> > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > root buses", 2015-06-11).
> > 
> > However, the _UID values for root bridge zero (on both i440fx and q35)
> > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > 2013-10-14).
> > 
> > Even in SeaBIOS, these _UID values have always been 1; see commit
> > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > for q35.
> > 
> > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > Tested-by: vit9696 <vit9696@protonmail.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/acpi-build.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index b7bcbbbb2a..7a5a8b3521 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          dev = aml_device("PCI0");
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >  
> > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(dev, build_q35_osc_method());
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> > 
> 
> This "breaks" Windows guests created/installed before this change in the sense
> of Windows gets confused and declares that most of the devices changed and thus
> it has new entries for them in the device manager where settings of the old one
> do not apply anymore.
> 
> We were made aware of this by our users when making QEMU 5.2.0 available on
> a more used repository of us. Users complained that their static network
> configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> to use DHCP (which was not available in their environments) and thus their Windows
> VMs had no network connectivity at all anymore.
> 
> It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> must be installed with, from reading the patch I have to believe it must be before
> that, but we got mixed reports and a colleague could not replicate it from upgrade
> of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> people seeing different results and brushing this off.
> 
> So here's my personal reproducer, as said, I think that one should be able to just
> use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> issue, but YMMV.
> 
> Note. I always used the exact same QEMU command (see below) for installation,
> reproducing and bisect.
> 
> 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
>    - VirtIO net/scsi driver from VirtIO win 190
> 2. Setup static network in the VM and shutdown
> 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> 
> Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> Display Adapter, CDROM device, ..., and the Network device.
> 
> The first difference I could find was the "Device instance path" one can find in
> the "Details" tab of the devices' "Properties" window.
> 
> # old, from initial installation on QEMU 3.0.1
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> 
> # new, from boot with QEMU 5.2
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> 
> They match until almost the end, not sure how important that is, but it caught my
> eye (I'm really no windows guy since a decade so please excuse my terrible
> debugging/exploring skills there. The rest of those properties looked pretty
> much identical.
> 
> I then started a bisect, always just restarting the guest with the new QEMU build
> and checking "Device Manager" and network settings to see if good/bad. That worked
> pretty well and I came to this commit. See the bisect log attached at the end of
> this mail.
> 
> So, from reading the commit message I figure that this change is wanted, what are
> the implications of just reverting it? (which works out in bringing back the
> old state in Windows + working static network config again).
> 
> Or any other way/idea to address this in a sane way so that those picky Windows
> guests can be handled more graciously?

Sure. The way to do that is to tie old behaviour to old machine
versions. We'll need it in stable too ...

Igor want to cook up a patch?


> I guess also that there could be more subtle effects from this patch here, the
> network one may have just had quite visible effects to pop up as first issue...
> 
> Thanks if you read so far!
> 
> cheers,
> Thomas
> 
> 
> = QEMU Command =
> 
> (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> easier manual running it)
> 
> ./qemu-system-x86_64 \
>   -name win2016 \
>   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
>   -mon 'chardev=qmp,mode=control' \
>   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
>   -smp '2,sockets=1,cores=2,maxcpus=2' \
>   -nodefaults \
>   -boot 'menu=on,strict=on,reboot-timeout=1000' \
>   -vnc unix:/var/run/qemu-server/11765.vnc,password \
>   -no-hpet \
>   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
>   -m 2048 \
>   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
>   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
>   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
>   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
>   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
>   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
>   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
>   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
>   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
>   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
>   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
>   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
>   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
>   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
>   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
>   -rtc 'driftfix=slew,base=localtime' \
>   -machine 'type=pc' \
>   -global 'kvm-pit.lost_tick_policy=discard'
> 
> 
> = bisect log =
> 
> git bisect start
> # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 10:43     ` Thomas Lamprecht
@ 2021-02-28 20:45       ` Michael S. Tsirkin
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
  1 sibling, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 20:45 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Stefan Reiter, qemu devel list, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On Sun, Feb 28, 2021 at 11:43:55AM +0100, Thomas Lamprecht wrote:
> Hi Vitaly,
> 
> On 28.02.21 10:11, vit9696 wrote:
> > For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.
> 
> I think you may have misunderstood me a little bit, I did not ask for this to
> be reverted in upstream QEMU, it's quite clear to me that this should be the
> new default behaviour and should have been since ever.

We do make an effort to avoid guest visible changes within machine
version though. this is what we should do here I think.

-- 
MST



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
@ 2021-02-28 21:37         ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 21:37 UTC (permalink / raw)
  To: vit9696
  Cc: Eduardo Habkost, Stefan Reiter, qemu devel list, Igor Mammedov,
	Paolo Bonzini, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Sun, Feb 28, 2021 at 09:28:26PM +0000, vit9696 wrote:
> Thomas, macOS is not really "special" here, it is rather that you will not
> frequently use boot options in a VM. One of the most popular uses for boot
> options is to switch between the operating systems, but for virtual machines
> it is rarely the case. However, macOS does indeed use boot options for itself.
> One example is to install updates. As long as the created boot option is not
> valid an automated reboot during the update installation may result in the
> wrong bootloader being chosen or in a stall within the firmware UI awaiting
> manual boot option selection.
> 
> Michael, does your suggestion mean that the default approach will be to keep
> the new behaviour, but if you manually specify an older q35 machine version it
> will provide the original behaviour. If so, it seems fair to me.
> 
> Best regards,
> Vitaly

Exactly. Vitaly, could you cook up a patch like this?

-- 
MST



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 20:43   ` Michael S. Tsirkin
@ 2021-03-01  7:12     ` Thomas Lamprecht
  2021-03-01  7:20       ` Michael S. Tsirkin
  2021-03-01 13:28     ` Igor Mammedov
  1 sibling, 1 reply; 30+ messages in thread
From: Thomas Lamprecht @ 2021-03-01  7:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On 28.02.21 21:43, Michael S. Tsirkin wrote:
> Sure. The way to do that is to tie old behaviour to old machine
> versions. We'll need it in stable too ...

Yeah, using machine types is how its meant to be with solving migration
breakage, sure.
But that means we have to permanently pin the VM, and any backup restored from
that to that machine type *forever*. That'd be new for us as we always could
allow a newer machine type for a fresh start (i.e., non migration or the like)
here, and mean that lots of other improvements guarded by a newer machine type
for those VMs will.

Why not a switch + machine type, solves migration and any special cases of it
but also allows machine updates but also to keep the old behavior?

And yeah, stable is wanted, but extrapolating from the current stable releases
frequency, where normally there's maximal one after 5-6 months from the .0
release, means that this will probably still hit all those distributions I
mentioned or is there something more soon planned?

Also, is there any regression testing infrastructure around to avoid such
changes in the future? This change got undetected for 7 months, which can be
pretty the norm for QEMU releases, so some earlier safety net would be good? Is
there anything which dumps various default machine HW layouts and uses them for
an ABI check of some sorts?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:12     ` Thomas Lamprecht
@ 2021-03-01  7:20       ` Michael S. Tsirkin
  2021-03-01  7:45         ` Thomas Lamprecht
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01  7:20 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
> On 28.02.21 21:43, Michael S. Tsirkin wrote:
> > Sure. The way to do that is to tie old behaviour to old machine
> > versions. We'll need it in stable too ...
> 
> Yeah, using machine types is how its meant to be with solving migration
> breakage, sure.
> But that means we have to permanently pin the VM, and any backup restored from
> that to that machine type *forever*. That'd be new for us as we always could
> allow a newer machine type for a fresh start (i.e., non migration or the like)
> here, and mean that lots of other improvements guarded by a newer machine type
> for those VMs will.

If you don't do that, that is a bug as any virtual hardware
can change across machine types.

> Why not a switch + machine type, solves migration and any special cases of it
> but also allows machine updates but also to keep the old behavior?

I am not really sure what you mean here, sound like a new feature -
at a guess it will take a while to formulate and is unlikely
to be backported to stable and so help with historical
releases.

> And yeah, stable is wanted, but extrapolating from the current stable releases
> frequency, where normally there's maximal one after 5-6 months from the .0
> release, means that this will probably still hit all those distributions I
> mentioned or is there something more soon planned?
> 
> Also, is there any regression testing infrastructure around to avoid such
> changes in the future? This change got undetected for 7 months, which can be
> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> there anything which dumps various default machine HW layouts and uses them for
> an ABI check of some sorts?

There are various testing efforts the reason this got undetected is
because it does not affect linux guests, and even for windows
they kind of recover, there's just some boot slowdown around reconfiguration.
Not easy to detect automatically given windows has lots of random
downtime during boot around updates etc etc.

-- 
MST



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:20       ` Michael S. Tsirkin
@ 2021-03-01  7:45         ` Thomas Lamprecht
  2021-03-01 14:20           ` Igor Mammedov
  2021-03-01 15:31           ` Michael S. Tsirkin
  0 siblings, 2 replies; 30+ messages in thread
From: Thomas Lamprecht @ 2021-03-01  7:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On 01.03.21 08:20, Michael S. Tsirkin wrote:
> On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
>> On 28.02.21 21:43, Michael S. Tsirkin wrote:
>>> Sure. The way to do that is to tie old behaviour to old machine
>>> versions. We'll need it in stable too ...
>>
>> Yeah, using machine types is how its meant to be with solving migration
>> breakage, sure.
>> But that means we have to permanently pin the VM, and any backup restored from
>> that to that machine type *forever*. That'd be new for us as we always could
>> allow a newer machine type for a fresh start (i.e., non migration or the like)
>> here, and mean that lots of other improvements guarded by a newer machine type
>> for those VMs will.
> 
> If you don't do that, that is a bug as any virtual hardware
> can change across machine types.

For us a feature, for fresh starts one gets the current virtual HW but for
live migration or our live snapshot code it stays compatible. Works quite
well here for many years, as we can simply test the HW changes on existing
VMs - which failed here due to lack of static IPs in the test bed. So yes,
it has its problems as it is not really  what an OS considers as HW change
so big that it makes it a new device, mostly Windows is a PITA here as seen
in this issue.

I mean, QEMU deprecates very old machines at some point anyway, so even then
it is impossible to keep to the old machine forever, but otoh redoing some
changes after a decade or two can be fine, I guess?

> 
>> And yeah, stable is wanted, but extrapolating from the current stable releases
>> frequency, where normally there's maximal one after 5-6 months from the .0
>> release, means that this will probably still hit all those distributions I
>> mentioned or is there something more soon planned?
>>
>> Also, is there any regression testing infrastructure around to avoid such
>> changes in the future? This change got undetected for 7 months, which can be
>> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
>> there anything which dumps various default machine HW layouts and uses them for
>> an ABI check of some sorts?
> 
> There are various testing efforts the reason this got undetected is
> because it does not affect linux guests, and even for windows
> they kind of recover, there's just some boot slowdown around reconfiguration.
> Not easy to detect automatically given windows has lots of random
> downtime during boot around updates etc etc.
> 

No, Windows does not reconfigure, this is a permanent change, one is just lucky
if one has a DHCP server around in the network accessible for the guest.
As static addresses setup on that virtual NIC before that config is gone,
no recovery whatsoever until manual intervention.

I meant more of a "dump HW layout to .txt file, commit to git, and ensure
there's no diff without and machine version bump" (very boiled down), e.g., like
ABI checks for kernel builds are often done by distros - albeit those are easier
as its quite clear what and how the kernel ABI can be used.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 20:43   ` Michael S. Tsirkin
  2021-03-01  7:12     ` Thomas Lamprecht
@ 2021-03-01 13:28     ` Igor Mammedov
  2021-03-01 16:14       ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: Igor Mammedov @ 2021-03-01 13:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Sun, 28 Feb 2021 15:43:40 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> > On 30.07.20 17:58, Michael S. Tsirkin wrote:  
> > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > > while OVMF firmware gets them via an internal channel through QEMU.
> > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > different values, and this makes the underlying operating system
> > > unable to report its boot option.
> > > 
> > > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > > identifier given by pcibus_num in hw/pci/pci.c
> > > 
> > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > dumps (SysReport):
> > > https://github.com/acidanthera/bugtracker/issues/1050
> > > 
> > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > > the paragraph,
> > > 
> > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > >     name space.
> > > 
> > > (See especially the last sentence.)
> > > 
> > > Considering *extra* root bridges / root buses (with bus number > 0),
> > > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > > root buses", 2015-06-11).
> > > 
> > > However, the _UID values for root bridge zero (on both i440fx and q35)
> > > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > > 2013-10-14).
> > > 
> > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > > for q35.
> > > 
> > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > >  hw/i386/acpi-build.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index b7bcbbbb2a..7a5a8b3521 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >          dev = aml_device("PCI0");
> > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > >          aml_append(sb_scope, dev);
> > >          aml_append(dsdt, sb_scope);
> > >  
> > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> > >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > >          aml_append(dev, build_q35_osc_method());
> > >          aml_append(sb_scope, dev);
> > >          aml_append(dsdt, sb_scope);
> > >   
> > 
> > This "breaks" Windows guests created/installed before this change in the sense
> > of Windows gets confused and declares that most of the devices changed and thus
> > it has new entries for them in the device manager where settings of the old one
> > do not apply anymore.
> > 
> > We were made aware of this by our users when making QEMU 5.2.0 available on
> > a more used repository of us. Users complained that their static network
> > configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> > to use DHCP (which was not available in their environments) and thus their Windows
> > VMs had no network connectivity at all anymore.
> > 
> > It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> > must be installed with, from reading the patch I have to believe it must be before
> > that, but we got mixed reports and a colleague could not replicate it from upgrade
> > of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> > people seeing different results and brushing this off.
> > 
> > So here's my personal reproducer, as said, I think that one should be able to just
> > use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> > issue, but YMMV.
> > 
> > Note. I always used the exact same QEMU command (see below) for installation,
> > reproducing and bisect.
> > 
> > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> >    - VirtIO net/scsi driver from VirtIO win 190
> > 2. Setup static network in the VM and shutdown
> > 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> > 
> > Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> > me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> > Display Adapter, CDROM device, ..., and the Network device.
> > 
> > The first difference I could find was the "Device instance path" one can find in
> > the "Details" tab of the devices' "Properties" window.
> > 
> > # old, from initial installation on QEMU 3.0.1
> > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > 
> > # new, from boot with QEMU 5.2
> > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > 
> > They match until almost the end, not sure how important that is, but it caught my
> > eye (I'm really no windows guy since a decade so please excuse my terrible
> > debugging/exploring skills there. The rest of those properties looked pretty
> > much identical.
> > 
> > I then started a bisect, always just restarting the guest with the new QEMU build
> > and checking "Device Manager" and network settings to see if good/bad. That worked
> > pretty well and I came to this commit. See the bisect log attached at the end of
> > this mail.
> > 
> > So, from reading the commit message I figure that this change is wanted, what are
> > the implications of just reverting it? (which works out in bringing back the
> > old state in Windows + working static network config again).
> > 
> > Or any other way/idea to address this in a sane way so that those picky Windows
> > guests can be handled more graciously?  
> 
> Sure. The way to do that is to tie old behaviour to old machine
> versions. We'll need it in stable too ...
> 
> Igor want to cook up a patch?

It might be too late for that,
I mean VMs installed on qemu-5.2 will use new ACPI tables on all
machine types and reverting behavior back for old machine types
will cause the same headache for them.

The difference is that probably there are a lot less new
Windows installations than the old ones (especially with static IP assignment),
so it could be better to restore bug for old machine types to
avoid guest reconfiguration.

How about:
 * buggy ACPI for 5.1 machine types and older
 * fixed ACPI for 5.2 and newer?


> > I guess also that there could be more subtle effects from this patch here, the
> > network one may have just had quite visible effects to pop up as first issue...
> > 
> > Thanks if you read so far!
> > 
> > cheers,
> > Thomas
> > 
> > 
> > = QEMU Command =
> > 
> > (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> > easier manual running it)
> > 
> > ./qemu-system-x86_64 \
> >   -name win2016 \
> >   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> >   -mon 'chardev=qmp,mode=control' \
> >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> >   -nodefaults \
> >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> >   -no-hpet \
> >   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
> >   -m 2048 \
> >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> >   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> >   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
> >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
> >   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
> >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
> >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> >   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
> >   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
> >   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> >   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
> >   -rtc 'driftfix=slew,base=localtime' \
> >   -machine 'type=pc' \
> >   -global 'kvm-pit.lost_tick_policy=discard'
> > 
> > 
> > = bisect log =
> > 
> > git bisect start
> > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> > git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> > git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> > git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> > git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> > git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> > git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> > git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> > git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> > git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> > git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> > git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> > git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> > git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> > git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths  
> 
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:45         ` Thomas Lamprecht
@ 2021-03-01 14:20           ` Igor Mammedov
  2021-03-01 14:27             ` Thomas Lamprecht
  2021-03-01 15:31           ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: Igor Mammedov @ 2021-03-01 14:20 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On Mon, 1 Mar 2021 08:45:53 +0100
Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 01.03.21 08:20, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:  
> >> On 28.02.21 21:43, Michael S. Tsirkin wrote:  
> >>> Sure. The way to do that is to tie old behaviour to old machine
> >>> versions. We'll need it in stable too ...  
> >>
> >> Yeah, using machine types is how its meant to be with solving migration
> >> breakage, sure.
> >> But that means we have to permanently pin the VM, and any backup restored from
> >> that to that machine type *forever*. That'd be new for us as we always could
> >> allow a newer machine type for a fresh start (i.e., non migration or the like)
> >> here, and mean that lots of other improvements guarded by a newer machine type
> >> for those VMs will.  
> > 
> > If you don't do that, that is a bug as any virtual hardware
> > can change across machine types.  
> 
> For us a feature, for fresh starts one gets the current virtual HW but for
> live migration or our live snapshot code it stays compatible. Works quite
> well here for many years, as we can simply test the HW changes on existing
> VMs - which failed here due to lack of static IPs in the test bed. So yes,
> it has its problems as it is not really  what an OS considers as HW change
> so big that it makes it a new device, mostly Windows is a PITA here as seen
> in this issue.
> 
> I mean, QEMU deprecates very old machines at some point anyway, so even then
> it is impossible to keep to the old machine forever, but otoh redoing some
> changes after a decade or two can be fine, I guess?
> 
> >   
> >> And yeah, stable is wanted, but extrapolating from the current stable releases
> >> frequency, where normally there's maximal one after 5-6 months from the .0
> >> release, means that this will probably still hit all those distributions I
> >> mentioned or is there something more soon planned?
> >>
> >> Also, is there any regression testing infrastructure around to avoid such
> >> changes in the future? This change got undetected for 7 months, which can be
> >> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> >> there anything which dumps various default machine HW layouts and uses them for
> >> an ABI check of some sorts?  
> > 
> > There are various testing efforts the reason this got undetected is
> > because it does not affect linux guests, and even for windows
> > they kind of recover, there's just some boot slowdown around reconfiguration.
> > Not easy to detect automatically given windows has lots of random
> > downtime during boot around updates etc etc.
> >   
> 
> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> if one has a DHCP server around in the network accessible for the guest.
> As static addresses setup on that virtual NIC before that config is gone,
> no recovery whatsoever until manual intervention.
Static IP's are the pain guest admin picked up to deal with so he might have to
reconfigure guest OS when it decides to rename NICs. In this case moving
to new QEMU is alike to updating BIOS which fixed PCI description.
(On QEMU side we try to avoid breaking changes, but sometime it happens anyway
and it's up guest admin to fix OS quirks)

> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> there's no diff without and machine version bump" (very boiled down), e.g., like
> ABI checks for kernel builds are often done by distros - albeit those are easier
> as its quite clear what and how the kernel ABI can be used.
ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
generates are firmware and not version-ed (same like we don't tie anything to
specific firmware versions). 

However we rarely do version ACPI changes (only when it breaks something or
we suspect it would break and we can't accept that breakage), this time it took
a lot of time to find out that. We try to minimize such cases as every
versioning knob adds up to maintenance.

For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
unintended changes only.
Technically it's possible to keep master tables for old machine versions
and test against it. But I'm not sure if we should do that, because some
(most) changes are harmless or useful and should apply to all machine
versions.
So we will end up in the same situation, where we decide if a change
should be versioned or not.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 14:20           ` Igor Mammedov
@ 2021-03-01 14:27             ` Thomas Lamprecht
  2021-03-01 20:16               ` Igor Mammedov
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Lamprecht @ 2021-03-01 14:27 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On 01.03.21 15:20, Igor Mammedov wrote:
> On Mon, 1 Mar 2021 08:45:53 +0100
> Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:
>> On 01.03.21 08:20, Michael S. Tsirkin wrote:
>>> There are various testing efforts the reason this got undetected is
>>> because it does not affect linux guests, and even for windows
>>> they kind of recover, there's just some boot slowdown around reconfiguration.
>>> Not easy to detect automatically given windows has lots of random
>>> downtime during boot around updates etc etc.
>>>   
>>
>> No, Windows does not reconfigure, this is a permanent change, one is just lucky
>> if one has a DHCP server around in the network accessible for the guest.
>> As static addresses setup on that virtual NIC before that config is gone,
>> no recovery whatsoever until manual intervention.
> Static IP's are the pain guest admin picked up to deal with so he might have to
> reconfigure guest OS when it decides to rename NICs. In this case moving
> to new QEMU is alike to updating BIOS which fixed PCI description.
> (On QEMU side we try to avoid breaking changes, but sometime it happens anyway
> and it's up guest admin to fix OS quirks)
> 

heh, I agree, but users see it very differently, QEMU got updated, something
stopped working/changed/... -> QEMU at fault.

>> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
>> there's no diff without and machine version bump" (very boiled down), e.g., like
>> ABI checks for kernel builds are often done by distros - albeit those are easier
>> as its quite clear what and how the kernel ABI can be used.
> ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
> generates are firmware and not version-ed (same like we don't tie anything to
> specific firmware versions). 
> 
> However we rarely do version ACPI changes (only when it breaks something or
> we suspect it would break and we can't accept that breakage), this time it took
> a lot of time to find out that. We try to minimize such cases as every
> versioning knob adds up to maintenance.
> 
> For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
> unintended changes only.
> Technically it's possible to keep master tables for old machine versions
> and test against it. But I'm not sure if we should do that, because some
> (most) changes are harmless or useful and should apply to all machine
> versions.
> So we will end up in the same situation, where we decide if a change
> should be versioned or not.
> 
> 

OK, fair enough. Many thanks for providing some rationale!



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:45         ` Thomas Lamprecht
  2021-03-01 14:20           ` Igor Mammedov
@ 2021-03-01 15:31           ` Michael S. Tsirkin
  1 sibling, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01 15:31 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Mon, Mar 01, 2021 at 08:45:53AM +0100, Thomas Lamprecht wrote:
> On 01.03.21 08:20, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
> >> On 28.02.21 21:43, Michael S. Tsirkin wrote:
> >>> Sure. The way to do that is to tie old behaviour to old machine
> >>> versions. We'll need it in stable too ...
> >>
> >> Yeah, using machine types is how its meant to be with solving migration
> >> breakage, sure.
> >> But that means we have to permanently pin the VM, and any backup restored from
> >> that to that machine type *forever*. That'd be new for us as we always could
> >> allow a newer machine type for a fresh start (i.e., non migration or the like)
> >> here, and mean that lots of other improvements guarded by a newer machine type
> >> for those VMs will.
> > 
> > If you don't do that, that is a bug as any virtual hardware
> > can change across machine types.
> 
> For us a feature, for fresh starts one gets the current virtual HW but for
> live migration or our live snapshot code it stays compatible. Works quite
> well here for many years, as we can simply test the HW changes on existing
> VMs - which failed here due to lack of static IPs in the test bed. So yes,
> it has its problems as it is not really  what an OS considers as HW change
> so big that it makes it a new device, mostly Windows is a PITA here as seen
> in this issue.
> 
> I mean, QEMU deprecates very old machines at some point anyway, so even then
> it is impossible to keep to the old machine forever, but otoh redoing some
> changes after a decade or two can be fine, I guess?
> 
> > 
> >> And yeah, stable is wanted, but extrapolating from the current stable releases
> >> frequency, where normally there's maximal one after 5-6 months from the .0
> >> release, means that this will probably still hit all those distributions I
> >> mentioned or is there something more soon planned?
> >>
> >> Also, is there any regression testing infrastructure around to avoid such
> >> changes in the future? This change got undetected for 7 months, which can be
> >> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> >> there anything which dumps various default machine HW layouts and uses them for
> >> an ABI check of some sorts?
> > 
> > There are various testing efforts the reason this got undetected is
> > because it does not affect linux guests, and even for windows
> > they kind of recover, there's just some boot slowdown around reconfiguration.
> > Not easy to detect automatically given windows has lots of random
> > downtime during boot around updates etc etc.
> > 
> 
> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> if one has a DHCP server around in the network accessible for the guest.
> As static addresses setup on that virtual NIC before that config is gone,
> no recovery whatsoever until manual intervention.

Right, it so happened no one tested with a static IP.

> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> there's no diff without and machine version bump" (very boiled down), e.g., like
> ABI checks for kernel builds are often done by distros - albeit those are easier
> as its quite clear what and how the kernel ABI can be used.

Exactly. We have such tests for ACPI which is what changed here.  We
just *do* change ACPI once in a while - it is code after all, and it is
normally fine to change as long as changes are not material.  So what do
we do  when we change it?  All we have right now is inspecting
it manually.

-- 
MST



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 13:28     ` Igor Mammedov
@ 2021-03-01 16:14       ` Michael S. Tsirkin
  2021-03-01 16:28         ` Laszlo Ersek
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01 16:14 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:
> On Sun, 28 Feb 2021 15:43:40 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> > > On 30.07.20 17:58, Michael S. Tsirkin wrote:  
> > > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > > > while OVMF firmware gets them via an internal channel through QEMU.
> > > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > > different values, and this makes the underlying operating system
> > > > unable to report its boot option.
> > > > 
> > > > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > > > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > > > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > > > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > > > identifier given by pcibus_num in hw/pci/pci.c
> > > > 
> > > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > > dumps (SysReport):
> > > > https://github.com/acidanthera/bugtracker/issues/1050
> > > > 
> > > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > > > the paragraph,
> > > > 
> > > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > > >     name space.
> > > > 
> > > > (See especially the last sentence.)
> > > > 
> > > > Considering *extra* root bridges / root buses (with bus number > 0),
> > > > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > > > root buses", 2015-06-11).
> > > > 
> > > > However, the _UID values for root bridge zero (on both i440fx and q35)
> > > > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > > > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > > > 2013-10-14).
> > > > 
> > > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > > > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > > > for q35.
> > > > 
> > > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > ---
> > > >  hw/i386/acpi-build.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index b7bcbbbb2a..7a5a8b3521 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          dev = aml_device("PCI0");
> > > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >  
> > > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> > > >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(dev, build_q35_osc_method());
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >   
> > > 
> > > This "breaks" Windows guests created/installed before this change in the sense
> > > of Windows gets confused and declares that most of the devices changed and thus
> > > it has new entries for them in the device manager where settings of the old one
> > > do not apply anymore.
> > > 
> > > We were made aware of this by our users when making QEMU 5.2.0 available on
> > > a more used repository of us. Users complained that their static network
> > > configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> > > to use DHCP (which was not available in their environments) and thus their Windows
> > > VMs had no network connectivity at all anymore.
> > > 
> > > It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> > > must be installed with, from reading the patch I have to believe it must be before
> > > that, but we got mixed reports and a colleague could not replicate it from upgrade
> > > of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> > > people seeing different results and brushing this off.
> > > 
> > > So here's my personal reproducer, as said, I think that one should be able to just
> > > use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> > > issue, but YMMV.
> > > 
> > > Note. I always used the exact same QEMU command (see below) for installation,
> > > reproducing and bisect.
> > > 
> > > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> > >    - VirtIO net/scsi driver from VirtIO win 190
> > > 2. Setup static network in the VM and shutdown
> > > 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> > > 
> > > Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> > > me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> > > Display Adapter, CDROM device, ..., and the Network device.
> > > 
> > > The first difference I could find was the "Device instance path" one can find in
> > > the "Details" tab of the devices' "Properties" window.
> > > 
> > > # old, from initial installation on QEMU 3.0.1
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > > 
> > > # new, from boot with QEMU 5.2
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > > 
> > > They match until almost the end, not sure how important that is, but it caught my
> > > eye (I'm really no windows guy since a decade so please excuse my terrible
> > > debugging/exploring skills there. The rest of those properties looked pretty
> > > much identical.
> > > 
> > > I then started a bisect, always just restarting the guest with the new QEMU build
> > > and checking "Device Manager" and network settings to see if good/bad. That worked
> > > pretty well and I came to this commit. See the bisect log attached at the end of
> > > this mail.
> > > 
> > > So, from reading the commit message I figure that this change is wanted, what are
> > > the implications of just reverting it? (which works out in bringing back the
> > > old state in Windows + working static network config again).
> > > 
> > > Or any other way/idea to address this in a sane way so that those picky Windows
> > > guests can be handled more graciously?  
> > 
> > Sure. The way to do that is to tie old behaviour to old machine
> > versions. We'll need it in stable too ...
> > 
> > Igor want to cook up a patch?
> 
> It might be too late for that,
> I mean VMs installed on qemu-5.2 will use new ACPI tables on all
> machine types and reverting behavior back for old machine types
> will cause the same headache for them.
> 
> The difference is that probably there are a lot less new
> Windows installations than the old ones (especially with static IP assignment),
> so it could be better to restore bug for old machine types to
> avoid guest reconfiguration.

Yes. And new installations using e.g. libvirt will always use the
latest machine type available.

> How about:
>  * buggy ACPI for 5.1 machine types and older
>  * fixed ACPI for 5.2 and newer?

Exactly.

> 
> > > I guess also that there could be more subtle effects from this patch here, the
> > > network one may have just had quite visible effects to pop up as first issue...
> > > 
> > > Thanks if you read so far!
> > > 
> > > cheers,
> > > Thomas
> > > 
> > > 
> > > = QEMU Command =
> > > 
> > > (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> > > easier manual running it)
> > > 
> > > ./qemu-system-x86_64 \
> > >   -name win2016 \
> > >   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> > >   -mon 'chardev=qmp,mode=control' \
> > >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> > >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> > >   -nodefaults \
> > >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> > >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> > >   -no-hpet \
> > >   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
> > >   -m 2048 \
> > >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> > >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> > >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> > >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> > >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> > >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> > >   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> > >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> > >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> > >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> > >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> > >   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
> > >   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
> > >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> > >   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
> > >   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
> > >   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> > >   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
> > >   -rtc 'driftfix=slew,base=localtime' \
> > >   -machine 'type=pc' \
> > >   -global 'kvm-pit.lost_tick_policy=discard'
> > > 
> > > 
> > > = bisect log =
> > > 
> > > git bisect start
> > > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> > > git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> > > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> > > git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> > > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> > > git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> > > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> > > git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> > > git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> > > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> > > git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> > > git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> > > git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> > > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> > > git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> > > git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> > > git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> > > git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> > > git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> > > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> > > git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths  
> > 
> > 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 16:14       ` Michael S. Tsirkin
@ 2021-03-01 16:28         ` Laszlo Ersek
  2021-03-01 19:08           ` Igor Mammedov
  0 siblings, 1 reply; 30+ messages in thread
From: Laszlo Ersek @ 2021-03-01 16:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Thomas Lamprecht, Richard Henderson

On 03/01/21 17:14, Michael S. Tsirkin wrote:
> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:

>> How about:
>>  * buggy ACPI for 5.1 machine types and older
>>  * fixed ACPI for 5.2 and newer?
> 
> Exactly.

Sounds OK to me as well (even though it's quite unfortunate that this is
one of those exceptions that require us to version the ACPI generator).

Thanks
Laszlo



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 16:28         ` Laszlo Ersek
@ 2021-03-01 19:08           ` Igor Mammedov
  2021-03-01 20:06             ` vit9696
  2021-03-02  8:40             ` Laszlo Ersek
  0 siblings, 2 replies; 30+ messages in thread
From: Igor Mammedov @ 2021-03-01 19:08 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Thomas Lamprecht,
	Eduardo Habkost

On Mon, 1 Mar 2021 17:28:05 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 03/01/21 17:14, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:  
> 
> >> How about:
> >>  * buggy ACPI for 5.1 machine types and older
> >>  * fixed ACPI for 5.2 and newer?  
> > 
> > Exactly.  
> 
> Sounds OK to me as well (even though it's quite unfortunate that this is
> one of those exceptions that require us to version the ACPI generator).
it is unfortunate, and I do resist to such changes usually.
in this case, I fill avoiding complaints/bug reports justifies such exception.


> Thanks
> Laszlo
> 
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 19:08           ` Igor Mammedov
@ 2021-03-01 20:06             ` vit9696
  2021-03-02  8:40             ` Laszlo Ersek
  1 sibling, 0 replies; 30+ messages in thread
From: vit9696 @ 2021-03-01 20:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Thomas Lamprecht
  Cc: Laszlo Ersek, Eduardo Habkost, Stefan Reiter, qemu devel list,
	Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

I provided the initial version of the patch to the mailing list:
[PATCH] i386/acpi: restore device paths for pre-5.1 vms

Unfortunately I do not have easy access to a VM where I can test it at the moment. Please make sure that it works for you and reply with `Tested-by`.

Thanks,
Vitaly

> 1 марта 2021 г., в 22:08, Igor Mammedov <imammedo@redhat.com> написал(а):
> 
> 
> On Mon, 1 Mar 2021 17:28:05 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 03/01/21 17:14, Michael S. Tsirkin wrote:
>>> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:
>> 
>>>> How about:
>>>> * buggy ACPI for 5.1 machine types and older
>>>> * fixed ACPI for 5.2 and newer?
>>> 
>>> Exactly.
>> 
>> Sounds OK to me as well (even though it's quite unfortunate that this is
>> one of those exceptions that require us to version the ACPI generator).
> it is unfortunate, and I do resist to such changes usually.
> in this case, I fill avoiding complaints/bug reports justifies such exception.
> 
> 
>> Thanks
>> Laszlo
>> 
>> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 14:27             ` Thomas Lamprecht
@ 2021-03-01 20:16               ` Igor Mammedov
  0 siblings, 0 replies; 30+ messages in thread
From: Igor Mammedov @ 2021-03-01 20:16 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Eduardo Habkost, Wainer dos Santos Moschetta, Cleber Rosa,
	Paolo Bonzini, Laszlo Ersek, Richard Henderson

On Mon, 1 Mar 2021 15:27:38 +0100
Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 01.03.21 15:20, Igor Mammedov wrote:
> > On Mon, 1 Mar 2021 08:45:53 +0100
> > Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:  
> >> On 01.03.21 08:20, Michael S. Tsirkin wrote:  
> >>> There are various testing efforts the reason this got undetected is
> >>> because it does not affect linux guests, and even for windows
> >>> they kind of recover, there's just some boot slowdown around reconfiguration.
> >>> Not easy to detect automatically given windows has lots of random
> >>> downtime during boot around updates etc etc.
> >>>     
> >>
> >> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> >> if one has a DHCP server around in the network accessible for the guest.
> >> As static addresses setup on that virtual NIC before that config is gone,
> >> no recovery whatsoever until manual intervention.  
> > Static IP's are the pain guest admin picked up to deal with so he might have to
> > reconfigure guest OS when it decides to rename NICs. In this case moving
> > to new QEMU is alike to updating BIOS which fixed PCI description.
> > (On QEMU side we try to avoid breaking changes, but sometime it happens anyway
> > and it's up guest admin to fix OS quirks)
> >   
> 
> heh, I agree, but users see it very differently, QEMU got updated, something
> stopped working/changed/... -> QEMU at fault.
lets try to workaround it as Michael have suggested, i.e. old behavior for old
machine types.

> >> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> >> there's no diff without and machine version bump" (very boiled down), e.g., like
> >> ABI checks for kernel builds are often done by distros - albeit those are easier
> >> as its quite clear what and how the kernel ABI can be used.  
> > ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
> > generates are firmware and not version-ed (same like we don't tie anything to
> > specific firmware versions). 
> > 
> > However we rarely do version ACPI changes (only when it breaks something or
> > we suspect it would break and we can't accept that breakage), this time it took
> > a lot of time to find out that. We try to minimize such cases as every
> > versioning knob adds up to maintenance.
> > 
> > For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
> > unintended changes only.
> > Technically it's possible to keep master tables for old machine versions
> > and test against it. But I'm not sure if we should do that, because some
> > (most) changes are harmless or useful and should apply to all machine
> > versions.
> > So we will end up in the same situation, where we decide if a change
> > should be versioned or not.

If you can come up with a test case for this issue, patches are welcome.

We probably should test PCI device enumeration on Windows,
as it's what's broken. Testing that in qtest (make check) is out of question,
but there are acceptance tests which run various guest OSes and provide tools
to interact with guest. Perhaps adding test there could work.
Though I don't know if there are Windows based guest images available for it.

I guess it's something to discuss with guys who work on CI/testing
infrastructure (CCed).
 
> OK, fair enough. Many thanks for providing some rationale!
> 
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 19:08           ` Igor Mammedov
  2021-03-01 20:06             ` vit9696
@ 2021-03-02  8:40             ` Laszlo Ersek
  1 sibling, 0 replies; 30+ messages in thread
From: Laszlo Ersek @ 2021-03-02  8:40 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Thomas Lamprecht,
	Eduardo Habkost

On 03/01/21 20:08, Igor Mammedov wrote:
> On Mon, 1 Mar 2021 17:28:05 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 03/01/21 17:14, Michael S. Tsirkin wrote:
>>> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:  
>>
>>>> How about:
>>>>  * buggy ACPI for 5.1 machine types and older
>>>>  * fixed ACPI for 5.2 and newer?  
>>>
>>> Exactly.  
>>
>> Sounds OK to me as well (even though it's quite unfortunate that this is
>> one of those exceptions that require us to version the ACPI generator).
> it is unfortunate, and I do resist to such changes usually.
> in this case, I fill avoiding complaints/bug reports justifies such exception.

Right, thanks.
Laszlo



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-03-02  8:42 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
2020-07-30 16:12   ` Philippe Mathieu-Daudé
2020-07-30 19:35   ` Laszlo Ersek
2020-07-30 20:33   ` Peter Maydell
2020-07-31  9:31   ` Igor Mammedov
2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
2020-07-30 19:35   ` Michael S. Tsirkin
2020-07-31  2:55     ` vit9696 via
2020-07-30 19:34 ` Laszlo Ersek
2020-07-31  9:30 ` Igor Mammedov
2021-02-27 19:41 ` Thomas Lamprecht
2021-02-28  9:11   ` vit9696
2021-02-28 10:43     ` Thomas Lamprecht
2021-02-28 20:45       ` Michael S. Tsirkin
     [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
2021-02-28 21:37         ` Michael S. Tsirkin
2021-02-28 20:43   ` Michael S. Tsirkin
2021-03-01  7:12     ` Thomas Lamprecht
2021-03-01  7:20       ` Michael S. Tsirkin
2021-03-01  7:45         ` Thomas Lamprecht
2021-03-01 14:20           ` Igor Mammedov
2021-03-01 14:27             ` Thomas Lamprecht
2021-03-01 20:16               ` Igor Mammedov
2021-03-01 15:31           ` Michael S. Tsirkin
2021-03-01 13:28     ` Igor Mammedov
2021-03-01 16:14       ` Michael S. Tsirkin
2021-03-01 16:28         ` Laszlo Ersek
2021-03-01 19:08           ` Igor Mammedov
2021-03-01 20:06             ` vit9696
2021-03-02  8:40             ` Laszlo Ersek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.