qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
@ 2020-07-30 15:58 Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 15:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: vit9696, Eduardo Habkost, Paolo Bonzini, Igor Mammedov,
	Laszlo Ersek, Richard Henderson

macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
while OVMF firmware gets them via an internal channel through QEMU.
Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
different values, and this makes the underlying operating system
unable to report its boot option.

The particular node in question is the primary PciRoot (PCI0 in ACPI),
which for some reason gets assigned 1 in ACPI UID and 0 in the
DevicePath. This is due to the _UID assigned to it by build_dsdt in
hw/i386/acpi-build.c Which does not correspond to the primary PCI
identifier given by pcibus_num in hw/pci/pci.c

Reference with the device paths, OVMF startup logs, and ACPI table
dumps (SysReport):
https://github.com/acidanthera/bugtracker/issues/1050

In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
the paragraph,

    Root PCI bridges will use the plug and play ID of PNP0A03, This will
    be stored in the ACPI Device Path _HID field, or in the Expanded
    ACPI Device Path _CID field to match the ACPI name space. The _UID
    in the ACPI Device Path structure must match the _UID in the ACPI
    name space.

(See especially the last sentence.)

Considering *extra* root bridges / root buses (with bus number > 0),
QEMU's ACPI generator actually does the right thing; since QEMU commit
c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
root buses", 2015-06-11).

However, the _UID values for root bridge zero (on both i440fx and q35)
have always been "wrong" (from UEFI perspective), going back in QEMU to
commit 74523b850189 ("i386: add ACPI table files from seabios",
2013-10-14).

Even in SeaBIOS, these _UID values have always been 1; see commit
a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
for q35.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: vit9696 <vit9696@protonmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b7bcbbbb2a..7a5a8b3521 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         dev = aml_device("PCI0");
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
 
@@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
+        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(dev, build_q35_osc_method());
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
-- 
MST



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
@ 2020-07-30 15:58 ` Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
                     ` (3 more replies)
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  4 siblings, 4 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 15:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, vit9696, Eduardo Habkost, Shannon Zhao, qemu-arm,
	Paolo Bonzini, Igor Mammedov, Laszlo Ersek, Richard Henderson

On ARM/virt machine type QEMU currently reports an incorrect _UID in
ACPI.

The particular node in question is the primary PciRoot (PCI0 in ACPI),
which gets assigned PCI0 in ACPI UID and 0 in the
DevicePath. This is due to the _UID assigned to it by build_dsdt in
hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
identifier given by pcibus_num in hw/pci/pci.c

In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
the paragraph,

    Root PCI bridges will use the plug and play ID of PNP0A03, This will
    be stored in the ACPI Device Path _HID field, or in the Expanded
    ACPI Device Path _CID field to match the ACPI name space. The _UID
    in the ACPI Device Path structure must match the _UID in the ACPI
    name space.

(See especially the last sentence.)

A similar bug has been reported on i386, on that architecture it has
been reported to confuse at least macOS which uses ACPI UIDs to build
the DevicePath for NVRAM boot options, while OVMF firmware gets them via
an internal channel through QEMU.  When UEFI firmware and ACPI have
different values, this makes the underlying operating system unable to
report its boot option.

Reported-by: vit9696 <vit9696@protonmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Peter can you either ack or merge this one pls?

 hw/arm/virt-acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 91f0df7b13..0a482ff6f7 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
     aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
     aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
     aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
-    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
+    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
     aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
     aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
 
-- 
MST



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
@ 2020-07-30 16:11 ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Michael S. Tsirkin
  2020-07-30 19:34 ` Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-30 16:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: vit9696, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Laszlo Ersek, Richard Henderson

On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>

Vitaly uses his full name on EDK2 mailing list, so I don't think he'll
have a problem to use it in QEMU too:
Tested-by: Vitaly Cheptsov <vit9696@protonmail.com>

From:
https://wiki.qemu.org/Contribute/SubmitAPatch#Patch_emails_must_include_a_Signed-off-by:_line
"Please use your real name to sign a patch (not an alias or acronym)."

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
@ 2020-07-30 16:12   ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Laszlo Ersek
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-07-30 16:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Peter Maydell, vit9696, Shannon Zhao, qemu-arm, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>

Ditto:
Reported-by: Vitaly Cheptsov <vit9696@protonmail.com>

> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
@ 2020-07-30 19:34 ` Laszlo Ersek
  2020-07-31  9:30 ` Igor Mammedov
  2021-02-27 19:41 ` Thomas Lamprecht
  4 siblings, 0 replies; 33+ messages in thread
From: Laszlo Ersek @ 2020-07-30 19:34 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Eduardo Habkost, Paolo Bonzini, Igor Mammedov, vit9696,
	Richard Henderson

On 07/30/20 17:58, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 

with Phil's feedback included:

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thank you very much for writing up the commit message on this patch,
Michael!

Laszlo



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
@ 2020-07-30 19:35   ` Laszlo Ersek
  2020-07-30 20:33   ` Peter Maydell
  2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 33+ messages in thread
From: Laszlo Ersek @ 2020-07-30 19:35 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Shannon Zhao, qemu-arm,
	Paolo Bonzini, Igor Mammedov, vit9696, Richard Henderson

On 07/30/20 17:58, Michael S. Tsirkin wrote:
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  
> 

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

(Vitaly's full name could be included in the Reported-by here as well,
arguably.)

Thanks!
Laszlo



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
@ 2020-07-30 19:35   ` Michael S. Tsirkin
  2020-07-31  2:55     ` vit9696 via
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2020-07-30 19:35 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, qemu-devel, Igor Mammedov, Paolo Bonzini,
	Richard Henderson, Laszlo Ersek, vit9696

On Thu, Jul 30, 2020 at 06:11:17PM +0200, Philippe Mathieu-Daudé wrote:
> On 7/30/20 5:58 PM, Michael S. Tsirkin wrote:
> > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > while OVMF firmware gets them via an internal channel through QEMU.
> > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > different values, and this makes the underlying operating system
> > unable to report its boot option.
> > 
> > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > identifier given by pcibus_num in hw/pci/pci.c
> > 
> > Reference with the device paths, OVMF startup logs, and ACPI table
> > dumps (SysReport):
> > https://github.com/acidanthera/bugtracker/issues/1050
> > 
> > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > the paragraph,
> > 
> >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> >     be stored in the ACPI Device Path _HID field, or in the Expanded
> >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> >     in the ACPI Device Path structure must match the _UID in the ACPI
> >     name space.
> > 
> > (See especially the last sentence.)
> > 
> > Considering *extra* root bridges / root buses (with bus number > 0),
> > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > root buses", 2015-06-11).
> > 
> > However, the _UID values for root bridge zero (on both i440fx and q35)
> > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > 2013-10-14).
> > 
> > Even in SeaBIOS, these _UID values have always been 1; see commit
> > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > for q35.
> > 
> > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > Tested-by: vit9696 <vit9696@protonmail.com>
> 
> Vitaly uses his full name on EDK2 mailing list, so I don't think he'll
> have a problem to use it in QEMU too:
> Tested-by: Vitaly Cheptsov <vit9696@protonmail.com>
> 
> From:
> https://wiki.qemu.org/Contribute/SubmitAPatch#Patch_emails_must_include_a_Signed-off-by:_line
> "Please use your real name to sign a patch (not an alias or acronym)."

Right. Tested-by is different though, I don't think we have
a problem with anonymous testing.

Anyway, updated.

> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/acpi-build.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index b7bcbbbb2a..7a5a8b3521 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          dev = aml_device("PCI0");
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >  
> > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(dev, build_q35_osc_method());
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> > 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
  2020-07-30 16:12   ` Philippe Mathieu-Daudé
  2020-07-30 19:35   ` Laszlo Ersek
@ 2020-07-30 20:33   ` Peter Maydell
  2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2020-07-30 20:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: vit9696, QEMU Developers, Eduardo Habkost, Shannon Zhao,
	qemu-arm, Paolo Bonzini, Igor Mammedov, Laszlo Ersek,
	Richard Henderson

On Thu, 30 Jul 2020 at 16:58, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
>
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
>
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
>
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
>
> (See especially the last sentence.)
>
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
>
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> Peter can you either ack or merge this one pls?

Since you have the x86 one to do anyway, I'll let you take
this one.
Acked-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 19:35   ` Michael S. Tsirkin
@ 2020-07-31  2:55     ` vit9696 via
  0 siblings, 0 replies; 33+ messages in thread
From: vit9696 via @ 2020-07-31  2:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Philippe Mathieu-Daudé
  Cc: qemu-devel, Eduardo Habkost, Paolo Bonzini, Igor Mammedov,
	Laszlo Ersek, Richard Henderson


[-- Attachment #1.1: Type: text/html, Size: 5320 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 489 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2020-07-30 19:34 ` Laszlo Ersek
@ 2020-07-31  9:30 ` Igor Mammedov
  2021-02-27 19:41 ` Thomas Lamprecht
  4 siblings, 0 replies; 33+ messages in thread
From: Igor Mammedov @ 2020-07-31  9:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, qemu-devel, Paolo Bonzini, vit9696,
	Laszlo Ersek, Richard Henderson

On Thu, 30 Jul 2020 11:58:38 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root
  2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
                     ` (2 preceding siblings ...)
  2020-07-30 20:33   ` Peter Maydell
@ 2020-07-31  9:31   ` Igor Mammedov
  3 siblings, 0 replies; 33+ messages in thread
From: Igor Mammedov @ 2020-07-31  9:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Maydell, vit9696, qemu-devel, Eduardo Habkost,
	Shannon Zhao, qemu-arm, Paolo Bonzini, Laszlo Ersek,
	Richard Henderson

On Thu, 30 Jul 2020 11:58:41 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On ARM/virt machine type QEMU currently reports an incorrect _UID in
> ACPI.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which gets assigned PCI0 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/arm/virt-acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> A similar bug has been reported on i386, on that architecture it has
> been reported to confuse at least macOS which uses ACPI UIDs to build
> the DevicePath for NVRAM boot options, while OVMF firmware gets them via
> an internal channel through QEMU.  When UEFI firmware and ACPI have
> different values, this makes the underlying operating system unable to
> report its boot option.
> 
> Reported-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
> 
> Peter can you either ack or merge this one pls?
> 
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 91f0df7b13..0a482ff6f7 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -170,7 +170,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
>      aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>      aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>      aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -    aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
> +    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>      aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>      aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
>  



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2020-07-31  9:30 ` Igor Mammedov
@ 2021-02-27 19:41 ` Thomas Lamprecht
  2021-02-28  9:11   ` vit9696
  2021-02-28 20:43   ` Michael S. Tsirkin
  4 siblings, 2 replies; 33+ messages in thread
From: Thomas Lamprecht @ 2021-02-27 19:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel
  Cc: vit9696, Stefan Reiter, Eduardo Habkost, Igor Mammedov,
	Paolo Bonzini, Laszlo Ersek, Richard Henderson

On 30.07.20 17:58, Michael S. Tsirkin wrote:
> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> while OVMF firmware gets them via an internal channel through QEMU.
> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> different values, and this makes the underlying operating system
> unable to report its boot option.
> 
> The particular node in question is the primary PciRoot (PCI0 in ACPI),
> which for some reason gets assigned 1 in ACPI UID and 0 in the
> DevicePath. This is due to the _UID assigned to it by build_dsdt in
> hw/i386/acpi-build.c Which does not correspond to the primary PCI
> identifier given by pcibus_num in hw/pci/pci.c
> 
> Reference with the device paths, OVMF startup logs, and ACPI table
> dumps (SysReport):
> https://github.com/acidanthera/bugtracker/issues/1050
> 
> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> the paragraph,
> 
>     Root PCI bridges will use the plug and play ID of PNP0A03, This will
>     be stored in the ACPI Device Path _HID field, or in the Expanded
>     ACPI Device Path _CID field to match the ACPI name space. The _UID
>     in the ACPI Device Path structure must match the _UID in the ACPI
>     name space.
> 
> (See especially the last sentence.)
> 
> Considering *extra* root bridges / root buses (with bus number > 0),
> QEMU's ACPI generator actually does the right thing; since QEMU commit
> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> root buses", 2015-06-11).
> 
> However, the _UID values for root bridge zero (on both i440fx and q35)
> have always been "wrong" (from UEFI perspective), going back in QEMU to
> commit 74523b850189 ("i386: add ACPI table files from seabios",
> 2013-10-14).
> 
> Even in SeaBIOS, these _UID values have always been 1; see commit
> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> for q35.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Tested-by: vit9696 <vit9696@protonmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/acpi-build.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bcbbbb2a..7a5a8b3521 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          dev = aml_device("PCI0");
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
> 

This "breaks" Windows guests created/installed before this change in the sense
of Windows gets confused and declares that most of the devices changed and thus
it has new entries for them in the device manager where settings of the old one
do not apply anymore.

We were made aware of this by our users when making QEMU 5.2.0 available on
a more used repository of us. Users complained that their static network
configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
to use DHCP (which was not available in their environments) and thus their Windows
VMs had no network connectivity at all anymore.

It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
must be installed with, from reading the patch I have to believe it must be before
that, but we got mixed reports and a colleague could not replicate it from upgrade
of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
people seeing different results and brushing this off.

So here's my personal reproducer, as said, I think that one should be able to just
use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
issue, but YMMV.

Note. I always used the exact same QEMU command (see below) for installation,
reproducing and bisect.

1. Installed Windows 2016 1616 VM using QEMU 3.0.1
   - VirtIO net/scsi driver from VirtIO win 190
2. Setup static network in the VM and shutdown
3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead

Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
Display Adapter, CDROM device, ..., and the Network device.

The first difference I could find was the "Device instance path" one can find in
the "Details" tab of the devices' "Properties" window.

# old, from initial installation on QEMU 3.0.1
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90

# new, from boot with QEMU 5.2
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90

They match until almost the end, not sure how important that is, but it caught my
eye (I'm really no windows guy since a decade so please excuse my terrible
debugging/exploring skills there. The rest of those properties looked pretty
much identical.

I then started a bisect, always just restarting the guest with the new QEMU build
and checking "Device Manager" and network settings to see if good/bad. That worked
pretty well and I came to this commit. See the bisect log attached at the end of
this mail.

So, from reading the commit message I figure that this change is wanted, what are
the implications of just reverting it? (which works out in bringing back the
old state in Windows + working static network config again).

Or any other way/idea to address this in a sane way so that those picky Windows
guests can be handled more graciously?

I guess also that there could be more subtle effects from this patch here, the
network one may have just had quite visible effects to pop up as first issue...

Thanks if you read so far!

cheers,
Thomas


= QEMU Command =

(This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
easier manual running it)

./qemu-system-x86_64 \
  -name win2016 \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
  -mon 'chardev=qmp,mode=control' \
  -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
  -smp '2,sockets=1,cores=2,maxcpus=2' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000' \
  -vnc unix:/var/run/qemu-server/11765.vnc,password \
  -no-hpet \
  -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
  -m 2048 \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
  -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
  -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
  -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
  -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'type=pc' \
  -global 'kvm-pit.lost_tick_policy=discard'


= bisect log =

git bisect start
# bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
# good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
# bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
# bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
# bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
# good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
# good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
git bisect good df82aa7fe10e46b675678977999d49bd586538f8
# good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
# good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
# good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
# good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
# good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
# bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
# good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
# first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-27 19:41 ` Thomas Lamprecht
@ 2021-02-28  9:11   ` vit9696
  2021-02-28 10:43     ` Thomas Lamprecht
  2021-02-28 20:43   ` Michael S. Tsirkin
  1 sibling, 1 reply; 33+ messages in thread
From: vit9696 @ 2021-02-28  9:11 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Michael S. Tsirkin, qemu devel list, Eduardo Habkost,
	Paolo Bonzini, Igor Mammedov, Laszlo Ersek, Richard Henderson,
	Stefan Reiter


[-- Attachment #1.1: Type: text/plain, Size: 12789 bytes --]

Hi Thomas,

For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.

In my opinion, the most logical workaround is to provide in-guest steps to update VM configuration to account for this.

Best regards,
Vitaly

> 27 февр. 2021 г., в 22:41, Thomas Lamprecht <t.lamprecht@proxmox.com> написал(а):
> 
> 
> On 30.07.20 17:58, Michael S. Tsirkin wrote:
>> macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
>> while OVMF firmware gets them via an internal channel through QEMU.
>> Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
>> different values, and this makes the underlying operating system
>> unable to report its boot option.
>> 
>> The particular node in question is the primary PciRoot (PCI0 in ACPI),
>> which for some reason gets assigned 1 in ACPI UID and 0 in the
>> DevicePath. This is due to the _UID assigned to it by build_dsdt in
>> hw/i386/acpi-build.c Which does not correspond to the primary PCI
>> identifier given by pcibus_num in hw/pci/pci.c
>> 
>> Reference with the device paths, OVMF startup logs, and ACPI table
>> dumps (SysReport):
>> https://github.com/acidanthera/bugtracker/issues/1050
>> 
>> In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
>> the paragraph,
>> 
>>    Root PCI bridges will use the plug and play ID of PNP0A03, This will
>>    be stored in the ACPI Device Path _HID field, or in the Expanded
>>    ACPI Device Path _CID field to match the ACPI name space. The _UID
>>    in the ACPI Device Path structure must match the _UID in the ACPI
>>    name space.
>> 
>> (See especially the last sentence.)
>> 
>> Considering *extra* root bridges / root buses (with bus number > 0),
>> QEMU's ACPI generator actually does the right thing; since QEMU commit
>> c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
>> root buses", 2015-06-11).
>> 
>> However, the _UID values for root bridge zero (on both i440fx and q35)
>> have always been "wrong" (from UEFI perspective), going back in QEMU to
>> commit 74523b850189 ("i386: add ACPI table files from seabios",
>> 2013-10-14).
>> 
>> Even in SeaBIOS, these _UID values have always been 1; see commit
>> a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
>> i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
>> for q35.
>> 
>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>> Tested-by: vit9696 <vit9696@protonmail.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> ---
>> hw/i386/acpi-build.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index b7bcbbbb2a..7a5a8b3521 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>>         dev = aml_device("PCI0");
>>         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>>         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
>> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>>         aml_append(sb_scope, dev);
>>         aml_append(dsdt, sb_scope);
>> 
>> @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>>         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
>>         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>>         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>> -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
>> +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>>         aml_append(dev, build_q35_osc_method());
>>         aml_append(sb_scope, dev);
>>         aml_append(dsdt, sb_scope);
>> 
> 
> This "breaks" Windows guests created/installed before this change in the sense
> of Windows gets confused and declares that most of the devices changed and thus
> it has new entries for them in the device manager where settings of the old one
> do not apply anymore.
> 
> We were made aware of this by our users when making QEMU 5.2.0 available on
> a more used repository of us. Users complained that their static network
> configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> to use DHCP (which was not available in their environments) and thus their Windows
> VMs had no network connectivity at all anymore.
> 
> It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> must be installed with, from reading the patch I have to believe it must be before
> that, but we got mixed reports and a colleague could not replicate it from upgrade
> of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> people seeing different results and brushing this off.
> 
> So here's my personal reproducer, as said, I think that one should be able to just
> use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> issue, but YMMV.
> 
> Note. I always used the exact same QEMU command (see below) for installation,
> reproducing and bisect.
> 
> 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
>   - VirtIO net/scsi driver from VirtIO win 190
> 2. Setup static network in the VM and shutdown
> 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> 
> Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> Display Adapter, CDROM device, ..., and the Network device.
> 
> The first difference I could find was the "Device instance path" one can find in
> the "Details" tab of the devices' "Properties" window.
> 
> # old, from initial installation on QEMU 3.0.1
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> 
> # new, from boot with QEMU 5.2
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> 
> They match until almost the end, not sure how important that is, but it caught my
> eye (I'm really no windows guy since a decade so please excuse my terrible
> debugging/exploring skills there. The rest of those properties looked pretty
> much identical.
> 
> I then started a bisect, always just restarting the guest with the new QEMU build
> and checking "Device Manager" and network settings to see if good/bad. That worked
> pretty well and I came to this commit. See the bisect log attached at the end of
> this mail.
> 
> So, from reading the commit message I figure that this change is wanted, what are
> the implications of just reverting it? (which works out in bringing back the
> old state in Windows + working static network config again).
> 
> Or any other way/idea to address this in a sane way so that those picky Windows
> guests can be handled more graciously?
> 
> I guess also that there could be more subtle effects from this patch here, the
> network one may have just had quite visible effects to pop up as first issue...
> 
> Thanks if you read so far!
> 
> cheers,
> Thomas
> 
> 
> = QEMU Command =
> 
> (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> easier manual running it)
> 
> ./qemu-system-x86_64 \
>  -name win2016 \
>  -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
>  -mon 'chardev=qmp,mode=control' \
>  -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
>  -smp '2,sockets=1,cores=2,maxcpus=2' \
>  -nodefaults \
>  -boot 'menu=on,strict=on,reboot-timeout=1000' \
>  -vnc unix:/var/run/qemu-server/11765.vnc,password \
>  -no-hpet \
>  -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
>  -m 2048 \
>  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>  -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
>  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
>  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
>  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
>  -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
>  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
>  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
>  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
>  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
>  -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
>  -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
>  -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
>  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
>  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
>  -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
>  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
>  -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>  -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
>  -rtc 'driftfix=slew,base=localtime' \
>  -machine 'type=pc' \
>  -global 'kvm-pit.lost_tick_policy=discard'
> 
> 
> = bisect log =
> 
> git bisect start
> # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths


[-- Attachment #1.2: Type: text/html, Size: 109291 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28  9:11   ` vit9696
@ 2021-02-28 10:43     ` Thomas Lamprecht
  2021-02-28 20:45       ` Michael S. Tsirkin
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas Lamprecht @ 2021-02-28 10:43 UTC (permalink / raw)
  To: vit9696
  Cc: Eduardo Habkost, Michael S. Tsirkin, Stefan Reiter,
	qemu devel list, Igor Mammedov, Paolo Bonzini, Laszlo Ersek,
	Richard Henderson

Hi Vitaly,

On 28.02.21 10:11, vit9696 wrote:
> For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.

I think you may have misunderstood me a little bit, I did not ask for this to
be reverted in upstream QEMU, it's quite clear to me that this should be the
new default behaviour and should have been since ever.

Albeit, I must ask what makes macOS special to not be allowed doing things that
Windows and Linux guest can do just fine?

I mainly asked for other drawbacks of such a revert as it is currently the
straight forward stop gap solution for us as downstream. What we probably will
do, is keeping this as default to the new standard behavior and adding a switch
to revert to the old one - our QEMU integration library in Proxmox VE can then
set this for old VMs and use the new standard for new ones on VM start, that
way we keep backward compatible - as only Windows VMs seems to be affected we
can even do this only for those (we have a OS type config property from which
we can derive this).

>
> In my opinion, the most logical workaround is to provide in-guest steps to update VM configuration to account for this.

Often the Hypervisor admin and Guest admin are not the same, so this is only
a small band-aid and for most helping only after the fact.

We also have quite easy to setup clustering so this means that such affected
VMs will seemingly break on migration to an update node for lots of users - for
us an unacceptable situation to expose our users with and honestly, I have a
hard time seeing me and colleagues to wish spending our nerves to direct
hundreds of reports to the documented solution (some will certainly find it on
their own, but whatever one does, lots won't) and dealing with their,
relatable, fit they'll throw and me having to hold back telling them off to
just use Linux instead ;-)

And I think that other integrator will get some reports too, and FWICT there's
no outside way an user can use to revert to the old behavior.
Note that QEMU 5.2 is not yet released in some major distributions, e.g.,
Debian will ship it with Bullseye which release is still months away, latest
Fedora (33) is shipping QEMU 5.1, so RHEL/CentOS are probably using something
even older and Ubuntu will only add it in 21.04, also two months away.

Currently, QEMU 5.2 which introduces this change, is only released in some is
released in faster moving targets, where Windows VMs are more often for
non-server workloads (educated guess) which again correlates with higher
probability to use of DHCP and not static address assignment (again, educated
guess) - which is the most obvious and noticeable thing we and our users saw
break.

Which brings me again to my other point, there may be lots of other things
breaking in a more subtle way, we do not know but can tell there's lots of
device reshuffling going on when checking out the Windows Device Manager I
cannot immagine that the loss of network configuration is the only thing that
breaks is the only thing that breaks.

So why all this fuss and wall of text? Because I think that this will affect
lots of users, most of them in distros which will only ship the problematic
QEMU version later this year. How many affected there will be: no idea, but we
got quite some reports (compared to usual small stuff breakage) with only
rolling this QEMU version out *partially*, to only some parts of our user base.

That's why I personally think it may be worth to think about adding a switch to
QEMU directly to keep the backwards compatible, albeit standard incompatible
behavior either as opt-in or opt-out to new standard-conform behavior. And
while I thought opt-out is the way to go when starting out this message, I now
rather think opt-in to new is, at least if rustling bells of users with
Windows + static IPs is thought to be worth to avoid. As said, if there's quorum
against this, we can live fine with keeping that switch as downstream patch but
I'd like to avoid that and certainly won't just rush forward shipping it but
wait until next week, maybe there are some other opinions or better ideas.

cheers,
Thomas



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-27 19:41 ` Thomas Lamprecht
  2021-02-28  9:11   ` vit9696
@ 2021-02-28 20:43   ` Michael S. Tsirkin
  2021-03-01  7:12     ` Thomas Lamprecht
  2021-03-01 13:28     ` Igor Mammedov
  1 sibling, 2 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 20:43 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> On 30.07.20 17:58, Michael S. Tsirkin wrote:
> > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > while OVMF firmware gets them via an internal channel through QEMU.
> > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > different values, and this makes the underlying operating system
> > unable to report its boot option.
> > 
> > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > identifier given by pcibus_num in hw/pci/pci.c
> > 
> > Reference with the device paths, OVMF startup logs, and ACPI table
> > dumps (SysReport):
> > https://github.com/acidanthera/bugtracker/issues/1050
> > 
> > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > the paragraph,
> > 
> >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> >     be stored in the ACPI Device Path _HID field, or in the Expanded
> >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> >     in the ACPI Device Path structure must match the _UID in the ACPI
> >     name space.
> > 
> > (See especially the last sentence.)
> > 
> > Considering *extra* root bridges / root buses (with bus number > 0),
> > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > root buses", 2015-06-11).
> > 
> > However, the _UID values for root bridge zero (on both i440fx and q35)
> > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > 2013-10-14).
> > 
> > Even in SeaBIOS, these _UID values have always been 1; see commit
> > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > for q35.
> > 
> > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > Tested-by: vit9696 <vit9696@protonmail.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/acpi-build.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index b7bcbbbb2a..7a5a8b3521 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          dev = aml_device("PCI0");
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >  
> > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(dev, build_q35_osc_method());
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> > 
> 
> This "breaks" Windows guests created/installed before this change in the sense
> of Windows gets confused and declares that most of the devices changed and thus
> it has new entries for them in the device manager where settings of the old one
> do not apply anymore.
> 
> We were made aware of this by our users when making QEMU 5.2.0 available on
> a more used repository of us. Users complained that their static network
> configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> to use DHCP (which was not available in their environments) and thus their Windows
> VMs had no network connectivity at all anymore.
> 
> It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> must be installed with, from reading the patch I have to believe it must be before
> that, but we got mixed reports and a colleague could not replicate it from upgrade
> of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> people seeing different results and brushing this off.
> 
> So here's my personal reproducer, as said, I think that one should be able to just
> use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> issue, but YMMV.
> 
> Note. I always used the exact same QEMU command (see below) for installation,
> reproducing and bisect.
> 
> 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
>    - VirtIO net/scsi driver from VirtIO win 190
> 2. Setup static network in the VM and shutdown
> 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> 
> Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> Display Adapter, CDROM device, ..., and the Network device.
> 
> The first difference I could find was the "Device instance path" one can find in
> the "Details" tab of the devices' "Properties" window.
> 
> # old, from initial installation on QEMU 3.0.1
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> 
> # new, from boot with QEMU 5.2
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> 
> They match until almost the end, not sure how important that is, but it caught my
> eye (I'm really no windows guy since a decade so please excuse my terrible
> debugging/exploring skills there. The rest of those properties looked pretty
> much identical.
> 
> I then started a bisect, always just restarting the guest with the new QEMU build
> and checking "Device Manager" and network settings to see if good/bad. That worked
> pretty well and I came to this commit. See the bisect log attached at the end of
> this mail.
> 
> So, from reading the commit message I figure that this change is wanted, what are
> the implications of just reverting it? (which works out in bringing back the
> old state in Windows + working static network config again).
> 
> Or any other way/idea to address this in a sane way so that those picky Windows
> guests can be handled more graciously?

Sure. The way to do that is to tie old behaviour to old machine
versions. We'll need it in stable too ...

Igor want to cook up a patch?


> I guess also that there could be more subtle effects from this patch here, the
> network one may have just had quite visible effects to pop up as first issue...
> 
> Thanks if you read so far!
> 
> cheers,
> Thomas
> 
> 
> = QEMU Command =
> 
> (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> easier manual running it)
> 
> ./qemu-system-x86_64 \
>   -name win2016 \
>   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
>   -mon 'chardev=qmp,mode=control' \
>   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
>   -smp '2,sockets=1,cores=2,maxcpus=2' \
>   -nodefaults \
>   -boot 'menu=on,strict=on,reboot-timeout=1000' \
>   -vnc unix:/var/run/qemu-server/11765.vnc,password \
>   -no-hpet \
>   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
>   -m 2048 \
>   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
>   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
>   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
>   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
>   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
>   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
>   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
>   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
>   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
>   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
>   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
>   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
>   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
>   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
>   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
>   -rtc 'driftfix=slew,base=localtime' \
>   -machine 'type=pc' \
>   -global 'kvm-pit.lost_tick_policy=discard'
> 
> 
> = bisect log =
> 
> git bisect start
> # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 10:43     ` Thomas Lamprecht
@ 2021-02-28 20:45       ` Michael S. Tsirkin
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
  1 sibling, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 20:45 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Stefan Reiter, qemu devel list, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On Sun, Feb 28, 2021 at 11:43:55AM +0100, Thomas Lamprecht wrote:
> Hi Vitaly,
> 
> On 28.02.21 10:11, vit9696 wrote:
> > For us this breaks the ability to control the boot options between the operating system and the OVMF. It happens because the operating system builds the DPs based on ACPI (in fact the only source available to it), while OVMF uses another source. The previous behaviour also violates the specification, so I do not believe there is room for reverting it. I believe it is also not possible to update QEMU to internally use the 1 UID, since it may conflict with the case when there are multiple PCI bus.
> 
> I think you may have misunderstood me a little bit, I did not ask for this to
> be reverted in upstream QEMU, it's quite clear to me that this should be the
> new default behaviour and should have been since ever.

We do make an effort to avoid guest visible changes within machine
version though. this is what we should do here I think.

-- 
MST



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
       [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
@ 2021-02-28 21:37         ` Michael S. Tsirkin
  0 siblings, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-02-28 21:37 UTC (permalink / raw)
  To: vit9696
  Cc: Eduardo Habkost, Stefan Reiter, qemu devel list, Igor Mammedov,
	Paolo Bonzini, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Sun, Feb 28, 2021 at 09:28:26PM +0000, vit9696 wrote:
> Thomas, macOS is not really "special" here, it is rather that you will not
> frequently use boot options in a VM. One of the most popular uses for boot
> options is to switch between the operating systems, but for virtual machines
> it is rarely the case. However, macOS does indeed use boot options for itself.
> One example is to install updates. As long as the created boot option is not
> valid an automated reboot during the update installation may result in the
> wrong bootloader being chosen or in a stall within the firmware UI awaiting
> manual boot option selection.
> 
> Michael, does your suggestion mean that the default approach will be to keep
> the new behaviour, but if you manually specify an older q35 machine version it
> will provide the original behaviour. If so, it seems fair to me.
> 
> Best regards,
> Vitaly

Exactly. Vitaly, could you cook up a patch like this?

-- 
MST



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 20:43   ` Michael S. Tsirkin
@ 2021-03-01  7:12     ` Thomas Lamprecht
  2021-03-01  7:20       ` Michael S. Tsirkin
  2021-03-01 13:28     ` Igor Mammedov
  1 sibling, 1 reply; 33+ messages in thread
From: Thomas Lamprecht @ 2021-03-01  7:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On 28.02.21 21:43, Michael S. Tsirkin wrote:
> Sure. The way to do that is to tie old behaviour to old machine
> versions. We'll need it in stable too ...

Yeah, using machine types is how its meant to be with solving migration
breakage, sure.
But that means we have to permanently pin the VM, and any backup restored from
that to that machine type *forever*. That'd be new for us as we always could
allow a newer machine type for a fresh start (i.e., non migration or the like)
here, and mean that lots of other improvements guarded by a newer machine type
for those VMs will.

Why not a switch + machine type, solves migration and any special cases of it
but also allows machine updates but also to keep the old behavior?

And yeah, stable is wanted, but extrapolating from the current stable releases
frequency, where normally there's maximal one after 5-6 months from the .0
release, means that this will probably still hit all those distributions I
mentioned or is there something more soon planned?

Also, is there any regression testing infrastructure around to avoid such
changes in the future? This change got undetected for 7 months, which can be
pretty the norm for QEMU releases, so some earlier safety net would be good? Is
there anything which dumps various default machine HW layouts and uses them for
an ABI check of some sorts?



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:12     ` Thomas Lamprecht
@ 2021-03-01  7:20       ` Michael S. Tsirkin
  2021-03-01  7:45         ` Thomas Lamprecht
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01  7:20 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
> On 28.02.21 21:43, Michael S. Tsirkin wrote:
> > Sure. The way to do that is to tie old behaviour to old machine
> > versions. We'll need it in stable too ...
> 
> Yeah, using machine types is how its meant to be with solving migration
> breakage, sure.
> But that means we have to permanently pin the VM, and any backup restored from
> that to that machine type *forever*. That'd be new for us as we always could
> allow a newer machine type for a fresh start (i.e., non migration or the like)
> here, and mean that lots of other improvements guarded by a newer machine type
> for those VMs will.

If you don't do that, that is a bug as any virtual hardware
can change across machine types.

> Why not a switch + machine type, solves migration and any special cases of it
> but also allows machine updates but also to keep the old behavior?

I am not really sure what you mean here, sound like a new feature -
at a guess it will take a while to formulate and is unlikely
to be backported to stable and so help with historical
releases.

> And yeah, stable is wanted, but extrapolating from the current stable releases
> frequency, where normally there's maximal one after 5-6 months from the .0
> release, means that this will probably still hit all those distributions I
> mentioned or is there something more soon planned?
> 
> Also, is there any regression testing infrastructure around to avoid such
> changes in the future? This change got undetected for 7 months, which can be
> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> there anything which dumps various default machine HW layouts and uses them for
> an ABI check of some sorts?

There are various testing efforts the reason this got undetected is
because it does not affect linux guests, and even for windows
they kind of recover, there's just some boot slowdown around reconfiguration.
Not easy to detect automatically given windows has lots of random
downtime during boot around updates etc etc.

-- 
MST



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:20       ` Michael S. Tsirkin
@ 2021-03-01  7:45         ` Thomas Lamprecht
  2021-03-01 14:20           ` Igor Mammedov
  2021-03-01 15:31           ` Michael S. Tsirkin
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas Lamprecht @ 2021-03-01  7:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On 01.03.21 08:20, Michael S. Tsirkin wrote:
> On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
>> On 28.02.21 21:43, Michael S. Tsirkin wrote:
>>> Sure. The way to do that is to tie old behaviour to old machine
>>> versions. We'll need it in stable too ...
>>
>> Yeah, using machine types is how its meant to be with solving migration
>> breakage, sure.
>> But that means we have to permanently pin the VM, and any backup restored from
>> that to that machine type *forever*. That'd be new for us as we always could
>> allow a newer machine type for a fresh start (i.e., non migration or the like)
>> here, and mean that lots of other improvements guarded by a newer machine type
>> for those VMs will.
> 
> If you don't do that, that is a bug as any virtual hardware
> can change across machine types.

For us a feature, for fresh starts one gets the current virtual HW but for
live migration or our live snapshot code it stays compatible. Works quite
well here for many years, as we can simply test the HW changes on existing
VMs - which failed here due to lack of static IPs in the test bed. So yes,
it has its problems as it is not really  what an OS considers as HW change
so big that it makes it a new device, mostly Windows is a PITA here as seen
in this issue.

I mean, QEMU deprecates very old machines at some point anyway, so even then
it is impossible to keep to the old machine forever, but otoh redoing some
changes after a decade or two can be fine, I guess?

> 
>> And yeah, stable is wanted, but extrapolating from the current stable releases
>> frequency, where normally there's maximal one after 5-6 months from the .0
>> release, means that this will probably still hit all those distributions I
>> mentioned or is there something more soon planned?
>>
>> Also, is there any regression testing infrastructure around to avoid such
>> changes in the future? This change got undetected for 7 months, which can be
>> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
>> there anything which dumps various default machine HW layouts and uses them for
>> an ABI check of some sorts?
> 
> There are various testing efforts the reason this got undetected is
> because it does not affect linux guests, and even for windows
> they kind of recover, there's just some boot slowdown around reconfiguration.
> Not easy to detect automatically given windows has lots of random
> downtime during boot around updates etc etc.
> 

No, Windows does not reconfigure, this is a permanent change, one is just lucky
if one has a DHCP server around in the network accessible for the guest.
As static addresses setup on that virtual NIC before that config is gone,
no recovery whatsoever until manual intervention.

I meant more of a "dump HW layout to .txt file, commit to git, and ensure
there's no diff without and machine version bump" (very boiled down), e.g., like
ABI checks for kernel builds are often done by distros - albeit those are easier
as its quite clear what and how the kernel ABI can be used.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-02-28 20:43   ` Michael S. Tsirkin
  2021-03-01  7:12     ` Thomas Lamprecht
@ 2021-03-01 13:28     ` Igor Mammedov
  2021-03-01 16:14       ` Michael S. Tsirkin
  1 sibling, 1 reply; 33+ messages in thread
From: Igor Mammedov @ 2021-03-01 13:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Sun, 28 Feb 2021 15:43:40 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> > On 30.07.20 17:58, Michael S. Tsirkin wrote:  
> > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > > while OVMF firmware gets them via an internal channel through QEMU.
> > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > different values, and this makes the underlying operating system
> > > unable to report its boot option.
> > > 
> > > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > > identifier given by pcibus_num in hw/pci/pci.c
> > > 
> > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > dumps (SysReport):
> > > https://github.com/acidanthera/bugtracker/issues/1050
> > > 
> > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > > the paragraph,
> > > 
> > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > >     name space.
> > > 
> > > (See especially the last sentence.)
> > > 
> > > Considering *extra* root bridges / root buses (with bus number > 0),
> > > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > > root buses", 2015-06-11).
> > > 
> > > However, the _UID values for root bridge zero (on both i440fx and q35)
> > > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > > 2013-10-14).
> > > 
> > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > > for q35.
> > > 
> > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > >  hw/i386/acpi-build.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index b7bcbbbb2a..7a5a8b3521 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >          dev = aml_device("PCI0");
> > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > >          aml_append(sb_scope, dev);
> > >          aml_append(dsdt, sb_scope);
> > >  
> > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> > >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > >          aml_append(dev, build_q35_osc_method());
> > >          aml_append(sb_scope, dev);
> > >          aml_append(dsdt, sb_scope);
> > >   
> > 
> > This "breaks" Windows guests created/installed before this change in the sense
> > of Windows gets confused and declares that most of the devices changed and thus
> > it has new entries for them in the device manager where settings of the old one
> > do not apply anymore.
> > 
> > We were made aware of this by our users when making QEMU 5.2.0 available on
> > a more used repository of us. Users complained that their static network
> > configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> > to use DHCP (which was not available in their environments) and thus their Windows
> > VMs had no network connectivity at all anymore.
> > 
> > It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> > must be installed with, from reading the patch I have to believe it must be before
> > that, but we got mixed reports and a colleague could not replicate it from upgrade
> > of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> > people seeing different results and brushing this off.
> > 
> > So here's my personal reproducer, as said, I think that one should be able to just
> > use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> > issue, but YMMV.
> > 
> > Note. I always used the exact same QEMU command (see below) for installation,
> > reproducing and bisect.
> > 
> > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> >    - VirtIO net/scsi driver from VirtIO win 190
> > 2. Setup static network in the VM and shutdown
> > 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> > 
> > Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> > me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> > Display Adapter, CDROM device, ..., and the Network device.
> > 
> > The first difference I could find was the "Device instance path" one can find in
> > the "Details" tab of the devices' "Properties" window.
> > 
> > # old, from initial installation on QEMU 3.0.1
> > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > 
> > # new, from boot with QEMU 5.2
> > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > 
> > They match until almost the end, not sure how important that is, but it caught my
> > eye (I'm really no windows guy since a decade so please excuse my terrible
> > debugging/exploring skills there. The rest of those properties looked pretty
> > much identical.
> > 
> > I then started a bisect, always just restarting the guest with the new QEMU build
> > and checking "Device Manager" and network settings to see if good/bad. That worked
> > pretty well and I came to this commit. See the bisect log attached at the end of
> > this mail.
> > 
> > So, from reading the commit message I figure that this change is wanted, what are
> > the implications of just reverting it? (which works out in bringing back the
> > old state in Windows + working static network config again).
> > 
> > Or any other way/idea to address this in a sane way so that those picky Windows
> > guests can be handled more graciously?  
> 
> Sure. The way to do that is to tie old behaviour to old machine
> versions. We'll need it in stable too ...
> 
> Igor want to cook up a patch?

It might be too late for that,
I mean VMs installed on qemu-5.2 will use new ACPI tables on all
machine types and reverting behavior back for old machine types
will cause the same headache for them.

The difference is that probably there are a lot less new
Windows installations than the old ones (especially with static IP assignment),
so it could be better to restore bug for old machine types to
avoid guest reconfiguration.

How about:
 * buggy ACPI for 5.1 machine types and older
 * fixed ACPI for 5.2 and newer?


> > I guess also that there could be more subtle effects from this patch here, the
> > network one may have just had quite visible effects to pop up as first issue...
> > 
> > Thanks if you read so far!
> > 
> > cheers,
> > Thomas
> > 
> > 
> > = QEMU Command =
> > 
> > (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> > easier manual running it)
> > 
> > ./qemu-system-x86_64 \
> >   -name win2016 \
> >   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> >   -mon 'chardev=qmp,mode=control' \
> >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> >   -nodefaults \
> >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> >   -no-hpet \
> >   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
> >   -m 2048 \
> >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> >   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> >   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
> >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
> >   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
> >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
> >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> >   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
> >   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
> >   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> >   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
> >   -rtc 'driftfix=slew,base=localtime' \
> >   -machine 'type=pc' \
> >   -global 'kvm-pit.lost_tick_policy=discard'
> > 
> > 
> > = bisect log =
> > 
> > git bisect start
> > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> > git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> > git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> > git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> > git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> > git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> > git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> > git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> > git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> > git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> > git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> > git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> > git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> > git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> > git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths  
> 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:45         ` Thomas Lamprecht
@ 2021-03-01 14:20           ` Igor Mammedov
  2021-03-01 14:27             ` Thomas Lamprecht
  2021-03-01 15:31           ` Michael S. Tsirkin
  1 sibling, 1 reply; 33+ messages in thread
From: Igor Mammedov @ 2021-03-01 14:20 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On Mon, 1 Mar 2021 08:45:53 +0100
Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 01.03.21 08:20, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:  
> >> On 28.02.21 21:43, Michael S. Tsirkin wrote:  
> >>> Sure. The way to do that is to tie old behaviour to old machine
> >>> versions. We'll need it in stable too ...  
> >>
> >> Yeah, using machine types is how its meant to be with solving migration
> >> breakage, sure.
> >> But that means we have to permanently pin the VM, and any backup restored from
> >> that to that machine type *forever*. That'd be new for us as we always could
> >> allow a newer machine type for a fresh start (i.e., non migration or the like)
> >> here, and mean that lots of other improvements guarded by a newer machine type
> >> for those VMs will.  
> > 
> > If you don't do that, that is a bug as any virtual hardware
> > can change across machine types.  
> 
> For us a feature, for fresh starts one gets the current virtual HW but for
> live migration or our live snapshot code it stays compatible. Works quite
> well here for many years, as we can simply test the HW changes on existing
> VMs - which failed here due to lack of static IPs in the test bed. So yes,
> it has its problems as it is not really  what an OS considers as HW change
> so big that it makes it a new device, mostly Windows is a PITA here as seen
> in this issue.
> 
> I mean, QEMU deprecates very old machines at some point anyway, so even then
> it is impossible to keep to the old machine forever, but otoh redoing some
> changes after a decade or two can be fine, I guess?
> 
> >   
> >> And yeah, stable is wanted, but extrapolating from the current stable releases
> >> frequency, where normally there's maximal one after 5-6 months from the .0
> >> release, means that this will probably still hit all those distributions I
> >> mentioned or is there something more soon planned?
> >>
> >> Also, is there any regression testing infrastructure around to avoid such
> >> changes in the future? This change got undetected for 7 months, which can be
> >> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> >> there anything which dumps various default machine HW layouts and uses them for
> >> an ABI check of some sorts?  
> > 
> > There are various testing efforts the reason this got undetected is
> > because it does not affect linux guests, and even for windows
> > they kind of recover, there's just some boot slowdown around reconfiguration.
> > Not easy to detect automatically given windows has lots of random
> > downtime during boot around updates etc etc.
> >   
> 
> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> if one has a DHCP server around in the network accessible for the guest.
> As static addresses setup on that virtual NIC before that config is gone,
> no recovery whatsoever until manual intervention.
Static IP's are the pain guest admin picked up to deal with so he might have to
reconfigure guest OS when it decides to rename NICs. In this case moving
to new QEMU is alike to updating BIOS which fixed PCI description.
(On QEMU side we try to avoid breaking changes, but sometime it happens anyway
and it's up guest admin to fix OS quirks)

> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> there's no diff without and machine version bump" (very boiled down), e.g., like
> ABI checks for kernel builds are often done by distros - albeit those are easier
> as its quite clear what and how the kernel ABI can be used.
ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
generates are firmware and not version-ed (same like we don't tie anything to
specific firmware versions). 

However we rarely do version ACPI changes (only when it breaks something or
we suspect it would break and we can't accept that breakage), this time it took
a lot of time to find out that. We try to minimize such cases as every
versioning knob adds up to maintenance.

For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
unintended changes only.
Technically it's possible to keep master tables for old machine versions
and test against it. But I'm not sure if we should do that, because some
(most) changes are harmless or useful and should apply to all machine
versions.
So we will end up in the same situation, where we decide if a change
should be versioned or not.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 14:20           ` Igor Mammedov
@ 2021-03-01 14:27             ` Thomas Lamprecht
  2021-03-01 20:16               ` Igor Mammedov
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Lamprecht @ 2021-03-01 14:27 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, Eduardo Habkost

On 01.03.21 15:20, Igor Mammedov wrote:
> On Mon, 1 Mar 2021 08:45:53 +0100
> Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:
>> On 01.03.21 08:20, Michael S. Tsirkin wrote:
>>> There are various testing efforts the reason this got undetected is
>>> because it does not affect linux guests, and even for windows
>>> they kind of recover, there's just some boot slowdown around reconfiguration.
>>> Not easy to detect automatically given windows has lots of random
>>> downtime during boot around updates etc etc.
>>>   
>>
>> No, Windows does not reconfigure, this is a permanent change, one is just lucky
>> if one has a DHCP server around in the network accessible for the guest.
>> As static addresses setup on that virtual NIC before that config is gone,
>> no recovery whatsoever until manual intervention.
> Static IP's are the pain guest admin picked up to deal with so he might have to
> reconfigure guest OS when it decides to rename NICs. In this case moving
> to new QEMU is alike to updating BIOS which fixed PCI description.
> (On QEMU side we try to avoid breaking changes, but sometime it happens anyway
> and it's up guest admin to fix OS quirks)
> 

heh, I agree, but users see it very differently, QEMU got updated, something
stopped working/changed/... -> QEMU at fault.

>> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
>> there's no diff without and machine version bump" (very boiled down), e.g., like
>> ABI checks for kernel builds are often done by distros - albeit those are easier
>> as its quite clear what and how the kernel ABI can be used.
> ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
> generates are firmware and not version-ed (same like we don't tie anything to
> specific firmware versions). 
> 
> However we rarely do version ACPI changes (only when it breaks something or
> we suspect it would break and we can't accept that breakage), this time it took
> a lot of time to find out that. We try to minimize such cases as every
> versioning knob adds up to maintenance.
> 
> For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
> unintended changes only.
> Technically it's possible to keep master tables for old machine versions
> and test against it. But I'm not sure if we should do that, because some
> (most) changes are harmless or useful and should apply to all machine
> versions.
> So we will end up in the same situation, where we decide if a change
> should be versioned or not.
> 
> 

OK, fair enough. Many thanks for providing some rationale!



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01  7:45         ` Thomas Lamprecht
  2021-03-01 14:20           ` Igor Mammedov
@ 2021-03-01 15:31           ` Michael S. Tsirkin
  1 sibling, 0 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01 15:31 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Igor Mammedov,
	Paolo Bonzini, Richard Henderson, Laszlo Ersek, vit9696

On Mon, Mar 01, 2021 at 08:45:53AM +0100, Thomas Lamprecht wrote:
> On 01.03.21 08:20, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 08:12:35AM +0100, Thomas Lamprecht wrote:
> >> On 28.02.21 21:43, Michael S. Tsirkin wrote:
> >>> Sure. The way to do that is to tie old behaviour to old machine
> >>> versions. We'll need it in stable too ...
> >>
> >> Yeah, using machine types is how its meant to be with solving migration
> >> breakage, sure.
> >> But that means we have to permanently pin the VM, and any backup restored from
> >> that to that machine type *forever*. That'd be new for us as we always could
> >> allow a newer machine type for a fresh start (i.e., non migration or the like)
> >> here, and mean that lots of other improvements guarded by a newer machine type
> >> for those VMs will.
> > 
> > If you don't do that, that is a bug as any virtual hardware
> > can change across machine types.
> 
> For us a feature, for fresh starts one gets the current virtual HW but for
> live migration or our live snapshot code it stays compatible. Works quite
> well here for many years, as we can simply test the HW changes on existing
> VMs - which failed here due to lack of static IPs in the test bed. So yes,
> it has its problems as it is not really  what an OS considers as HW change
> so big that it makes it a new device, mostly Windows is a PITA here as seen
> in this issue.
> 
> I mean, QEMU deprecates very old machines at some point anyway, so even then
> it is impossible to keep to the old machine forever, but otoh redoing some
> changes after a decade or two can be fine, I guess?
> 
> > 
> >> And yeah, stable is wanted, but extrapolating from the current stable releases
> >> frequency, where normally there's maximal one after 5-6 months from the .0
> >> release, means that this will probably still hit all those distributions I
> >> mentioned or is there something more soon planned?
> >>
> >> Also, is there any regression testing infrastructure around to avoid such
> >> changes in the future? This change got undetected for 7 months, which can be
> >> pretty the norm for QEMU releases, so some earlier safety net would be good? Is
> >> there anything which dumps various default machine HW layouts and uses them for
> >> an ABI check of some sorts?
> > 
> > There are various testing efforts the reason this got undetected is
> > because it does not affect linux guests, and even for windows
> > they kind of recover, there's just some boot slowdown around reconfiguration.
> > Not easy to detect automatically given windows has lots of random
> > downtime during boot around updates etc etc.
> > 
> 
> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> if one has a DHCP server around in the network accessible for the guest.
> As static addresses setup on that virtual NIC before that config is gone,
> no recovery whatsoever until manual intervention.

Right, it so happened no one tested with a static IP.

> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> there's no diff without and machine version bump" (very boiled down), e.g., like
> ABI checks for kernel builds are often done by distros - albeit those are easier
> as its quite clear what and how the kernel ABI can be used.

Exactly. We have such tests for ACPI which is what changed here.  We
just *do* change ACPI once in a while - it is code after all, and it is
normally fine to change as long as changes are not material.  So what do
we do  when we change it?  All we have right now is inspecting
it manually.

-- 
MST



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 13:28     ` Igor Mammedov
@ 2021-03-01 16:14       ` Michael S. Tsirkin
  2021-03-01 16:28         ` Laszlo Ersek
  0 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2021-03-01 16:14 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Laszlo Ersek, Thomas Lamprecht, Richard Henderson

On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:
> On Sun, 28 Feb 2021 15:43:40 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Sat, Feb 27, 2021 at 08:41:11PM +0100, Thomas Lamprecht wrote:
> > > On 30.07.20 17:58, Michael S. Tsirkin wrote:  
> > > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > > > while OVMF firmware gets them via an internal channel through QEMU.
> > > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > > different values, and this makes the underlying operating system
> > > > unable to report its boot option.
> > > > 
> > > > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > > > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > > > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > > > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > > > identifier given by pcibus_num in hw/pci/pci.c
> > > > 
> > > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > > dumps (SysReport):
> > > > https://github.com/acidanthera/bugtracker/issues/1050
> > > > 
> > > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > > > the paragraph,
> > > > 
> > > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > > >     name space.
> > > > 
> > > > (See especially the last sentence.)
> > > > 
> > > > Considering *extra* root bridges / root buses (with bus number > 0),
> > > > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > > > root buses", 2015-06-11).
> > > > 
> > > > However, the _UID values for root bridge zero (on both i440fx and q35)
> > > > have always been "wrong" (from UEFI perspective), going back in QEMU to
> > > > commit 74523b850189 ("i386: add ACPI table files from seabios",
> > > > 2013-10-14).
> > > > 
> > > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > > > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > > > for q35.
> > > > 
> > > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > ---
> > > >  hw/i386/acpi-build.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index b7bcbbbb2a..7a5a8b3521 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          dev = aml_device("PCI0");
> > > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >  
> > > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> > > >          aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(dev, build_q35_osc_method());
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >   
> > > 
> > > This "breaks" Windows guests created/installed before this change in the sense
> > > of Windows gets confused and declares that most of the devices changed and thus
> > > it has new entries for them in the device manager where settings of the old one
> > > do not apply anymore.
> > > 
> > > We were made aware of this by our users when making QEMU 5.2.0 available on
> > > a more used repository of us. Users complained that their static network
> > > configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows tried
> > > to use DHCP (which was not available in their environments) and thus their Windows
> > > VMs had no network connectivity at all anymore.
> > > 
> > > It's currently not yet quite 100% clear to me with what QEMU version the Windows VM
> > > must be installed with, from reading the patch I have to believe it must be before
> > > that, but we got mixed reports and a colleague could not replicate it from upgrade
> > > of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just writing this all to avoid
> > > people seeing different results and brushing this off.
> > > 
> > > So here's my personal reproducer, as said, I think that one should be able to just
> > > use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see this
> > > issue, but YMMV.
> > > 
> > > Note. I always used the exact same QEMU command (see below) for installation,
> > > reproducing and bisect.
> > > 
> > > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> > >    - VirtIO net/scsi driver from VirtIO win 190
> > > 2. Setup static network in the VM and shutdown
> > > 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2" adapter shows up instead
> > > 
> > > Starting the  "Device Manager" and enabling "View -> Show hidden devices" showed
> > > me a greyed out device duplicate for basically anything attached, SCSI disk, Basic
> > > Display Adapter, CDROM device, ..., and the Network device.
> > > 
> > > The first difference I could find was the "Device instance path" one can find in
> > > the "Details" tab of the devices' "Properties" window.
> > > 
> > > # old, from initial installation on QEMU 3.0.1
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > > 
> > > # new, from boot with QEMU 5.2
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > > 
> > > They match until almost the end, not sure how important that is, but it caught my
> > > eye (I'm really no windows guy since a decade so please excuse my terrible
> > > debugging/exploring skills there. The rest of those properties looked pretty
> > > much identical.
> > > 
> > > I then started a bisect, always just restarting the guest with the new QEMU build
> > > and checking "Device Manager" and network settings to see if good/bad. That worked
> > > pretty well and I came to this commit. See the bisect log attached at the end of
> > > this mail.
> > > 
> > > So, from reading the commit message I figure that this change is wanted, what are
> > > the implications of just reverting it? (which works out in bringing back the
> > > old state in Windows + working static network config again).
> > > 
> > > Or any other way/idea to address this in a sane way so that those picky Windows
> > > guests can be handled more graciously?  
> > 
> > Sure. The way to do that is to tie old behaviour to old machine
> > versions. We'll need it in stable too ...
> > 
> > Igor want to cook up a patch?
> 
> It might be too late for that,
> I mean VMs installed on qemu-5.2 will use new ACPI tables on all
> machine types and reverting behavior back for old machine types
> will cause the same headache for them.
> 
> The difference is that probably there are a lot less new
> Windows installations than the old ones (especially with static IP assignment),
> so it could be better to restore bug for old machine types to
> avoid guest reconfiguration.

Yes. And new installations using e.g. libvirt will always use the
latest machine type available.

> How about:
>  * buggy ACPI for 5.1 machine types and older
>  * fixed ACPI for 5.2 and newer?

Exactly.

> 
> > > I guess also that there could be more subtle effects from this patch here, the
> > > network one may have just had quite visible effects to pop up as first issue...
> > > 
> > > Thanks if you read so far!
> > > 
> > > cheers,
> > > Thomas
> > > 
> > > 
> > > = QEMU Command =
> > > 
> > > (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> > > easier manual running it)
> > > 
> > > ./qemu-system-x86_64 \
> > >   -name win2016 \
> > >   -chardev 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> > >   -mon 'chardev=qmp,mode=control' \
> > >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> > >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> > >   -nodefaults \
> > >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> > >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> > >   -no-hpet \
> > >   -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,+pcid,+spec-ctrl' \
> > >   -m 2048 \
> > >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> > >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> > >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> > >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> > >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> > >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> > >   -chardev 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> > >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> > >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> > >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> > >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> > >   -drive 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,media=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
> > >   -drive 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-ide2,media=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
> > >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> > >   -drive 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' \
> > >   -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
> > >   -netdev 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> > >   -device 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
> > >   -rtc 'driftfix=slew,base=localtime' \
> > >   -machine 'type=pc' \
> > >   -global 'kvm-pit.lost_tick_policy=discard'
> > > 
> > > 
> > > = bisect log =
> > > 
> > > git bisect start
> > > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0 release
> > > git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> > > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for v5.1.0 release
> > > git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> > > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc markup to introduction doc comment
> > > git bisect bad ed799805d00ccdda45eb8441c7d929624d9e98a6
> > > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging
> > > git bisect bad e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected DSDT files with _UID changes
> > > git bisect bad af1dfe1ec0864e6700237a43cc36018176f9eba9
> > > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging
> > > git bisect good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into staging
> > > git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support the Virtual Instruction fault
> > > git bisect good e39a8320b088dd5efc9ebaafe387e52b3d962665
> > > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
> > > git bisect good a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix invalidation of lines near screen border
> > > git bisect good 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for qcow2 images with extended L2 entries
> > > git bisect good a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging
> > > git bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths
> > > git bisect bad af1b80ae56c9495999e8ccf7b70ef894378de642
> > > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> > > git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent QEMU/OVMF device paths  
> > 
> > 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 16:14       ` Michael S. Tsirkin
@ 2021-03-01 16:28         ` Laszlo Ersek
  2021-03-01 19:08           ` Igor Mammedov
  0 siblings, 1 reply; 33+ messages in thread
From: Laszlo Ersek @ 2021-03-01 16:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov
  Cc: Eduardo Habkost, Stefan Reiter, qemu-devel, Paolo Bonzini,
	vit9696, Thomas Lamprecht, Richard Henderson

On 03/01/21 17:14, Michael S. Tsirkin wrote:
> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:

>> How about:
>>  * buggy ACPI for 5.1 machine types and older
>>  * fixed ACPI for 5.2 and newer?
> 
> Exactly.

Sounds OK to me as well (even though it's quite unfortunate that this is
one of those exceptions that require us to version the ACPI generator).

Thanks
Laszlo



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 16:28         ` Laszlo Ersek
@ 2021-03-01 19:08           ` Igor Mammedov
  2021-03-01 20:06             ` vit9696
  2021-03-02  8:40             ` Laszlo Ersek
  0 siblings, 2 replies; 33+ messages in thread
From: Igor Mammedov @ 2021-03-01 19:08 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Thomas Lamprecht,
	Eduardo Habkost

On Mon, 1 Mar 2021 17:28:05 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 03/01/21 17:14, Michael S. Tsirkin wrote:
> > On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:  
> 
> >> How about:
> >>  * buggy ACPI for 5.1 machine types and older
> >>  * fixed ACPI for 5.2 and newer?  
> > 
> > Exactly.  
> 
> Sounds OK to me as well (even though it's quite unfortunate that this is
> one of those exceptions that require us to version the ACPI generator).
it is unfortunate, and I do resist to such changes usually.
in this case, I fill avoiding complaints/bug reports justifies such exception.


> Thanks
> Laszlo
> 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 19:08           ` Igor Mammedov
@ 2021-03-01 20:06             ` vit9696
  2021-03-02  8:40             ` Laszlo Ersek
  1 sibling, 0 replies; 33+ messages in thread
From: vit9696 @ 2021-03-01 20:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Thomas Lamprecht
  Cc: Laszlo Ersek, Eduardo Habkost, Stefan Reiter, qemu devel list,
	Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

I provided the initial version of the patch to the mailing list:
[PATCH] i386/acpi: restore device paths for pre-5.1 vms

Unfortunately I do not have easy access to a VM where I can test it at the moment. Please make sure that it works for you and reply with `Tested-by`.

Thanks,
Vitaly

> 1 марта 2021 г., в 22:08, Igor Mammedov <imammedo@redhat.com> написал(а):
> 
> 
> On Mon, 1 Mar 2021 17:28:05 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 03/01/21 17:14, Michael S. Tsirkin wrote:
>>> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:
>> 
>>>> How about:
>>>> * buggy ACPI for 5.1 machine types and older
>>>> * fixed ACPI for 5.2 and newer?
>>> 
>>> Exactly.
>> 
>> Sounds OK to me as well (even though it's quite unfortunate that this is
>> one of those exceptions that require us to version the ACPI generator).
> it is unfortunate, and I do resist to such changes usually.
> in this case, I fill avoiding complaints/bug reports justifies such exception.
> 
> 
>> Thanks
>> Laszlo
>> 
>> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 14:27             ` Thomas Lamprecht
@ 2021-03-01 20:16               ` Igor Mammedov
  0 siblings, 0 replies; 33+ messages in thread
From: Igor Mammedov @ 2021-03-01 20:16 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Eduardo Habkost, Wainer dos Santos Moschetta, Cleber Rosa,
	Paolo Bonzini, Laszlo Ersek, Richard Henderson

On Mon, 1 Mar 2021 15:27:38 +0100
Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 01.03.21 15:20, Igor Mammedov wrote:
> > On Mon, 1 Mar 2021 08:45:53 +0100
> > Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:  
> >> On 01.03.21 08:20, Michael S. Tsirkin wrote:  
> >>> There are various testing efforts the reason this got undetected is
> >>> because it does not affect linux guests, and even for windows
> >>> they kind of recover, there's just some boot slowdown around reconfiguration.
> >>> Not easy to detect automatically given windows has lots of random
> >>> downtime during boot around updates etc etc.
> >>>     
> >>
> >> No, Windows does not reconfigure, this is a permanent change, one is just lucky
> >> if one has a DHCP server around in the network accessible for the guest.
> >> As static addresses setup on that virtual NIC before that config is gone,
> >> no recovery whatsoever until manual intervention.  
> > Static IP's are the pain guest admin picked up to deal with so he might have to
> > reconfigure guest OS when it decides to rename NICs. In this case moving
> > to new QEMU is alike to updating BIOS which fixed PCI description.
> > (On QEMU side we try to avoid breaking changes, but sometime it happens anyway
> > and it's up guest admin to fix OS quirks)
> >   
> 
> heh, I agree, but users see it very differently, QEMU got updated, something
> stopped working/changed/... -> QEMU at fault.
lets try to workaround it as Michael have suggested, i.e. old behavior for old
machine types.

> >> I meant more of a "dump HW layout to .txt file, commit to git, and ensure
> >> there's no diff without and machine version bump" (very boiled down), e.g., like
> >> ABI checks for kernel builds are often done by distros - albeit those are easier
> >> as its quite clear what and how the kernel ABI can be used.  
> > ACPI tables are not considered as ABI change in QEMU, technically tables that QEMU
> > generates are firmware and not version-ed (same like we don't tie anything to
> > specific firmware versions). 
> > 
> > However we rarely do version ACPI changes (only when it breaks something or
> > we suspect it would break and we can't accept that breakage), this time it took
> > a lot of time to find out that. We try to minimize such cases as every
> > versioning knob adds up to maintenance.
> > 
> > For ACPI tables changes, QEMU has bios-tables-test, but it lets us to catch
> > unintended changes only.
> > Technically it's possible to keep master tables for old machine versions
> > and test against it. But I'm not sure if we should do that, because some
> > (most) changes are harmless or useful and should apply to all machine
> > versions.
> > So we will end up in the same situation, where we decide if a change
> > should be versioned or not.

If you can come up with a test case for this issue, patches are welcome.

We probably should test PCI device enumeration on Windows,
as it's what's broken. Testing that in qtest (make check) is out of question,
but there are acceptance tests which run various guest OSes and provide tools
to interact with guest. Perhaps adding test there could work.
Though I don't know if there are Windows based guest images available for it.

I guess it's something to discuss with guys who work on CI/testing
infrastructure (CCed).
 
> OK, fair enough. Many thanks for providing some rationale!
> 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2021-03-01 19:08           ` Igor Mammedov
  2021-03-01 20:06             ` vit9696
@ 2021-03-02  8:40             ` Laszlo Ersek
  1 sibling, 0 replies; 33+ messages in thread
From: Laszlo Ersek @ 2021-03-02  8:40 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: vit9696, Michael S. Tsirkin, Stefan Reiter, qemu-devel,
	Paolo Bonzini, Richard Henderson, Thomas Lamprecht,
	Eduardo Habkost

On 03/01/21 20:08, Igor Mammedov wrote:
> On Mon, 1 Mar 2021 17:28:05 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 03/01/21 17:14, Michael S. Tsirkin wrote:
>>> On Mon, Mar 01, 2021 at 02:28:19PM +0100, Igor Mammedov wrote:  
>>
>>>> How about:
>>>>  * buggy ACPI for 5.1 machine types and older
>>>>  * fixed ACPI for 5.2 and newer?  
>>>
>>> Exactly.  
>>
>> Sounds OK to me as well (even though it's quite unfortunate that this is
>> one of those exceptions that require us to version the ACPI generator).
> it is unfortunate, and I do resist to such changes usually.
> in this case, I fill avoiding complaints/bug reports justifies such exception.

Right, thanks.
Laszlo



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
  2023-04-19  2:48 zhangying (AZ) via
@ 2023-04-20 14:54 ` Igor Mammedov
  0 siblings, 0 replies; 33+ messages in thread
From: Igor Mammedov @ 2023-04-20 14:54 UTC (permalink / raw)
  To: zhangying
  Cc: qemu-devel, Thomas Lamprecht, Michael S. Tsirkin, Renxuming,
	Wangyuan, suxiaodong

On Wed, 19 Apr 2023 02:48:55 +0000
"zhangying (AZ)" <zhangying134@huawei.com> wrote:

> > On Tue, 18 Apr 2023 09:06:30 +0000
> > "zhangying (AZ)" via <qemu-devel@nongnu.org> wrote:
> >   
> > > > On 30.07.20 17:58, Michael S. Tsirkin wrote:  
> > > > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot
> > > > > options, while OVMF firmware gets them via an internal channel through  
> > QEMU.  
> > > > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > > > different values, and this makes the underlying operating system
> > > > > unable to report its boot option.
> > > > >
> > > > > The particular node in question is the primary PciRoot (PCI0 in
> > > > > ACPI), which for some reason gets assigned 1 in ACPI UID and 0 in
> > > > > the DevicePath. This is due to the _UID assigned to it by
> > > > > build_dsdt in hw/i386/acpi-build.c Which does not correspond to
> > > > > the primary PCI identifier given by pcibus_num in hw/pci/pci.c
> > > > >
> > > > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > > > dumps (SysReport):
> > > > > https://github.com/acidanthera/bugtracker/issues/1050
> > > > >
> > > > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends
> > > > > with the paragraph,
> > > > >
> > > > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > > > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > > > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > > > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > > > >     name space.
> > > > >
> > > > > (See especially the last sentence.)
> > > > >
> > > > > Considering *extra* root bridges / root buses (with bus number >
> > > > > 0), QEMU's ACPI generator actually does the right thing; since
> > > > > QEMU commit
> > > > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for
> > > > > PXB root buses", 2015-06-11).
> > > > >
> > > > > However, the _UID values for root bridge zero (on both i440fx and
> > > > > q35) have always been "wrong" (from UEFI perspective), going back
> > > > > in QEMU to commit 74523b850189 ("i386: add ACPI table files from
> > > > > seabios", 2013-10-14).
> > > > >
> > > > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08)
> > > > > for i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt",
> > > > > 2012-12-01) for q35.
> > > > >
> > > > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > ---
> > > > >  hw/i386/acpi-build.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
> > > > > b7bcbbbb2a..7a5a8b3521 100644
> > > > > --- a/hw/i386/acpi-build.c
> > > > > +++ b/hw/i386/acpi-build.c
> > > > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker  
> > *linker,  
> > > > >          dev = aml_device("PCI0");
> > > > >          aml_append(dev, aml_name_decl("_HID",  
> > > > aml_eisaid("PNP0A03")));  
> > > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > > >          aml_append(sb_scope, dev);
> > > > >          aml_append(dsdt, sb_scope);
> > > > >
> > > > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker  
> > *linker,  
> > > > >          aml_append(dev, aml_name_decl("_HID",  
> > > > aml_eisaid("PNP0A08")));  
> > > > >          aml_append(dev, aml_name_decl("_CID",  
> > > > aml_eisaid("PNP0A03")));  
> > > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > > >          aml_append(dev, build_q35_osc_method());
> > > > >          aml_append(sb_scope, dev);
> > > > >          aml_append(dsdt, sb_scope);
> > > > >  
> > > >
> > > > This "breaks" Windows guests created/installed before this change in
> > > > the sense of Windows gets confused and declares that most of the
> > > > devices changed and thus it has new entries for them in the device
> > > > manager where settings of the old one do not apply anymore.
> > > >
> > > > We were made aware of this by our users when making QEMU 5.2.0
> > > > available on a more used repository of us. Users complained that
> > > > their static network configuration got thrown out in Windows 2016 or
> > > > 2019 server VMs, and Windows tried to use DHCP (which was not
> > > > available in their environments) and thus their Windows VMs had no network  
> > connectivity at all anymore.  
> > > >
> > > > It's currently not yet quite 100% clear to me with what QEMU version
> > > > the Windows VM must be installed with, from reading the patch I have
> > > > to believe it must be before that, but we got mixed reports and a
> > > > colleague could not replicate it from upgrade of 4.0 to 5.2 (I did
> > > > /not/ confirm that one). Anyway, just writing this all to avoid people seeing  
> > different results and brushing this off.  
> > > >
> > > > So here's my personal reproducer, as said, I think that one should
> > > > be able to just use QEMU 5.1 to install a Windows guest and start it
> > > > with 5.2 afterwards to see this issue, but YMMV.
> > > >
> > > > Note. I always used the exact same QEMU command (see below) for
> > > > installation, reproducing and bisect.
> > > >
> > > > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> > > >    - VirtIO net/scsi driver from VirtIO win 190 2. Setup static
> > > > network in the VM and shutdown 3. Started VM with 5.2.0 -> Network gone,  
> > new "Ethernet #2"  
> > > > adapter shows up instead
> > > >
> > > > Starting the  "Device Manager" and enabling "View -> Show hidden devices"
> > > > showed me a greyed out device duplicate for basically anything
> > > > attached, SCSI disk, Basic Display Adapter, CDROM device, ..., and the  
> > Network device.  
> > > >
> > > > The first difference I could find was the "Device instance path" one
> > > > can find in the "Details" tab of the devices' "Properties" window.
> > > >
> > > > # old, from initial installation on QEMU 3.0.1
> > > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > > >
> > > > # new, from boot with QEMU 5.2
> > > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > > >
> > > > They match until almost the end, not sure how important that is, but
> > > > it caught my eye (I'm really no windows guy since a decade so please
> > > > excuse my terrible debugging/exploring skills there. The rest of
> > > > those properties looked pretty much identical.
> > > >
> > > > I then started a bisect, always just restarting the guest with the
> > > > new QEMU build and checking "Device Manager" and network settings to
> > > > see if good/bad. That worked pretty well and I came to this commit.
> > > > See the bisect log attached at the end of this mail.
> > > >
> > > > So, from reading the commit message I figure that this change is
> > > > wanted, what are the implications of just reverting it? (which works
> > > > out in bringing back the old state in Windows + working static network config  
> > again).  
> > > >
> > > > Or any other way/idea to address this in a sane way so that those
> > > > picky Windows guests can be handled more graciously?
> > > >
> > > > I guess also that there could be more subtle effects from this patch
> > > > here, the network one may have just had quite visible effects to pop up as  
> > first issue...  
> > > >
> > > > Thanks if you read so far!
> > > >
> > > > cheers,
> > > > Thomas
> > > >  
> > >
> > > We have a similar problem and want to solve it further.
> > >
> > > Description of problem:
> > >
> > > When QEMU is upgraded from 4.1 to 6.2, if the machine type is not fixed as 4.1  
> > and NIC was configured with static IP address, Windows will make original
> > 'network connection' inactive and create a new one (which is not configured as
> > desired). As result guest looses network connectivity.  
> > >
> > > Test 1:
> > > Steps to test the guest loses the network connection:
> > > 1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC 2.
> > > configure NIC with static IP and shutdown guest 3. start guest on
> > > qemu-6.2 with machine version as qemu 6.2
> > >
> > > Test 2:
> > > Steps to test the guest does not lose the network connection:
> > > 1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC 2.
> > > configure NIC with static IP and shutdown guest 3. start guest on
> > > qemu-6.2 with machine version as qemu 4.1
> > >
> > > Here's a historical analysis of the problem:
> > >
> > > Commit af1b80ae56c9 ("i386/acpi: fix inconsistent QEMU/OVMF device paths")  
> > fixed UID of PCI root bridge in ACPI tables for all pc/q35 machine
> > versions.however it was discovered that this change updates Windows
> > configuration in an incompatible way causing network configuration failure unless
> > DHCP is used.
> > 
> > that's expected,
> > by upgrading machine type you basically change hardware.
> >   
> > >
> > > And Commit 0a343a5add75 ("i386/acpi: restore device paths for pre-5.1 vms")  
> > reverts the _UID update from 1 to 0 for q35 and i440fx VMs before version 5.2 to
> > maintain the original behaviour when upgrading. This requires that the same
> > machine type be used after the QEMU upgrade to 5.2 or later.  
> > >
> > > We want all VMs to be able to use the features of the new qemu 6.2 version  
> > After upgrade, we'd like to ask some questions:  
> > >
> > > 1. When the QEMU is upgraded from 4.1 to 6.2, is there any method to ensure  
> > that the guest does not lose the network connection for Test 1? could anyone
> > give some suggestions?
> > 
> > Upgrading qemu itself is fine as long as you keep using 4.1 machine type.
> > 
> > If you wish to upgrade machine type,
> > you will have to reconfigure manually configured 'Network connection's (nothing
> > else would help you)
> > 
> > From admin point of view it would be better to use DHPC (and if you need fixed
> > IPs, pin them on DHCP server side to VMs MAC addresses)
> >   
> > > 2. If no other method is available, reverts the _UID update from 1 to 0 for all  
> > QEMU versions(the prerequisite is that macOS is not used). Is there any risk? Is it
> > recommended to do this?  
> > > 3. When the QEMU is upgraded from 4.1 to 6.2, set _UID to 1 for versions  
> > earlier than QEMU 5.1 and to 0 for versions later than QEMU 5.2 Through
> > Parameters, but start guest on qemu-6.2 with machine version as qemu 6.2 to
> > solve Test 1's problem. Is this solution feasible? Is there any risk?
> > 
> > that's what we already do by flipping UID value based on machine type version.
> > Risk is that by staying on old machine type you might not get some fixes or
> > changed defaults/behavior. You can analyze changes by looking at version-ed
> > FOO_machine_options() and hw_compat_x.y pc_compat_x.y property list and
> > decide if old machine type is good enough for you.  
> 
> 
> Thanks for the answer, but there are still some questions:
> 1. Because our services do not involve macOS, we want to roll back the _UID of all qemu versions from 1 to 0. However, we do not know whether there are potential risks. Do you suggest this?

you'll have to maintain downstream patch on your own

> 2. Flipping UID value based on machine type version, This will cause multiple VMs of different machine types to run on the same host. We want to run VMs of only one machine type on one host. If the machine type remains unchanged, and a new parameter is added to notify the VM, If the VM is upgraded from 4.1, the UID is set to 1. If the VM is newly created, the UID is set to 0. However, this will cause different operations for the same machine type version. Is this appropriate? Will there be some potential problems?

frankly speaking you lost me between 'if-s'

even after upgrading qemu to the latest you can still use old machine type (pc-q35-4.1) for exiting and new VMs to have the same behavior as qemu-4.1

> 3. QEMU is upgraded from 4.1 to 6.2. Is there any other problem about the _UID except that the IP address of Windows is lost?

as for risks, no one would be able to tell you.
Common sense tells me "Check Changelogs to decide if you need a new version and do some testing before rolling out changes."

> 
> 
> > > Thanks.
> > >  
> > > > = QEMU Command =
> > > >
> > > > (This was generated by our (Proxmox VE) stack, I only cleaned it up
> > > > a bit to allow easier manual running it)
> > > >
> > > > ./qemu-system-x86_64 \
> > > >   -name win2016 \
> > > >   -chardev
> > > > 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> > > >   -mon 'chardev=qmp,mode=control' \
> > > >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> > > >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> > > >   -nodefaults \
> > > >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> > > >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> > > >   -no-hpet \
> > > >   -cpu
> > > > 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_s
> > > > timer,hv
> > > > _synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-cl
> > > > ear,
> > > > +pcid,+spec-ctrl' \
> > > >   -m 2048 \
> > > >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> > > >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> > > >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> > > >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> > > >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> > > >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> > > >   -chardev
> > > > 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> > > >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> > > >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> > > >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> > > >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> > > >   -drive
> > > > 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=dr
> > > > ive-ide0,me
> > > > dia=cdrom,aio=threads' \
> > > >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200'  
> > \  
> > > >   -drive
> > > > 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,
> > > > id=drive-
> > > > ide2,media=cdrom,aio=threads' \
> > > >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201'  
> > \  
> > > >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> > > >   -drive
> > > > 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,
> > > > cache=
> > > > none,aio=native,detect-zeroes=on' \
> > > >   -device
> > > > 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,i
> > > > d=scsi0,rotati
> > > > on_rate=1,bootindex=100' \
> > > >   -netdev
> > > > 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-
> > > > bridge,d ownscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> > > >   -device
> > > > 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x1
> > > > 2,id=net0,
> > > > bootindex=300' \
> > > >   -rtc 'driftfix=slew,base=localtime' \
> > > >   -machine 'type=pc' \
> > > >   -global 'kvm-pit.lost_tick_policy=discard'
> > > >
> > > >
> > > > = bisect log =
> > > >
> > > > git bisect start
> > > > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for
> > > > v5.2.0 release git bisect bad
> > > > 553032db17440f8de011390e5a1cfddd13751b0b
> > > > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version
> > > > for
> > > > v5.1.0 release git bisect good
> > > > d0ed6a69d399ae193959225cdeaa9382746c91cc
> > > > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add
> > > > kernel-doc markup to introduction doc comment git bisect bad
> > > > ed799805d00ccdda45eb8441c7d929624d9e98a6
> > > > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge
> > > > remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into
> > > > staging git bisect bad
> > > > e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > > > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update
> > > > expected DSDT files with _UID changes git bisect bad
> > > > af1dfe1ec0864e6700237a43cc36018176f9eba9
> > > > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge
> > > > remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821'
> > > > into staging git bisect good
> > > > d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > > > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge
> > > > remote-tracking branch
> > > > 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into
> > > > staging git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > > > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv:
> > > > Support the Virtual Instruction fault git bisect good
> > > > e39a8320b088dd5efc9ebaafe387e52b3d962665
> > > > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add
> > > > subcluster support to qcow2_co_pwrite_zeroes() git bisect good
> > > > a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > > > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550]
> > > > hw/display/artist: Fix invalidation of lines near screen border git
> > > > bisect good
> > > > 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > > > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add
> > > > tests for
> > > > qcow2 images with extended L2 entries git bisect good
> > > > a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > > > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge
> > > > remote-tracking branch
> > > > 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging git
> > > > bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > > > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix
> > > > inconsistent QEMU/OVMF device paths git bisect bad
> > > > af1b80ae56c9495999e8ccf7b70ef894378de642
> > > > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT
> > > > changes git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > > > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642]
> > > > i386/acpi: fix inconsistent QEMU/OVMF device paths  
> > >  
> >   
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
@ 2023-04-19  2:48 zhangying (AZ) via
  2023-04-20 14:54 ` Igor Mammedov
  0 siblings, 1 reply; 33+ messages in thread
From: zhangying (AZ) via @ 2023-04-19  2:48 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: Thomas Lamprecht, Michael S. Tsirkin, Renxuming, Wangyuan (Ethan),
	suxiaodong


> On Tue, 18 Apr 2023 09:06:30 +0000
> "zhangying (AZ)" via <qemu-devel@nongnu.org> wrote:
> 
> > > On 30.07.20 17:58, Michael S. Tsirkin wrote:
> > > > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot
> > > > options, while OVMF firmware gets them via an internal channel through
> QEMU.
> > > > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > > > different values, and this makes the underlying operating system
> > > > unable to report its boot option.
> > > >
> > > > The particular node in question is the primary PciRoot (PCI0 in
> > > > ACPI), which for some reason gets assigned 1 in ACPI UID and 0 in
> > > > the DevicePath. This is due to the _UID assigned to it by
> > > > build_dsdt in hw/i386/acpi-build.c Which does not correspond to
> > > > the primary PCI identifier given by pcibus_num in hw/pci/pci.c
> > > >
> > > > Reference with the device paths, OVMF startup logs, and ACPI table
> > > > dumps (SysReport):
> > > > https://github.com/acidanthera/bugtracker/issues/1050
> > > >
> > > > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends
> > > > with the paragraph,
> > > >
> > > >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> > > >     be stored in the ACPI Device Path _HID field, or in the Expanded
> > > >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> > > >     in the ACPI Device Path structure must match the _UID in the ACPI
> > > >     name space.
> > > >
> > > > (See especially the last sentence.)
> > > >
> > > > Considering *extra* root bridges / root buses (with bus number >
> > > > 0), QEMU's ACPI generator actually does the right thing; since
> > > > QEMU commit
> > > > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for
> > > > PXB root buses", 2015-06-11).
> > > >
> > > > However, the _UID values for root bridge zero (on both i440fx and
> > > > q35) have always been "wrong" (from UEFI perspective), going back
> > > > in QEMU to commit 74523b850189 ("i386: add ACPI table files from
> > > > seabios", 2013-10-14).
> > > >
> > > > Even in SeaBIOS, these _UID values have always been 1; see commit
> > > > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08)
> > > > for i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt",
> > > > 2012-12-01) for q35.
> > > >
> > > > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > > > Tested-by: vit9696 <vit9696@protonmail.com>
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > ---
> > > >  hw/i386/acpi-build.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
> > > > b7bcbbbb2a..7a5a8b3521 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker
> *linker,
> > > >          dev = aml_device("PCI0");
> > > >          aml_append(dev, aml_name_decl("_HID",
> > > aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >
> > > > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker
> *linker,
> > > >          aml_append(dev, aml_name_decl("_HID",
> > > aml_eisaid("PNP0A08")));
> > > >          aml_append(dev, aml_name_decl("_CID",
> > > aml_eisaid("PNP0A03")));
> > > >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > > > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > > > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> > > >          aml_append(dev, build_q35_osc_method());
> > > >          aml_append(sb_scope, dev);
> > > >          aml_append(dsdt, sb_scope);
> > > >
> > >
> > > This "breaks" Windows guests created/installed before this change in
> > > the sense of Windows gets confused and declares that most of the
> > > devices changed and thus it has new entries for them in the device
> > > manager where settings of the old one do not apply anymore.
> > >
> > > We were made aware of this by our users when making QEMU 5.2.0
> > > available on a more used repository of us. Users complained that
> > > their static network configuration got thrown out in Windows 2016 or
> > > 2019 server VMs, and Windows tried to use DHCP (which was not
> > > available in their environments) and thus their Windows VMs had no network
> connectivity at all anymore.
> > >
> > > It's currently not yet quite 100% clear to me with what QEMU version
> > > the Windows VM must be installed with, from reading the patch I have
> > > to believe it must be before that, but we got mixed reports and a
> > > colleague could not replicate it from upgrade of 4.0 to 5.2 (I did
> > > /not/ confirm that one). Anyway, just writing this all to avoid people seeing
> different results and brushing this off.
> > >
> > > So here's my personal reproducer, as said, I think that one should
> > > be able to just use QEMU 5.1 to install a Windows guest and start it
> > > with 5.2 afterwards to see this issue, but YMMV.
> > >
> > > Note. I always used the exact same QEMU command (see below) for
> > > installation, reproducing and bisect.
> > >
> > > 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
> > >    - VirtIO net/scsi driver from VirtIO win 190 2. Setup static
> > > network in the VM and shutdown 3. Started VM with 5.2.0 -> Network gone,
> new "Ethernet #2"
> > > adapter shows up instead
> > >
> > > Starting the  "Device Manager" and enabling "View -> Show hidden devices"
> > > showed me a greyed out device duplicate for basically anything
> > > attached, SCSI disk, Basic Display Adapter, CDROM device, ..., and the
> Network device.
> > >
> > > The first difference I could find was the "Device instance path" one
> > > can find in the "Details" tab of the devices' "Properties" window.
> > >
> > > # old, from initial installation on QEMU 3.0.1
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> > >
> > > # new, from boot with QEMU 5.2
> > > PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> > >
> > > They match until almost the end, not sure how important that is, but
> > > it caught my eye (I'm really no windows guy since a decade so please
> > > excuse my terrible debugging/exploring skills there. The rest of
> > > those properties looked pretty much identical.
> > >
> > > I then started a bisect, always just restarting the guest with the
> > > new QEMU build and checking "Device Manager" and network settings to
> > > see if good/bad. That worked pretty well and I came to this commit.
> > > See the bisect log attached at the end of this mail.
> > >
> > > So, from reading the commit message I figure that this change is
> > > wanted, what are the implications of just reverting it? (which works
> > > out in bringing back the old state in Windows + working static network config
> again).
> > >
> > > Or any other way/idea to address this in a sane way so that those
> > > picky Windows guests can be handled more graciously?
> > >
> > > I guess also that there could be more subtle effects from this patch
> > > here, the network one may have just had quite visible effects to pop up as
> first issue...
> > >
> > > Thanks if you read so far!
> > >
> > > cheers,
> > > Thomas
> > >
> >
> > We have a similar problem and want to solve it further.
> >
> > Description of problem:
> >
> > When QEMU is upgraded from 4.1 to 6.2, if the machine type is not fixed as 4.1
> and NIC was configured with static IP address, Windows will make original
> 'network connection' inactive and create a new one (which is not configured as
> desired). As result guest looses network connectivity.
> >
> > Test 1:
> > Steps to test the guest loses the network connection:
> > 1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC 2.
> > configure NIC with static IP and shutdown guest 3. start guest on
> > qemu-6.2 with machine version as qemu 6.2
> >
> > Test 2:
> > Steps to test the guest does not lose the network connection:
> > 1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC 2.
> > configure NIC with static IP and shutdown guest 3. start guest on
> > qemu-6.2 with machine version as qemu 4.1
> >
> > Here's a historical analysis of the problem:
> >
> > Commit af1b80ae56c9 ("i386/acpi: fix inconsistent QEMU/OVMF device paths")
> fixed UID of PCI root bridge in ACPI tables for all pc/q35 machine
> versions.however it was discovered that this change updates Windows
> configuration in an incompatible way causing network configuration failure unless
> DHCP is used.
> 
> that's expected,
> by upgrading machine type you basically change hardware.
> 
> >
> > And Commit 0a343a5add75 ("i386/acpi: restore device paths for pre-5.1 vms")
> reverts the _UID update from 1 to 0 for q35 and i440fx VMs before version 5.2 to
> maintain the original behaviour when upgrading. This requires that the same
> machine type be used after the QEMU upgrade to 5.2 or later.
> >
> > We want all VMs to be able to use the features of the new qemu 6.2 version
> After upgrade, we'd like to ask some questions:
> >
> > 1. When the QEMU is upgraded from 4.1 to 6.2, is there any method to ensure
> that the guest does not lose the network connection for Test 1? could anyone
> give some suggestions?
> 
> Upgrading qemu itself is fine as long as you keep using 4.1 machine type.
> 
> If you wish to upgrade machine type,
> you will have to reconfigure manually configured 'Network connection's (nothing
> else would help you)
> 
> From admin point of view it would be better to use DHPC (and if you need fixed
> IPs, pin them on DHCP server side to VMs MAC addresses)
> 
> > 2. If no other method is available, reverts the _UID update from 1 to 0 for all
> QEMU versions(the prerequisite is that macOS is not used). Is there any risk? Is it
> recommended to do this?
> > 3. When the QEMU is upgraded from 4.1 to 6.2, set _UID to 1 for versions
> earlier than QEMU 5.1 and to 0 for versions later than QEMU 5.2 Through
> Parameters, but start guest on qemu-6.2 with machine version as qemu 6.2 to
> solve Test 1's problem. Is this solution feasible? Is there any risk?
> 
> that's what we already do by flipping UID value based on machine type version.
> Risk is that by staying on old machine type you might not get some fixes or
> changed defaults/behavior. You can analyze changes by looking at version-ed
> FOO_machine_options() and hw_compat_x.y pc_compat_x.y property list and
> decide if old machine type is good enough for you.


Thanks for the answer, but there are still some questions:
1. Because our services do not involve macOS, we want to roll back the _UID of all qemu versions from 1 to 0. However, we do not know whether there are potential risks. Do you suggest this?
2. Flipping UID value based on machine type version, This will cause multiple VMs of different machine types to run on the same host. We want to run VMs of only one machine type on one host. If the machine type remains unchanged, and a new parameter is added to notify the VM, If the VM is upgraded from 4.1, the UID is set to 1. If the VM is newly created, the UID is set to 0. However, this will cause different operations for the same machine type version. Is this appropriate? Will there be some potential problems?
3. QEMU is upgraded from 4.1 to 6.2. Is there any other problem about the _UID except that the IP address of Windows is lost?


> > Thanks.
> >
> > > = QEMU Command =
> > >
> > > (This was generated by our (Proxmox VE) stack, I only cleaned it up
> > > a bit to allow easier manual running it)
> > >
> > > ./qemu-system-x86_64 \
> > >   -name win2016 \
> > >   -chardev
> > > 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
> > >   -mon 'chardev=qmp,mode=control' \
> > >   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
> > >   -smp '2,sockets=1,cores=2,maxcpus=2' \
> > >   -nodefaults \
> > >   -boot 'menu=on,strict=on,reboot-timeout=1000' \
> > >   -vnc unix:/var/run/qemu-server/11765.vnc,password \
> > >   -no-hpet \
> > >   -cpu
> > > 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_s
> > > timer,hv
> > > _synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-cl
> > > ear,
> > > +pcid,+spec-ctrl' \
> > >   -m 2048 \
> > >   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
> > >   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
> > >   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
> > >   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
> > >   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
> > >   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
> > >   -chardev
> > > 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
> > >   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
> > >   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
> > >   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
> > >   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
> > >   -drive
> > > 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=dr
> > > ive-ide0,me
> > > dia=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200'
> \
> > >   -drive
> > > 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,
> > > id=drive-
> > > ide2,media=cdrom,aio=threads' \
> > >   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201'
> \
> > >   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> > >   -drive
> > > 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,
> > > cache=
> > > none,aio=native,detect-zeroes=on' \
> > >   -device
> > > 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,i
> > > d=scsi0,rotati
> > > on_rate=1,bootindex=100' \
> > >   -netdev
> > > 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-
> > > bridge,d ownscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
> > >   -device
> > > 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x1
> > > 2,id=net0,
> > > bootindex=300' \
> > >   -rtc 'driftfix=slew,base=localtime' \
> > >   -machine 'type=pc' \
> > >   -global 'kvm-pit.lost_tick_policy=discard'
> > >
> > >
> > > = bisect log =
> > >
> > > git bisect start
> > > # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for
> > > v5.2.0 release git bisect bad
> > > 553032db17440f8de011390e5a1cfddd13751b0b
> > > # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version
> > > for
> > > v5.1.0 release git bisect good
> > > d0ed6a69d399ae193959225cdeaa9382746c91cc
> > > # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add
> > > kernel-doc markup to introduction doc comment git bisect bad
> > > ed799805d00ccdda45eb8441c7d929624d9e98a6
> > > # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge
> > > remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into
> > > staging git bisect bad
> > > e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> > > # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update
> > > expected DSDT files with _UID changes git bisect bad
> > > af1dfe1ec0864e6700237a43cc36018176f9eba9
> > > # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge
> > > remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20200821'
> > > into staging git bisect good
> > > d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> > > # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge
> > > remote-tracking branch
> > > 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into
> > > staging git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> > > # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv:
> > > Support the Virtual Instruction fault git bisect good
> > > e39a8320b088dd5efc9ebaafe387e52b3d962665
> > > # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add
> > > subcluster support to qcow2_co_pwrite_zeroes() git bisect good
> > > a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> > > # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550]
> > > hw/display/artist: Fix invalidation of lines near screen border git
> > > bisect good
> > > 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> > > # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add
> > > tests for
> > > qcow2 images with extended L2 entries git bisect good
> > > a5d3cfa2dc775e5d99f013703b8508f1d989d588
> > > # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge
> > > remote-tracking branch
> > > 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging git
> > > bisect good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> > > # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix
> > > inconsistent QEMU/OVMF device paths git bisect bad
> > > af1b80ae56c9495999e8ccf7b70ef894378de642
> > > # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT
> > > changes git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> > > # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642]
> > > i386/acpi: fix inconsistent QEMU/OVMF device paths
> >
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths
@ 2023-04-18  9:06 zhangying (AZ) via
  0 siblings, 0 replies; 33+ messages in thread
From: zhangying (AZ) via @ 2023-04-18  9:06 UTC (permalink / raw)
  To: Thomas Lamprecht, Michael S. Tsirkin, qemu-devel
  Cc: Renxuming, Wangyuan (Ethan), suxiaodong

> On 30.07.20 17:58, Michael S. Tsirkin wrote:
> > macOS uses ACPI UIDs to build the DevicePath for NVRAM boot options,
> > while OVMF firmware gets them via an internal channel through QEMU.
> > Due to a bug in QEMU ACPI currently UEFI firmware and ACPI have
> > different values, and this makes the underlying operating system
> > unable to report its boot option.
> >
> > The particular node in question is the primary PciRoot (PCI0 in ACPI),
> > which for some reason gets assigned 1 in ACPI UID and 0 in the
> > DevicePath. This is due to the _UID assigned to it by build_dsdt in
> > hw/i386/acpi-build.c Which does not correspond to the primary PCI
> > identifier given by pcibus_num in hw/pci/pci.c
> >
> > Reference with the device paths, OVMF startup logs, and ACPI table
> > dumps (SysReport):
> > https://github.com/acidanthera/bugtracker/issues/1050
> >
> > In UEFI v2.8, section "10.4.2 Rules with ACPI _HID and _UID" ends with
> > the paragraph,
> >
> >     Root PCI bridges will use the plug and play ID of PNP0A03, This will
> >     be stored in the ACPI Device Path _HID field, or in the Expanded
> >     ACPI Device Path _CID field to match the ACPI name space. The _UID
> >     in the ACPI Device Path structure must match the _UID in the ACPI
> >     name space.
> >
> > (See especially the last sentence.)
> >
> > Considering *extra* root bridges / root buses (with bus number > 0),
> > QEMU's ACPI generator actually does the right thing; since QEMU commit
> > c96d9286a6d7 ("i386/acpi-build: more traditional _UID and _HID for PXB
> > root buses", 2015-06-11).
> >
> > However, the _UID values for root bridge zero (on both i440fx and q35)
> > have always been "wrong" (from UEFI perspective), going back in QEMU
> > to commit 74523b850189 ("i386: add ACPI table files from seabios",
> > 2013-10-14).
> >
> > Even in SeaBIOS, these _UID values have always been 1; see commit
> > a4d357638c57 ("Port rombios32 code from bochs-bios.", 2008-03-08) for
> > i440fx, and commit ecbe3fd61511 ("seabios: q35: add dsdt", 2012-12-01)
> > for q35.
> >
> > Suggested-by: Laszlo Ersek <lersek@redhat.com>
> > Tested-by: vit9696 <vit9696@protonmail.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/acpi-build.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index
> > b7bcbbbb2a..7a5a8b3521 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1497,7 +1497,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          dev = aml_device("PCI0");
> >          aml_append(dev, aml_name_decl("_HID",
> aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >
> > @@ -1512,7 +1512,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          aml_append(dev, aml_name_decl("_HID",
> aml_eisaid("PNP0A08")));
> >          aml_append(dev, aml_name_decl("_CID",
> aml_eisaid("PNP0A03")));
> >          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> > -        aml_append(dev, aml_name_decl("_UID", aml_int(1)));
> > +        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> >          aml_append(dev, build_q35_osc_method());
> >          aml_append(sb_scope, dev);
> >          aml_append(dsdt, sb_scope);
> >
> 
> This "breaks" Windows guests created/installed before this change in the sense
> of Windows gets confused and declares that most of the devices changed and
> thus it has new entries for them in the device manager where settings of the old
> one do not apply anymore.
> 
> We were made aware of this by our users when making QEMU 5.2.0 available on
> a more used repository of us. Users complained that their static network
> configuration got thrown out in Windows 2016 or 2019 server VMs, and Windows
> tried to use DHCP (which was not available in their environments) and thus their
> Windows VMs had no network connectivity at all anymore.
> 
> It's currently not yet quite 100% clear to me with what QEMU version the
> Windows VM must be installed with, from reading the patch I have to believe it
> must be before that, but we got mixed reports and a colleague could not
> replicate it from upgrade of 4.0 to 5.2 (I did /not/ confirm that one). Anyway, just
> writing this all to avoid people seeing different results and brushing this off.
> 
> So here's my personal reproducer, as said, I think that one should be able to just
> use QEMU 5.1 to install a Windows guest and start it with 5.2 afterwards to see
> this issue, but YMMV.
> 
> Note. I always used the exact same QEMU command (see below) for installation,
> reproducing and bisect.
> 
> 1. Installed Windows 2016 1616 VM using QEMU 3.0.1
>    - VirtIO net/scsi driver from VirtIO win 190 2. Setup static network in the VM
> and shutdown 3. Started VM with 5.2.0 -> Network gone, new "Ethernet #2"
> adapter shows up instead
> 
> Starting the  "Device Manager" and enabling "View -> Show hidden devices"
> showed me a greyed out device duplicate for basically anything attached, SCSI
> disk, Basic Display Adapter, CDROM device, ..., and the Network device.
> 
> The first difference I could find was the "Device instance path" one can find in the
> "Details" tab of the devices' "Properties" window.
> 
> # old, from initial installation on QEMU 3.0.1
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&13C0B0C5&0&90
> 
> # new, from boot with QEMU 5.2
> PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90
> 
> They match until almost the end, not sure how important that is, but it caught my
> eye (I'm really no windows guy since a decade so please excuse my terrible
> debugging/exploring skills there. The rest of those properties looked pretty
> much identical.
> 
> I then started a bisect, always just restarting the guest with the new QEMU build
> and checking "Device Manager" and network settings to see if good/bad. That
> worked pretty well and I came to this commit. See the bisect log attached at the
> end of this mail.
> 
> So, from reading the commit message I figure that this change is wanted, what
> are the implications of just reverting it? (which works out in bringing back the old
> state in Windows + working static network config again).
> 
> Or any other way/idea to address this in a sane way so that those picky Windows
> guests can be handled more graciously?
> 
> I guess also that there could be more subtle effects from this patch here, the
> network one may have just had quite visible effects to pop up as first issue...
> 
> Thanks if you read so far!
> 
> cheers,
> Thomas
> 

We have a similar problem and want to solve it further.

Description of problem:

When QEMU is upgraded from 4.1 to 6.2, if the machine type is not fixed as 4.1 and NIC was configured with static IP address, Windows will make original 'network connection' inactive and create a new one (which is not configured as desired). As result guest looses network connectivity.

Test 1: 
Steps to test the guest loses the network connection:
1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC
2. configure NIC with static IP and shutdown guest
3. start guest on qemu-6.2 with machine version as qemu 6.2

Test 2: 
Steps to test the guest does not lose the network connection:
1. on QEMU 4.1 install Windows Server 2019 guest with virtio NIC
2. configure NIC with static IP and shutdown guest
3. start guest on qemu-6.2 with machine version as qemu 4.1

Here's a historical analysis of the problem:

Commit af1b80ae56c9 ("i386/acpi: fix inconsistent QEMU/OVMF device paths") fixed UID of PCI root bridge in ACPI tables for all pc/q35 machine versions.however it was discovered that this change updates Windows configuration in an incompatible way causing network configuration failure unless DHCP is used.

And Commit 0a343a5add75 ("i386/acpi: restore device paths for pre-5.1 vms") reverts the _UID update from 1 to 0 for q35 and i440fx VMs before version 5.2 to maintain the original behaviour when upgrading. This requires that the same machine type be used after the QEMU upgrade to 5.2 or later.

We want all VMs to be able to use the features of the new qemu 6.2 version After upgrade, we'd like to ask some questions:

1. When the QEMU is upgraded from 4.1 to 6.2, is there any method to ensure that the guest does not lose the network connection for Test 1? could anyone give some suggestions?
2. If no other method is available, reverts the _UID update from 1 to 0 for all QEMU versions(the prerequisite is that macOS is not used). Is there any risk? Is it recommended to do this?
3. When the QEMU is upgraded from 4.1 to 6.2, set _UID to 1 for versions earlier than QEMU 5.1 and to 0 for versions later than QEMU 5.2 Through Parameters, but start guest on qemu-6.2 with machine version as qemu 6.2 to solve Test 1's problem. Is this solution feasible? Is there any risk?

Thanks.

> = QEMU Command =
> 
> (This was generated by our (Proxmox VE) stack, I only cleaned it up a bit to allow
> easier manual running it)
> 
> ./qemu-system-x86_64 \
>   -name win2016 \
>   -chardev
> 'socket,id=qmp,path=/var/run/qemu-server/11765.qmp,server,nowait' \
>   -mon 'chardev=qmp,mode=control' \
>   -smbios 'type=1,uuid=6324fb28-e98a-44cf-85db-694d1b3405f5' \
>   -smp '2,sockets=1,cores=2,maxcpus=2' \
>   -nodefaults \
>   -boot 'menu=on,strict=on,reboot-timeout=1000' \
>   -vnc unix:/var/run/qemu-server/11765.vnc,password \
>   -no-hpet \
>   -cpu
> 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv
> _synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+md-clear,
> +pcid,+spec-ctrl' \
>   -m 2048 \
>   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>   -device 'vmgenid,guid=2e56e6ca-2cf8-4f1d-8cc3-9b19a2510c01' \
>   -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
>   -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
>   -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
>   -chardev
> 'socket,path=/var/run/qemu-server/11765.qga,server,nowait,id=qga0' \
>   -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
>   -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
>   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
>   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:468faae9322b' \
>   -drive
> 'file=/mnt/pve/iso/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide0,me
> dia=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200' \
>   -drive
> 'file=/mnt/pve/iso/template/iso/Win2016-1616-evaluation.ISO,if=none,id=drive-
> ide2,media=cdrom,aio=threads' \
>   -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201' \
>   -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
>   -drive
> 'file=/dev/WDnvme/vm-11765-disk-0,if=none,id=drive-scsi0,format=raw,cache=
> none,aio=native,detect-zeroes=on' \
>   -device
> 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotati
> on_rate=1,bootindex=100' \
>   -netdev
> 'type=tap,id=net0,ifname=tap11765i0,script=/var/lib/qemu-server/pve-bridge,d
> ownscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>   -device
> 'virtio-net-pci,mac=02:98:90:43:42:1D,netdev=net0,bus=pci.0,addr=0x12,id=net0,
> bootindex=300' \
>   -rtc 'driftfix=slew,base=localtime' \
>   -machine 'type=pc' \
>   -global 'kvm-pit.lost_tick_policy=discard'
> 
> 
> = bisect log =
> 
> git bisect start
> # bad: [553032db17440f8de011390e5a1cfddd13751b0b] Update version for v5.2.0
> release git bisect bad 553032db17440f8de011390e5a1cfddd13751b0b
> # good: [d0ed6a69d399ae193959225cdeaa9382746c91cc] Update version for
> v5.1.0 release git bisect good d0ed6a69d399ae193959225cdeaa9382746c91cc
> # bad: [ed799805d00ccdda45eb8441c7d929624d9e98a6] qom: Add kernel-doc
> markup to introduction doc comment git bisect bad
> ed799805d00ccdda45eb8441c7d929624d9e98a6
> # bad: [e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5] Merge remote-tracking
> branch 'remotes/nvme/tags/pull-nvme-20200902' into staging git bisect bad
> e4d8b7c1a95fffcfa4bdab9aa7ffd1cf590cdcf5
> # bad: [af1dfe1ec0864e6700237a43cc36018176f9eba9] acpi: update expected
> DSDT files with _UID changes git bisect bad
> af1dfe1ec0864e6700237a43cc36018176f9eba9
> # good: [d7df0ceee0fd2e512cd214a9074ebeeb40da3099] Merge remote-tracking
> branch 'remotes/philmd-gitlab/tags/sd-next-20200821' into staging git bisect
> good d7df0ceee0fd2e512cd214a9074ebeeb40da3099
> # good: [df82aa7fe10e46b675678977999d49bd586538f8] Merge remote-tracking
> branch 'remotes/edgar/tags/edgar/xilinx-next-2020-08-24.for-upstream' into
> staging git bisect good df82aa7fe10e46b675678977999d49bd586538f8
> # good: [e39a8320b088dd5efc9ebaafe387e52b3d962665] target/riscv: Support
> the Virtual Instruction fault git bisect good
> e39a8320b088dd5efc9ebaafe387e52b3d962665
> # good: [a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7] qcow2: Add subcluster
> support to qcow2_co_pwrite_zeroes() git bisect good
> a6841a2de66fa44fe52ed996b70f9fb9f7bd6ca7
> # good: [2f8cd515477edab1cbf38ecbdbfa2cac13ce1550] hw/display/artist: Fix
> invalidation of lines near screen border git bisect good
> 2f8cd515477edab1cbf38ecbdbfa2cac13ce1550
> # good: [a5d3cfa2dc775e5d99f013703b8508f1d989d588] iotests: Add tests for
> qcow2 images with extended L2 entries git bisect good
> a5d3cfa2dc775e5d99f013703b8508f1d989d588
> # good: [8e49197ca5e76fdb8928833b2649ef13fc5aab2f] Merge remote-tracking
> branch 'remotes/hdeller/tags/target-hppa-v3-pull-request' into staging git bisect
> good 8e49197ca5e76fdb8928833b2649ef13fc5aab2f
> # bad: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix inconsistent
> QEMU/OVMF device paths git bisect bad
> af1b80ae56c9495999e8ccf7b70ef894378de642
> # good: [42a62c20925e02aef0d849f92a0e9540888e79ae] acpi: allow DSDT changes
> git bisect good 42a62c20925e02aef0d849f92a0e9540888e79ae
> # first bad commit: [af1b80ae56c9495999e8ccf7b70ef894378de642] i386/acpi: fix
> inconsistent QEMU/OVMF device paths


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-04-20 14:55 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-30 15:58 [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Michael S. Tsirkin
2020-07-30 15:58 ` [PATCH 2/2] arm/acpi: fix an out of spec _UID for PCI root Michael S. Tsirkin
2020-07-30 16:12   ` Philippe Mathieu-Daudé
2020-07-30 19:35   ` Laszlo Ersek
2020-07-30 20:33   ` Peter Maydell
2020-07-31  9:31   ` Igor Mammedov
2020-07-30 16:11 ` [PATCH 1/2] i386/acpi: fix inconsistent QEMU/OVMF device paths Philippe Mathieu-Daudé
2020-07-30 19:35   ` Michael S. Tsirkin
2020-07-31  2:55     ` vit9696 via
2020-07-30 19:34 ` Laszlo Ersek
2020-07-31  9:30 ` Igor Mammedov
2021-02-27 19:41 ` Thomas Lamprecht
2021-02-28  9:11   ` vit9696
2021-02-28 10:43     ` Thomas Lamprecht
2021-02-28 20:45       ` Michael S. Tsirkin
     [not found]       ` <x3i3TiibtrC1qTDQUKxuOA_9qvmInzVwv6RrvzzSCSj-S21gLypbbZgEbYvJSGMxC1r8RaDrnHGgRbDI7vfpA_XuDINdZej9yKCW3_Sc4YM=@protonmail.com>
2021-02-28 21:37         ` Michael S. Tsirkin
2021-02-28 20:43   ` Michael S. Tsirkin
2021-03-01  7:12     ` Thomas Lamprecht
2021-03-01  7:20       ` Michael S. Tsirkin
2021-03-01  7:45         ` Thomas Lamprecht
2021-03-01 14:20           ` Igor Mammedov
2021-03-01 14:27             ` Thomas Lamprecht
2021-03-01 20:16               ` Igor Mammedov
2021-03-01 15:31           ` Michael S. Tsirkin
2021-03-01 13:28     ` Igor Mammedov
2021-03-01 16:14       ` Michael S. Tsirkin
2021-03-01 16:28         ` Laszlo Ersek
2021-03-01 19:08           ` Igor Mammedov
2021-03-01 20:06             ` vit9696
2021-03-02  8:40             ` Laszlo Ersek
2023-04-18  9:06 zhangying (AZ) via
2023-04-19  2:48 zhangying (AZ) via
2023-04-20 14:54 ` Igor Mammedov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).