All of lore.kernel.org
 help / color / mirror / Atom feed
* aarch64 efi boot failures with qemu 6.0+
@ 2021-07-24 18:52 Guenter Roeck
  2021-07-25 22:14 ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Guenter Roeck @ 2021-07-24 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Igor Mammedov, Jiahui Cen, Michael S. Tsirkin

Hi all,

starting with qemu v6.0, some of my aarch64 efi boot tests no longer
work. Analysis shows that PCI devices with IO ports do not instantiate
in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
(at least) ne2k_pci, tulip, dc390, and am53c974. The problem only affects
aarch64, not x86/x86_64.

I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
keep firmware resource map"). Since this commit, PCI device BAR
allocation has changed. Taking tulip as example, the kernel reports
the following PCI bar assignments when running qemu v5.2.

[    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
[    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
[    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
[    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
[    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x1000007f]

With qemu v6.0, the assignment is reported as follows.

[    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
[    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
[    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]

and the controller does not instantiate. The problem disapears after
reverting commit 0cf8882fd0.

Attached is a summary of test runs with various devices and qemu v5.2
as well as qemu v6.0, and the command line I use for efi boots.

Did commit 0cf8882fd0 introduce a bug, do I now need need some different
command line to instantiate PCI devices with io ports, or are such devices
simply no longer supported if the system is booted with efi support ?

Thanks,
Guenter

---
Command line (tulip network interface):

CMDLINE="root=/dev/vda console=ttyAMA0"
ROOTFS="rootfs.ext2"

qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
        -m 512 -cpu cortex-a57 -no-reboot \
        -device tulip,netdev=net0 -netdev user,id=net0 \
        -bios QEMU_EFI-aarch64.fd \
        -snapshot \
        -device virtio-blk-device,drive=d0 \
        -drive file=${ROOTFS},if=none,id=d0,format=raw \
        -nographic -serial stdio -monitor none \
        --append "${CMDLINE}"

---
Boot tests with various devices known to work in qemu v5.2.

		v5.2	v6.0	v6.0
		efi	non-efi	efi
e1000		pass	pass	pass
e1000-82544gc	pass	pass	pass
e1000-82545em	pass	pass	pass
e1000e		pass	pass	pass
i82550		pass	pass	pass
i82557a		pass	pass	pass
i82557b		pass	pass	pass
i82557c		pass	pass	pass
i82558a		pass	pass	pass
i82559b		pass	pass	pass
i82559c		pass	pass	pass
i82559er	pass	pass	pass
i82562		pass	pass	pass
i82801		pass	pass	pass
ne2k_pci	pass	pass	fail	<--
pcnet		pass	pass	pass
rtl8139		pass	pass	pass
tulip		pass	pass	fail	<--
usb-net		pass	pass	pass
virtio-net-device
		pass	pass	pass
virtio-net-pci	pass	pass	pass
virtio-net-pci-non-transitional
		pass	pass	pass

usb-xhci	pass	pass	pass
usb-ehci	pass	pass	pass
usb-ohci	pass	pass	pass
usb-uas-xhci	pass	pass	pass
virtio		pass	pass	pass
virtio-blk-pci	pass	pass	pass
virtio-blk-device
		pass	pass	pass
nvme		pass	pass	pass
sdhci		pass	pass	pass
dc390		pass	pass	fail	<--
am53c974	pass	pass	fail	<--
lsi53c895ai	pass	pass	pass
mptsas1068	pass	pass	pass
lsi53c810	pass	pass	pass
megasas		pass	pass	pass
megasas-gen2	pass	pass	pass
virtio-scsi-device
		pass	pass	pass
virtio-scsi-pci	pass	pass	pass


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-24 18:52 aarch64 efi boot failures with qemu 6.0+ Guenter Roeck
@ 2021-07-25 22:14 ` Michael S. Tsirkin
  2021-07-25 22:56   ` Guenter Roeck
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-25 22:14 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Igor Mammedov, Jiahui Cen, philmd, qemu-devel

On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> Hi all,
> 
> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> work. Analysis shows that PCI devices with IO ports do not instantiate
> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only affects
> aarch64, not x86/x86_64.
> 
> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> keep firmware resource map"). Since this commit, PCI device BAR
> allocation has changed. Taking tulip as example, the kernel reports
> the following PCI bar assignments when running qemu v5.2.
> 
> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x1000007f]
> 
> With qemu v6.0, the assignment is reported as follows.
> 
> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> 
> and the controller does not instantiate. The problem disapears after
> reverting commit 0cf8882fd0.
> 
> Attached is a summary of test runs with various devices and qemu v5.2
> as well as qemu v6.0, and the command line I use for efi boots.
> 
> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> command line to instantiate PCI devices with io ports, or are such devices
> simply no longer supported if the system is booted with efi support ?
> 
> Thanks,
> Guenter


So that commit basically just says don't ignore what efi did.

The issue's thus likely efi.

Cc the maintainer. Philippe can you comment pls?

> ---
> Command line (tulip network interface):
> 
> CMDLINE="root=/dev/vda console=ttyAMA0"
> ROOTFS="rootfs.ext2"
> 
> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
>         -m 512 -cpu cortex-a57 -no-reboot \
>         -device tulip,netdev=net0 -netdev user,id=net0 \
>         -bios QEMU_EFI-aarch64.fd \
>         -snapshot \
>         -device virtio-blk-device,drive=d0 \
>         -drive file=${ROOTFS},if=none,id=d0,format=raw \
>         -nographic -serial stdio -monitor none \
>         --append "${CMDLINE}"
> 
> ---
> Boot tests with various devices known to work in qemu v5.2.
> 
> 		v5.2	v6.0	v6.0
> 		efi	non-efi	efi
> e1000		pass	pass	pass
> e1000-82544gc	pass	pass	pass
> e1000-82545em	pass	pass	pass
> e1000e		pass	pass	pass
> i82550		pass	pass	pass
> i82557a		pass	pass	pass
> i82557b		pass	pass	pass
> i82557c		pass	pass	pass
> i82558a		pass	pass	pass
> i82559b		pass	pass	pass
> i82559c		pass	pass	pass
> i82559er	pass	pass	pass
> i82562		pass	pass	pass
> i82801		pass	pass	pass
> ne2k_pci	pass	pass	fail	<--
> pcnet		pass	pass	pass
> rtl8139		pass	pass	pass
> tulip		pass	pass	fail	<--
> usb-net		pass	pass	pass
> virtio-net-device
> 		pass	pass	pass
> virtio-net-pci	pass	pass	pass
> virtio-net-pci-non-transitional
> 		pass	pass	pass
> 
> usb-xhci	pass	pass	pass
> usb-ehci	pass	pass	pass
> usb-ohci	pass	pass	pass
> usb-uas-xhci	pass	pass	pass
> virtio		pass	pass	pass
> virtio-blk-pci	pass	pass	pass
> virtio-blk-device
> 		pass	pass	pass
> nvme		pass	pass	pass
> sdhci		pass	pass	pass
> dc390		pass	pass	fail	<--
> am53c974	pass	pass	fail	<--
> lsi53c895ai	pass	pass	pass
> mptsas1068	pass	pass	pass
> lsi53c810	pass	pass	pass
> megasas		pass	pass	pass
> megasas-gen2	pass	pass	pass
> virtio-scsi-device
> 		pass	pass	pass
> virtio-scsi-pci	pass	pass	pass



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-25 22:14 ` Michael S. Tsirkin
@ 2021-07-25 22:56   ` Guenter Roeck
  2021-07-26  9:08     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 31+ messages in thread
From: Guenter Roeck @ 2021-07-25 22:56 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Igor Mammedov, Jiahui Cen, philmd, qemu-devel

On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>> Hi all,
>>
>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>> work. Analysis shows that PCI devices with IO ports do not instantiate
>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only affects
>> aarch64, not x86/x86_64.
>>
>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>> keep firmware resource map"). Since this commit, PCI device BAR
>> allocation has changed. Taking tulip as example, the kernel reports
>> the following PCI bar assignments when running qemu v5.2.
>>
>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x1000007f]
>>
>> With qemu v6.0, the assignment is reported as follows.
>>
>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>
>> and the controller does not instantiate. The problem disapears after
>> reverting commit 0cf8882fd0.
>>
>> Attached is a summary of test runs with various devices and qemu v5.2
>> as well as qemu v6.0, and the command line I use for efi boots.
>>
>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
>> command line to instantiate PCI devices with io ports, or are such devices
>> simply no longer supported if the system is booted with efi support ?
>>
>> Thanks,
>> Guenter
> 
> 
> So that commit basically just says don't ignore what efi did.
> 
> The issue's thus likely efi.
> 

I don't see the problem with efi boots on x86 and x86_64.
Any idea why that might be the case ?

Thanks,
Guenter

> Cc the maintainer. Philippe can you comment pls?
> 
>> ---
>> Command line (tulip network interface):
>>
>> CMDLINE="root=/dev/vda console=ttyAMA0"
>> ROOTFS="rootfs.ext2"
>>
>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
>>          -m 512 -cpu cortex-a57 -no-reboot \
>>          -device tulip,netdev=net0 -netdev user,id=net0 \
>>          -bios QEMU_EFI-aarch64.fd \
>>          -snapshot \
>>          -device virtio-blk-device,drive=d0 \
>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
>>          -nographic -serial stdio -monitor none \
>>          --append "${CMDLINE}"
>>
>> ---
>> Boot tests with various devices known to work in qemu v5.2.
>>
>> 		v5.2	v6.0	v6.0
>> 		efi	non-efi	efi
>> e1000		pass	pass	pass
>> e1000-82544gc	pass	pass	pass
>> e1000-82545em	pass	pass	pass
>> e1000e		pass	pass	pass
>> i82550		pass	pass	pass
>> i82557a		pass	pass	pass
>> i82557b		pass	pass	pass
>> i82557c		pass	pass	pass
>> i82558a		pass	pass	pass
>> i82559b		pass	pass	pass
>> i82559c		pass	pass	pass
>> i82559er	pass	pass	pass
>> i82562		pass	pass	pass
>> i82801		pass	pass	pass
>> ne2k_pci	pass	pass	fail	<--
>> pcnet		pass	pass	pass
>> rtl8139		pass	pass	pass
>> tulip		pass	pass	fail	<--
>> usb-net		pass	pass	pass
>> virtio-net-device
>> 		pass	pass	pass
>> virtio-net-pci	pass	pass	pass
>> virtio-net-pci-non-transitional
>> 		pass	pass	pass
>>
>> usb-xhci	pass	pass	pass
>> usb-ehci	pass	pass	pass
>> usb-ohci	pass	pass	pass
>> usb-uas-xhci	pass	pass	pass
>> virtio		pass	pass	pass
>> virtio-blk-pci	pass	pass	pass
>> virtio-blk-device
>> 		pass	pass	pass
>> nvme		pass	pass	pass
>> sdhci		pass	pass	pass
>> dc390		pass	pass	fail	<--
>> am53c974	pass	pass	fail	<--
>> lsi53c895ai	pass	pass	pass
>> mptsas1068	pass	pass	pass
>> lsi53c810	pass	pass	pass
>> megasas		pass	pass	pass
>> megasas-gen2	pass	pass	pass
>> virtio-scsi-device
>> 		pass	pass	pass
>> virtio-scsi-pci	pass	pass	pass
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-25 22:56   ` Guenter Roeck
@ 2021-07-26  9:08     ` Philippe Mathieu-Daudé
  2021-07-26 16:00       ` Ard Biesheuvel
  0 siblings, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-26  9:08 UTC (permalink / raw)
  To: Guenter Roeck, Michael S. Tsirkin, Ard Biesheuvel
  Cc: Igor Mammedov, Jiahui Cen, qemu-devel

On 7/26/21 12:56 AM, Guenter Roeck wrote:
> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>> Hi all,
>>>
>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>> affects
>>> aarch64, not x86/x86_64.
>>>
>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>> keep firmware resource map"). Since this commit, PCI device BAR
>>> allocation has changed. Taking tulip as example, the kernel reports
>>> the following PCI bar assignments when running qemu v5.2.
>>>
>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>> 0x10000000-0x1000007f]
>>>
>>> With qemu v6.0, the assignment is reported as follows.
>>>
>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>
>>> and the controller does not instantiate. The problem disapears after
>>> reverting commit 0cf8882fd0.
>>>
>>> Attached is a summary of test runs with various devices and qemu v5.2
>>> as well as qemu v6.0, and the command line I use for efi boots.
>>>
>>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
>>> command line to instantiate PCI devices with io ports, or are such
>>> devices
>>> simply no longer supported if the system is booted with efi support ?
>>>
>>> Thanks,
>>> Guenter
>>
>>
>> So that commit basically just says don't ignore what efi did.
>>
>> The issue's thus likely efi.
>>
> 
> I don't see the problem with efi boots on x86 and x86_64.
> Any idea why that might be the case ?
> 
> Thanks,
> Guenter
> 
>> Cc the maintainer. Philippe can you comment pls?

I'll have a look. Cc'ing Ard for EDK2/Aarch64.

>>
>>> ---
>>> Command line (tulip network interface):
>>>
>>> CMDLINE="root=/dev/vda console=ttyAMA0"
>>> ROOTFS="rootfs.ext2"
>>>
>>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
>>>          -m 512 -cpu cortex-a57 -no-reboot \
>>>          -device tulip,netdev=net0 -netdev user,id=net0 \
>>>          -bios QEMU_EFI-aarch64.fd \
>>>          -snapshot \
>>>          -device virtio-blk-device,drive=d0 \
>>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
>>>          -nographic -serial stdio -monitor none \
>>>          --append "${CMDLINE}"
>>>
>>> ---
>>> Boot tests with various devices known to work in qemu v5.2.
>>>
>>>         v5.2    v6.0    v6.0
>>>         efi    non-efi    efi
>>> e1000        pass    pass    pass
>>> e1000-82544gc    pass    pass    pass
>>> e1000-82545em    pass    pass    pass
>>> e1000e        pass    pass    pass
>>> i82550        pass    pass    pass
>>> i82557a        pass    pass    pass
>>> i82557b        pass    pass    pass
>>> i82557c        pass    pass    pass
>>> i82558a        pass    pass    pass
>>> i82559b        pass    pass    pass
>>> i82559c        pass    pass    pass
>>> i82559er    pass    pass    pass
>>> i82562        pass    pass    pass
>>> i82801        pass    pass    pass
>>> ne2k_pci    pass    pass    fail    <--
>>> pcnet        pass    pass    pass
>>> rtl8139        pass    pass    pass
>>> tulip        pass    pass    fail    <--
>>> usb-net        pass    pass    pass
>>> virtio-net-device
>>>         pass    pass    pass
>>> virtio-net-pci    pass    pass    pass
>>> virtio-net-pci-non-transitional
>>>         pass    pass    pass
>>>
>>> usb-xhci    pass    pass    pass
>>> usb-ehci    pass    pass    pass
>>> usb-ohci    pass    pass    pass
>>> usb-uas-xhci    pass    pass    pass
>>> virtio        pass    pass    pass
>>> virtio-blk-pci    pass    pass    pass
>>> virtio-blk-device
>>>         pass    pass    pass
>>> nvme        pass    pass    pass
>>> sdhci        pass    pass    pass
>>> dc390        pass    pass    fail    <--
>>> am53c974    pass    pass    fail    <--
>>> lsi53c895ai    pass    pass    pass
>>> mptsas1068    pass    pass    pass
>>> lsi53c810    pass    pass    pass
>>> megasas        pass    pass    pass
>>> megasas-gen2    pass    pass    pass
>>> virtio-scsi-device
>>>         pass    pass    pass
>>> virtio-scsi-pci    pass    pass    pass
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-26  9:08     ` Philippe Mathieu-Daudé
@ 2021-07-26 16:00       ` Ard Biesheuvel
  2021-07-26 21:16         ` Bjorn Helgaas
  2021-07-27  4:45         ` Michael S. Tsirkin
  0 siblings, 2 replies; 31+ messages in thread
From: Ard Biesheuvel @ 2021-07-26 16:00 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Bjorn Helgaas
  Cc: Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel, qemu-devel,
	Igor Mammedov, Guenter Roeck

(cc Bjorn)

On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> >> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> >>> Hi all,
> >>>
> >>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> >>> work. Analysis shows that PCI devices with IO ports do not instantiate
> >>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> >>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> >>> affects
> >>> aarch64, not x86/x86_64.
> >>>
> >>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> >>> keep firmware resource map"). Since this commit, PCI device BAR
> >>> allocation has changed. Taking tulip as example, the kernel reports
> >>> the following PCI bar assignments when running qemu v5.2.
> >>>
> >>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> >>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> >>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]

IIUC, these lines are read back from the BARs

> >>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> >>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> >>> 0x10000000-0x1000007f]
> >>>

... and this is the assignment created by the kernel.

> >>> With qemu v6.0, the assignment is reported as follows.
> >>>
> >>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> >>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> >>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> >>>

The problem here is that Linux, for legacy reasons, does not support
I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
rejected.

This might make sense on x86, where legacy I/O ports may exist, but on
other architectures, this makes no sense.


> >>> and the controller does not instantiate. The problem disapears after
> >>> reverting commit 0cf8882fd0.
> >>>
> >>> Attached is a summary of test runs with various devices and qemu v5.2
> >>> as well as qemu v6.0, and the command line I use for efi boots.
> >>>
> >>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> >>> command line to instantiate PCI devices with io ports, or are such
> >>> devices
> >>> simply no longer supported if the system is booted with efi support ?
> >>>
> >>> Thanks,
> >>> Guenter
> >>
> >>
> >> So that commit basically just says don't ignore what efi did.
> >>
> >> The issue's thus likely efi.
> >>
> >
> > I don't see the problem with efi boots on x86 and x86_64.
> > Any idea why that might be the case ?
> >
> > Thanks,
> > Guenter
> >
> >> Cc the maintainer. Philippe can you comment pls?
>
> I'll have a look. Cc'ing Ard for EDK2/Aarch64.
>

So a potential workaround would be to use a different I/O resource
window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
fix Linux instead.


> >>
> >>> ---
> >>> Command line (tulip network interface):
> >>>
> >>> CMDLINE="root=/dev/vda console=ttyAMA0"
> >>> ROOTFS="rootfs.ext2"
> >>>
> >>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> >>>          -m 512 -cpu cortex-a57 -no-reboot \
> >>>          -device tulip,netdev=net0 -netdev user,id=net0 \
> >>>          -bios QEMU_EFI-aarch64.fd \
> >>>          -snapshot \
> >>>          -device virtio-blk-device,drive=d0 \
> >>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
> >>>          -nographic -serial stdio -monitor none \
> >>>          --append "${CMDLINE}"
> >>>
> >>> ---
> >>> Boot tests with various devices known to work in qemu v5.2.
> >>>
> >>>         v5.2    v6.0    v6.0
> >>>         efi    non-efi    efi
> >>> e1000        pass    pass    pass
> >>> e1000-82544gc    pass    pass    pass
> >>> e1000-82545em    pass    pass    pass
> >>> e1000e        pass    pass    pass
> >>> i82550        pass    pass    pass
> >>> i82557a        pass    pass    pass
> >>> i82557b        pass    pass    pass
> >>> i82557c        pass    pass    pass
> >>> i82558a        pass    pass    pass
> >>> i82559b        pass    pass    pass
> >>> i82559c        pass    pass    pass
> >>> i82559er    pass    pass    pass
> >>> i82562        pass    pass    pass
> >>> i82801        pass    pass    pass
> >>> ne2k_pci    pass    pass    fail    <--
> >>> pcnet        pass    pass    pass
> >>> rtl8139        pass    pass    pass
> >>> tulip        pass    pass    fail    <--
> >>> usb-net        pass    pass    pass
> >>> virtio-net-device
> >>>         pass    pass    pass
> >>> virtio-net-pci    pass    pass    pass
> >>> virtio-net-pci-non-transitional
> >>>         pass    pass    pass
> >>>
> >>> usb-xhci    pass    pass    pass
> >>> usb-ehci    pass    pass    pass
> >>> usb-ohci    pass    pass    pass
> >>> usb-uas-xhci    pass    pass    pass
> >>> virtio        pass    pass    pass
> >>> virtio-blk-pci    pass    pass    pass
> >>> virtio-blk-device
> >>>         pass    pass    pass
> >>> nvme        pass    pass    pass
> >>> sdhci        pass    pass    pass
> >>> dc390        pass    pass    fail    <--
> >>> am53c974    pass    pass    fail    <--
> >>> lsi53c895ai    pass    pass    pass
> >>> mptsas1068    pass    pass    pass
> >>> lsi53c810    pass    pass    pass
> >>> megasas        pass    pass    pass
> >>> megasas-gen2    pass    pass    pass
> >>> virtio-scsi-device
> >>>         pass    pass    pass
> >>> virtio-scsi-pci    pass    pass    pass
> >>
> >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-26 16:00       ` Ard Biesheuvel
@ 2021-07-26 21:16         ` Bjorn Helgaas
  2021-07-26 21:31             ` Bjorn Helgaas
  2021-07-27  4:45         ` Michael S. Tsirkin
  1 sibling, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-26 21:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel, qemu-devel,
	Bjorn Helgaas, Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> (cc Bjorn)
> 
> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > >> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > >>> Hi all,
> > >>>
> > >>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > >>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > >>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > >>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > >>> affects
> > >>> aarch64, not x86/x86_64.
> > >>>
> > >>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > >>> keep firmware resource map"). Since this commit, PCI device BAR
> > >>> allocation has changed. Taking tulip as example, the kernel reports
> > >>> the following PCI bar assignments when running qemu v5.2.
> > >>>
> > >>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> 
> IIUC, these lines are read back from the BARs
> 
> > >>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > >>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > >>> 0x10000000-0x1000007f]
> > >>>
> 
> ... and this is the assignment created by the kernel.
> 
> > >>> With qemu v6.0, the assignment is reported as follows.
> > >>>
> > >>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> 
> The problem here is that Linux, for legacy reasons, does not support
> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> rejected.
> 
> This might make sense on x86, where legacy I/O ports may exist, but on
> other architectures, this makes no sense.

I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
really an arch question, so the arm64 folks would have to weigh in.

But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
when we assign or reassign resources to a BAR, and if firmware tells
us to preserve the assignments done by firmware, Linux shouldn't be
doing any assignment or reassignment.

Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
report any reassignment, so I assume Linux saw the
DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.

Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
I see that at least ne2k_pci_init_one() [2] seems to assume that.  And
tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
pci_iomap() [5], which fails if the resource starts at 0.

So pci_iomap() is probably already broken on the arches above that
allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
"!start" for *memory* BARs, e.g.,

diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
index 2d3eb1cb73b8..77455e702a3e 100644
--- a/lib/pci_iomap.c
+++ b/lib/pci_iomap.c
@@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
 	resource_size_t len = pci_resource_len(dev, bar);
 	unsigned long flags = pci_resource_flags(dev, bar);
 
-	if (len <= offset || !start)
+	if (flags & IORESOURCE_MEM && !start)
+		return NULL;
+	if (len <= offset)
 		return NULL;
 	len -= offset;
 	start += offset;


[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37

> > >>> and the controller does not instantiate. The problem disapears after
> > >>> reverting commit 0cf8882fd0.
> > >>>
> > >>> Attached is a summary of test runs with various devices and qemu v5.2
> > >>> as well as qemu v6.0, and the command line I use for efi boots.
> > >>>
> > >>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > >>> command line to instantiate PCI devices with io ports, or are such
> > >>> devices
> > >>> simply no longer supported if the system is booted with efi support ?
> > >>>
> > >>> Thanks,
> > >>> Guenter
> > >>
> > >>
> > >> So that commit basically just says don't ignore what efi did.
> > >>
> > >> The issue's thus likely efi.
> > >>
> > >
> > > I don't see the problem with efi boots on x86 and x86_64.
> > > Any idea why that might be the case ?
> > >
> > > Thanks,
> > > Guenter
> > >
> > >> Cc the maintainer. Philippe can you comment pls?
> >
> > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> 
> So a potential workaround would be to use a different I/O resource
> window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> fix Linux instead.
> 
> 
> > >>
> > >>> ---
> > >>> Command line (tulip network interface):
> > >>>
> > >>> CMDLINE="root=/dev/vda console=ttyAMA0"
> > >>> ROOTFS="rootfs.ext2"
> > >>>
> > >>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > >>>          -m 512 -cpu cortex-a57 -no-reboot \
> > >>>          -device tulip,netdev=net0 -netdev user,id=net0 \
> > >>>          -bios QEMU_EFI-aarch64.fd \
> > >>>          -snapshot \
> > >>>          -device virtio-blk-device,drive=d0 \
> > >>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > >>>          -nographic -serial stdio -monitor none \
> > >>>          --append "${CMDLINE}"
> > >>>
> > >>> ---
> > >>> Boot tests with various devices known to work in qemu v5.2.
> > >>>
> > >>>         v5.2    v6.0    v6.0
> > >>>         efi    non-efi    efi
> > >>> e1000        pass    pass    pass
> > >>> e1000-82544gc    pass    pass    pass
> > >>> e1000-82545em    pass    pass    pass
> > >>> e1000e        pass    pass    pass
> > >>> i82550        pass    pass    pass
> > >>> i82557a        pass    pass    pass
> > >>> i82557b        pass    pass    pass
> > >>> i82557c        pass    pass    pass
> > >>> i82558a        pass    pass    pass
> > >>> i82559b        pass    pass    pass
> > >>> i82559c        pass    pass    pass
> > >>> i82559er    pass    pass    pass
> > >>> i82562        pass    pass    pass
> > >>> i82801        pass    pass    pass
> > >>> ne2k_pci    pass    pass    fail    <--
> > >>> pcnet        pass    pass    pass
> > >>> rtl8139        pass    pass    pass
> > >>> tulip        pass    pass    fail    <--
> > >>> usb-net        pass    pass    pass
> > >>> virtio-net-device
> > >>>         pass    pass    pass
> > >>> virtio-net-pci    pass    pass    pass
> > >>> virtio-net-pci-non-transitional
> > >>>         pass    pass    pass
> > >>>
> > >>> usb-xhci    pass    pass    pass
> > >>> usb-ehci    pass    pass    pass
> > >>> usb-ohci    pass    pass    pass
> > >>> usb-uas-xhci    pass    pass    pass
> > >>> virtio        pass    pass    pass
> > >>> virtio-blk-pci    pass    pass    pass
> > >>> virtio-blk-device
> > >>>         pass    pass    pass
> > >>> nvme        pass    pass    pass
> > >>> sdhci        pass    pass    pass
> > >>> dc390        pass    pass    fail    <--
> > >>> am53c974    pass    pass    fail    <--
> > >>> lsi53c895ai    pass    pass    pass
> > >>> mptsas1068    pass    pass    pass
> > >>> lsi53c810    pass    pass    pass
> > >>> megasas        pass    pass    pass
> > >>> megasas-gen2    pass    pass    pass
> > >>> virtio-scsi-device
> > >>>         pass    pass    pass
> > >>> virtio-scsi-pci    pass    pass    pass
> > >>
> > >
> >
> 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-26 21:16         ` Bjorn Helgaas
@ 2021-07-26 21:31             ` Bjorn Helgaas
  0 siblings, 0 replies; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-26 21:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Philippe Mathieu-Daudé,
	Bjorn Helgaas, Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel,
	qemu-devel, Igor Mammedov, Guenter Roeck, linux-pci

[+cc linux-pci]

On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > >> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > >>> Hi all,
> > > >>>
> > > >>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > >>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > >>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > >>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > >>> affects
> > > >>> aarch64, not x86/x86_64.
> > > >>>
> > > >>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > >>> keep firmware resource map"). Since this commit, PCI device BAR
> > > >>> allocation has changed. Taking tulip as example, the kernel reports
> > > >>> the following PCI bar assignments when running qemu v5.2.
> > > >>>
> > > >>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > 
> > IIUC, these lines are read back from the BARs
> > 
> > > >>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > >>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > >>> 0x10000000-0x1000007f]
> > > >>>
> > 
> > ... and this is the assignment created by the kernel.
> > 
> > > >>> With qemu v6.0, the assignment is reported as follows.
> > > >>>
> > > >>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > 
> > The problem here is that Linux, for legacy reasons, does not support
> > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > rejected.
> > 
> > This might make sense on x86, where legacy I/O ports may exist, but on
> > other architectures, this makes no sense.
> 
> I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
> arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
> changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
> really an arch question, so the arm64 folks would have to weigh in.
> 
> But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
> when we assign or reassign resources to a BAR, and if firmware tells
> us to preserve the assignments done by firmware, Linux shouldn't be
> doing any assignment or reassignment.
> 
> Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
> report any reassignment, so I assume Linux saw the
> DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
> 
> Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
> I see that at least ne2k_pci_init_one() [2] seems to assume that.  And
> tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
> pci_iomap() [5], which fails if the resource starts at 0.
> 
> So pci_iomap() is probably already broken on the arches above that
> allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
> "!start" for *memory* BARs, e.g.,
> 
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index 2d3eb1cb73b8..77455e702a3e 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>  	resource_size_t len = pci_resource_len(dev, bar);
>  	unsigned long flags = pci_resource_flags(dev, bar);
>  
> -	if (len <= offset || !start)
> +	if (flags & IORESOURCE_MEM && !start)
> +		return NULL;
> +	if (len <= offset)
>  		return NULL;
>  	len -= offset;
>  	start += offset;
> 
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
> 
> > > >>> and the controller does not instantiate. The problem disapears after
> > > >>> reverting commit 0cf8882fd0.
> > > >>>
> > > >>> Attached is a summary of test runs with various devices and qemu v5.2
> > > >>> as well as qemu v6.0, and the command line I use for efi boots.
> > > >>>
> > > >>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > > >>> command line to instantiate PCI devices with io ports, or are such
> > > >>> devices
> > > >>> simply no longer supported if the system is booted with efi support ?
> > > >>>
> > > >>> Thanks,
> > > >>> Guenter
> > > >>
> > > >>
> > > >> So that commit basically just says don't ignore what efi did.
> > > >>
> > > >> The issue's thus likely efi.
> > > >>
> > > >
> > > > I don't see the problem with efi boots on x86 and x86_64.
> > > > Any idea why that might be the case ?
> > > >
> > > > Thanks,
> > > > Guenter
> > > >
> > > >> Cc the maintainer. Philippe can you comment pls?
> > >
> > > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> > 
> > So a potential workaround would be to use a different I/O resource
> > window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> > fix Linux instead.
> > 
> > 
> > > >>
> > > >>> ---
> > > >>> Command line (tulip network interface):
> > > >>>
> > > >>> CMDLINE="root=/dev/vda console=ttyAMA0"
> > > >>> ROOTFS="rootfs.ext2"
> > > >>>
> > > >>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > > >>>          -m 512 -cpu cortex-a57 -no-reboot \
> > > >>>          -device tulip,netdev=net0 -netdev user,id=net0 \
> > > >>>          -bios QEMU_EFI-aarch64.fd \
> > > >>>          -snapshot \
> > > >>>          -device virtio-blk-device,drive=d0 \
> > > >>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > > >>>          -nographic -serial stdio -monitor none \
> > > >>>          --append "${CMDLINE}"
> > > >>>
> > > >>> ---
> > > >>> Boot tests with various devices known to work in qemu v5.2.
> > > >>>
> > > >>>         v5.2    v6.0    v6.0
> > > >>>         efi    non-efi    efi
> > > >>> e1000        pass    pass    pass
> > > >>> e1000-82544gc    pass    pass    pass
> > > >>> e1000-82545em    pass    pass    pass
> > > >>> e1000e        pass    pass    pass
> > > >>> i82550        pass    pass    pass
> > > >>> i82557a        pass    pass    pass
> > > >>> i82557b        pass    pass    pass
> > > >>> i82557c        pass    pass    pass
> > > >>> i82558a        pass    pass    pass
> > > >>> i82559b        pass    pass    pass
> > > >>> i82559c        pass    pass    pass
> > > >>> i82559er    pass    pass    pass
> > > >>> i82562        pass    pass    pass
> > > >>> i82801        pass    pass    pass
> > > >>> ne2k_pci    pass    pass    fail    <--
> > > >>> pcnet        pass    pass    pass
> > > >>> rtl8139        pass    pass    pass
> > > >>> tulip        pass    pass    fail    <--
> > > >>> usb-net        pass    pass    pass
> > > >>> virtio-net-device
> > > >>>         pass    pass    pass
> > > >>> virtio-net-pci    pass    pass    pass
> > > >>> virtio-net-pci-non-transitional
> > > >>>         pass    pass    pass
> > > >>>
> > > >>> usb-xhci    pass    pass    pass
> > > >>> usb-ehci    pass    pass    pass
> > > >>> usb-ohci    pass    pass    pass
> > > >>> usb-uas-xhci    pass    pass    pass
> > > >>> virtio        pass    pass    pass
> > > >>> virtio-blk-pci    pass    pass    pass
> > > >>> virtio-blk-device
> > > >>>         pass    pass    pass
> > > >>> nvme        pass    pass    pass
> > > >>> sdhci        pass    pass    pass
> > > >>> dc390        pass    pass    fail    <--
> > > >>> am53c974    pass    pass    fail    <--
> > > >>> lsi53c895ai    pass    pass    pass
> > > >>> mptsas1068    pass    pass    pass
> > > >>> lsi53c810    pass    pass    pass
> > > >>> megasas        pass    pass    pass
> > > >>> megasas-gen2    pass    pass    pass
> > > >>> virtio-scsi-device
> > > >>>         pass    pass    pass
> > > >>> virtio-scsi-pci    pass    pass    pass
> > > >>
> > > >
> > >
> > 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
@ 2021-07-26 21:31             ` Bjorn Helgaas
  0 siblings, 0 replies; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-26 21:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Michael S. Tsirkin, linux-pci, Ard Biesheuvel,
	qemu-devel, Bjorn Helgaas, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Guenter Roeck

[+cc linux-pci]

On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > >> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > >>> Hi all,
> > > >>>
> > > >>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > >>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > >>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > >>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > >>> affects
> > > >>> aarch64, not x86/x86_64.
> > > >>>
> > > >>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > >>> keep firmware resource map"). Since this commit, PCI device BAR
> > > >>> allocation has changed. Taking tulip as example, the kernel reports
> > > >>> the following PCI bar assignments when running qemu v5.2.
> > > >>>
> > > >>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > 
> > IIUC, these lines are read back from the BARs
> > 
> > > >>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > >>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > >>> 0x10000000-0x1000007f]
> > > >>>
> > 
> > ... and this is the assignment created by the kernel.
> > 
> > > >>> With qemu v6.0, the assignment is reported as follows.
> > > >>>
> > > >>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > 
> > The problem here is that Linux, for legacy reasons, does not support
> > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > rejected.
> > 
> > This might make sense on x86, where legacy I/O ports may exist, but on
> > other architectures, this makes no sense.
> 
> I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
> arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
> changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
> really an arch question, so the arm64 folks would have to weigh in.
> 
> But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
> when we assign or reassign resources to a BAR, and if firmware tells
> us to preserve the assignments done by firmware, Linux shouldn't be
> doing any assignment or reassignment.
> 
> Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
> report any reassignment, so I assume Linux saw the
> DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
> 
> Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
> I see that at least ne2k_pci_init_one() [2] seems to assume that.  And
> tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
> pci_iomap() [5], which fails if the resource starts at 0.
> 
> So pci_iomap() is probably already broken on the arches above that
> allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
> "!start" for *memory* BARs, e.g.,
> 
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index 2d3eb1cb73b8..77455e702a3e 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>  	resource_size_t len = pci_resource_len(dev, bar);
>  	unsigned long flags = pci_resource_flags(dev, bar);
>  
> -	if (len <= offset || !start)
> +	if (flags & IORESOURCE_MEM && !start)
> +		return NULL;
> +	if (len <= offset)
>  		return NULL;
>  	len -= offset;
>  	start += offset;
> 
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
> 
> > > >>> and the controller does not instantiate. The problem disapears after
> > > >>> reverting commit 0cf8882fd0.
> > > >>>
> > > >>> Attached is a summary of test runs with various devices and qemu v5.2
> > > >>> as well as qemu v6.0, and the command line I use for efi boots.
> > > >>>
> > > >>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > > >>> command line to instantiate PCI devices with io ports, or are such
> > > >>> devices
> > > >>> simply no longer supported if the system is booted with efi support ?
> > > >>>
> > > >>> Thanks,
> > > >>> Guenter
> > > >>
> > > >>
> > > >> So that commit basically just says don't ignore what efi did.
> > > >>
> > > >> The issue's thus likely efi.
> > > >>
> > > >
> > > > I don't see the problem with efi boots on x86 and x86_64.
> > > > Any idea why that might be the case ?
> > > >
> > > > Thanks,
> > > > Guenter
> > > >
> > > >> Cc the maintainer. Philippe can you comment pls?
> > >
> > > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> > 
> > So a potential workaround would be to use a different I/O resource
> > window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> > fix Linux instead.
> > 
> > 
> > > >>
> > > >>> ---
> > > >>> Command line (tulip network interface):
> > > >>>
> > > >>> CMDLINE="root=/dev/vda console=ttyAMA0"
> > > >>> ROOTFS="rootfs.ext2"
> > > >>>
> > > >>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > > >>>          -m 512 -cpu cortex-a57 -no-reboot \
> > > >>>          -device tulip,netdev=net0 -netdev user,id=net0 \
> > > >>>          -bios QEMU_EFI-aarch64.fd \
> > > >>>          -snapshot \
> > > >>>          -device virtio-blk-device,drive=d0 \
> > > >>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > > >>>          -nographic -serial stdio -monitor none \
> > > >>>          --append "${CMDLINE}"
> > > >>>
> > > >>> ---
> > > >>> Boot tests with various devices known to work in qemu v5.2.
> > > >>>
> > > >>>         v5.2    v6.0    v6.0
> > > >>>         efi    non-efi    efi
> > > >>> e1000        pass    pass    pass
> > > >>> e1000-82544gc    pass    pass    pass
> > > >>> e1000-82545em    pass    pass    pass
> > > >>> e1000e        pass    pass    pass
> > > >>> i82550        pass    pass    pass
> > > >>> i82557a        pass    pass    pass
> > > >>> i82557b        pass    pass    pass
> > > >>> i82557c        pass    pass    pass
> > > >>> i82558a        pass    pass    pass
> > > >>> i82559b        pass    pass    pass
> > > >>> i82559c        pass    pass    pass
> > > >>> i82559er    pass    pass    pass
> > > >>> i82562        pass    pass    pass
> > > >>> i82801        pass    pass    pass
> > > >>> ne2k_pci    pass    pass    fail    <--
> > > >>> pcnet        pass    pass    pass
> > > >>> rtl8139        pass    pass    pass
> > > >>> tulip        pass    pass    fail    <--
> > > >>> usb-net        pass    pass    pass
> > > >>> virtio-net-device
> > > >>>         pass    pass    pass
> > > >>> virtio-net-pci    pass    pass    pass
> > > >>> virtio-net-pci-non-transitional
> > > >>>         pass    pass    pass
> > > >>>
> > > >>> usb-xhci    pass    pass    pass
> > > >>> usb-ehci    pass    pass    pass
> > > >>> usb-ohci    pass    pass    pass
> > > >>> usb-uas-xhci    pass    pass    pass
> > > >>> virtio        pass    pass    pass
> > > >>> virtio-blk-pci    pass    pass    pass
> > > >>> virtio-blk-device
> > > >>>         pass    pass    pass
> > > >>> nvme        pass    pass    pass
> > > >>> sdhci        pass    pass    pass
> > > >>> dc390        pass    pass    fail    <--
> > > >>> am53c974    pass    pass    fail    <--
> > > >>> lsi53c895ai    pass    pass    pass
> > > >>> mptsas1068    pass    pass    pass
> > > >>> lsi53c810    pass    pass    pass
> > > >>> megasas        pass    pass    pass
> > > >>> megasas-gen2    pass    pass    pass
> > > >>> virtio-scsi-device
> > > >>>         pass    pass    pass
> > > >>> virtio-scsi-pci    pass    pass    pass
> > > >>
> > > >
> > >
> > 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-26 21:31             ` Bjorn Helgaas
@ 2021-07-27  4:22               ` Guenter Roeck
  -1 siblings, 0 replies; 31+ messages in thread
From: Guenter Roeck @ 2021-07-27  4:22 UTC (permalink / raw)
  To: Bjorn Helgaas, Ard Biesheuvel
  Cc: Philippe Mathieu-Daudé,
	Bjorn Helgaas, Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel,
	qemu-devel, Igor Mammedov, linux-pci

On 7/26/21 2:31 PM, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
>> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>>> affects
>>>>>>> aarch64, not x86/x86_64.
>>>>>>>
>>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>>
>>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>
>>> IIUC, these lines are read back from the BARs
>>>
>>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>>> 0x10000000-0x1000007f]
>>>>>>>
>>>
>>> ... and this is the assignment created by the kernel.
>>>
>>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>>
>>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>
>>> The problem here is that Linux, for legacy reasons, does not support
>>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>>> rejected.
>>>
>>> This might make sense on x86, where legacy I/O ports may exist, but on
>>> other architectures, this makes no sense.
>>
>> I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
>> arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
>> changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
>> really an arch question, so the arm64 folks would have to weigh in.
>>
>> But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
>> when we assign or reassign resources to a BAR, and if firmware tells
>> us to preserve the assignments done by firmware, Linux shouldn't be
>> doing any assignment or reassignment.
>>
>> Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
>> report any reassignment, so I assume Linux saw the
>> DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
>>
>> Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
>> I see that at least ne2k_pci_init_one() [2] seems to assume that.  And

Correct, and ne2k_pci is known to already fail on architectures where the
IO address range starts at 0, such as riscv. Not that it helps to fix the
code - doing so only results in a crash elsewhere when running a riscv
emulation (when executing outsl, suggesting that there may be a problem
with that emulation or its use). But that is a different problem.

>> tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
>> pci_iomap() [5], which fails if the resource starts at 0.
>>
>> So pci_iomap() is probably already broken on the arches above that
>> allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
>> "!start" for *memory* BARs, e.g.,
>>
>> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
>> index 2d3eb1cb73b8..77455e702a3e 100644
>> --- a/lib/pci_iomap.c
>> +++ b/lib/pci_iomap.c
>> @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>>   	resource_size_t len = pci_resource_len(dev, bar);
>>   	unsigned long flags = pci_resource_flags(dev, bar);
>>   
>> -	if (len <= offset || !start)
>> +	if (flags & IORESOURCE_MEM && !start)
>> +		return NULL;

I am far out of my league here, but what is the purpose of the !start
check given the PCIBIOS_MIN_MEM define which can also be 0 ? Shouldn't
the check be against PCIBIOS_MIN_MEM and PCIBIOS_MIN_IO ?

But, anyway, the above change fixes the problem for 'tulip', though
obviously not for 'ne2k_pci'. 'ne2k_pci' starts working if I remove
the "!ioaddr" check in ne2k_pci_init_one().

Thanks,
Guenter

>> +	if (len <= offset)
>>   		return NULL;
>>   	len -= offset;
>>   	start += offset;
>>
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
>> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
>> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
>>
>>>>>>> and the controller does not instantiate. The problem disapears after
>>>>>>> reverting commit 0cf8882fd0.
>>>>>>>
>>>>>>> Attached is a summary of test runs with various devices and qemu v5.2
>>>>>>> as well as qemu v6.0, and the command line I use for efi boots.
>>>>>>>
>>>>>>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
>>>>>>> command line to instantiate PCI devices with io ports, or are such
>>>>>>> devices
>>>>>>> simply no longer supported if the system is booted with efi support ?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Guenter
>>>>>>
>>>>>>
>>>>>> So that commit basically just says don't ignore what efi did.
>>>>>>
>>>>>> The issue's thus likely efi.
>>>>>>
>>>>>
>>>>> I don't see the problem with efi boots on x86 and x86_64.
>>>>> Any idea why that might be the case ?
>>>>>
>>>>> Thanks,
>>>>> Guenter
>>>>>
>>>>>> Cc the maintainer. Philippe can you comment pls?
>>>>
>>>> I'll have a look. Cc'ing Ard for EDK2/Aarch64.
>>>
>>> So a potential workaround would be to use a different I/O resource
>>> window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
>>> fix Linux instead.
>>>
>>>
>>>>>>
>>>>>>> ---
>>>>>>> Command line (tulip network interface):
>>>>>>>
>>>>>>> CMDLINE="root=/dev/vda console=ttyAMA0"
>>>>>>> ROOTFS="rootfs.ext2"
>>>>>>>
>>>>>>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
>>>>>>>           -m 512 -cpu cortex-a57 -no-reboot \
>>>>>>>           -device tulip,netdev=net0 -netdev user,id=net0 \
>>>>>>>           -bios QEMU_EFI-aarch64.fd \
>>>>>>>           -snapshot \
>>>>>>>           -device virtio-blk-device,drive=d0 \
>>>>>>>           -drive file=${ROOTFS},if=none,id=d0,format=raw \
>>>>>>>           -nographic -serial stdio -monitor none \
>>>>>>>           --append "${CMDLINE}"
>>>>>>>
>>>>>>> ---
>>>>>>> Boot tests with various devices known to work in qemu v5.2.
>>>>>>>
>>>>>>>          v5.2    v6.0    v6.0
>>>>>>>          efi    non-efi    efi
>>>>>>> e1000        pass    pass    pass
>>>>>>> e1000-82544gc    pass    pass    pass
>>>>>>> e1000-82545em    pass    pass    pass
>>>>>>> e1000e        pass    pass    pass
>>>>>>> i82550        pass    pass    pass
>>>>>>> i82557a        pass    pass    pass
>>>>>>> i82557b        pass    pass    pass
>>>>>>> i82557c        pass    pass    pass
>>>>>>> i82558a        pass    pass    pass
>>>>>>> i82559b        pass    pass    pass
>>>>>>> i82559c        pass    pass    pass
>>>>>>> i82559er    pass    pass    pass
>>>>>>> i82562        pass    pass    pass
>>>>>>> i82801        pass    pass    pass
>>>>>>> ne2k_pci    pass    pass    fail    <--
>>>>>>> pcnet        pass    pass    pass
>>>>>>> rtl8139        pass    pass    pass
>>>>>>> tulip        pass    pass    fail    <--
>>>>>>> usb-net        pass    pass    pass
>>>>>>> virtio-net-device
>>>>>>>          pass    pass    pass
>>>>>>> virtio-net-pci    pass    pass    pass
>>>>>>> virtio-net-pci-non-transitional
>>>>>>>          pass    pass    pass
>>>>>>>
>>>>>>> usb-xhci    pass    pass    pass
>>>>>>> usb-ehci    pass    pass    pass
>>>>>>> usb-ohci    pass    pass    pass
>>>>>>> usb-uas-xhci    pass    pass    pass
>>>>>>> virtio        pass    pass    pass
>>>>>>> virtio-blk-pci    pass    pass    pass
>>>>>>> virtio-blk-device
>>>>>>>          pass    pass    pass
>>>>>>> nvme        pass    pass    pass
>>>>>>> sdhci        pass    pass    pass
>>>>>>> dc390        pass    pass    fail    <--
>>>>>>> am53c974    pass    pass    fail    <--
>>>>>>> lsi53c895ai    pass    pass    pass
>>>>>>> mptsas1068    pass    pass    pass
>>>>>>> lsi53c810    pass    pass    pass
>>>>>>> megasas        pass    pass    pass
>>>>>>> megasas-gen2    pass    pass    pass
>>>>>>> virtio-scsi-device
>>>>>>>          pass    pass    pass
>>>>>>> virtio-scsi-pci    pass    pass    pass
>>>>>>
>>>>>
>>>>
>>>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
@ 2021-07-27  4:22               ` Guenter Roeck
  0 siblings, 0 replies; 31+ messages in thread
From: Guenter Roeck @ 2021-07-27  4:22 UTC (permalink / raw)
  To: Bjorn Helgaas, Ard Biesheuvel
  Cc: Jiahui Cen, Michael S. Tsirkin, linux-pci, Ard Biesheuvel,
	qemu-devel, Bjorn Helgaas, Igor Mammedov,
	Philippe Mathieu-Daudé

On 7/26/21 2:31 PM, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
>> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>>> affects
>>>>>>> aarch64, not x86/x86_64.
>>>>>>>
>>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>>
>>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>
>>> IIUC, these lines are read back from the BARs
>>>
>>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>>> 0x10000000-0x1000007f]
>>>>>>>
>>>
>>> ... and this is the assignment created by the kernel.
>>>
>>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>>
>>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>
>>> The problem here is that Linux, for legacy reasons, does not support
>>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>>> rejected.
>>>
>>> This might make sense on x86, where legacy I/O ports may exist, but on
>>> other architectures, this makes no sense.
>>
>> I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
>> arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
>> changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
>> really an arch question, so the arm64 folks would have to weigh in.
>>
>> But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
>> when we assign or reassign resources to a BAR, and if firmware tells
>> us to preserve the assignments done by firmware, Linux shouldn't be
>> doing any assignment or reassignment.
>>
>> Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
>> report any reassignment, so I assume Linux saw the
>> DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
>>
>> Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
>> I see that at least ne2k_pci_init_one() [2] seems to assume that.  And

Correct, and ne2k_pci is known to already fail on architectures where the
IO address range starts at 0, such as riscv. Not that it helps to fix the
code - doing so only results in a crash elsewhere when running a riscv
emulation (when executing outsl, suggesting that there may be a problem
with that emulation or its use). But that is a different problem.

>> tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
>> pci_iomap() [5], which fails if the resource starts at 0.
>>
>> So pci_iomap() is probably already broken on the arches above that
>> allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
>> "!start" for *memory* BARs, e.g.,
>>
>> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
>> index 2d3eb1cb73b8..77455e702a3e 100644
>> --- a/lib/pci_iomap.c
>> +++ b/lib/pci_iomap.c
>> @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>>   	resource_size_t len = pci_resource_len(dev, bar);
>>   	unsigned long flags = pci_resource_flags(dev, bar);
>>   
>> -	if (len <= offset || !start)
>> +	if (flags & IORESOURCE_MEM && !start)
>> +		return NULL;

I am far out of my league here, but what is the purpose of the !start
check given the PCIBIOS_MIN_MEM define which can also be 0 ? Shouldn't
the check be against PCIBIOS_MIN_MEM and PCIBIOS_MIN_IO ?

But, anyway, the above change fixes the problem for 'tulip', though
obviously not for 'ne2k_pci'. 'ne2k_pci' starts working if I remove
the "!ioaddr" check in ne2k_pci_init_one().

Thanks,
Guenter

>> +	if (len <= offset)
>>   		return NULL;
>>   	len -= offset;
>>   	start += offset;
>>
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
>> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
>> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
>>
>>>>>>> and the controller does not instantiate. The problem disapears after
>>>>>>> reverting commit 0cf8882fd0.
>>>>>>>
>>>>>>> Attached is a summary of test runs with various devices and qemu v5.2
>>>>>>> as well as qemu v6.0, and the command line I use for efi boots.
>>>>>>>
>>>>>>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
>>>>>>> command line to instantiate PCI devices with io ports, or are such
>>>>>>> devices
>>>>>>> simply no longer supported if the system is booted with efi support ?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Guenter
>>>>>>
>>>>>>
>>>>>> So that commit basically just says don't ignore what efi did.
>>>>>>
>>>>>> The issue's thus likely efi.
>>>>>>
>>>>>
>>>>> I don't see the problem with efi boots on x86 and x86_64.
>>>>> Any idea why that might be the case ?
>>>>>
>>>>> Thanks,
>>>>> Guenter
>>>>>
>>>>>> Cc the maintainer. Philippe can you comment pls?
>>>>
>>>> I'll have a look. Cc'ing Ard for EDK2/Aarch64.
>>>
>>> So a potential workaround would be to use a different I/O resource
>>> window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
>>> fix Linux instead.
>>>
>>>
>>>>>>
>>>>>>> ---
>>>>>>> Command line (tulip network interface):
>>>>>>>
>>>>>>> CMDLINE="root=/dev/vda console=ttyAMA0"
>>>>>>> ROOTFS="rootfs.ext2"
>>>>>>>
>>>>>>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
>>>>>>>           -m 512 -cpu cortex-a57 -no-reboot \
>>>>>>>           -device tulip,netdev=net0 -netdev user,id=net0 \
>>>>>>>           -bios QEMU_EFI-aarch64.fd \
>>>>>>>           -snapshot \
>>>>>>>           -device virtio-blk-device,drive=d0 \
>>>>>>>           -drive file=${ROOTFS},if=none,id=d0,format=raw \
>>>>>>>           -nographic -serial stdio -monitor none \
>>>>>>>           --append "${CMDLINE}"
>>>>>>>
>>>>>>> ---
>>>>>>> Boot tests with various devices known to work in qemu v5.2.
>>>>>>>
>>>>>>>          v5.2    v6.0    v6.0
>>>>>>>          efi    non-efi    efi
>>>>>>> e1000        pass    pass    pass
>>>>>>> e1000-82544gc    pass    pass    pass
>>>>>>> e1000-82545em    pass    pass    pass
>>>>>>> e1000e        pass    pass    pass
>>>>>>> i82550        pass    pass    pass
>>>>>>> i82557a        pass    pass    pass
>>>>>>> i82557b        pass    pass    pass
>>>>>>> i82557c        pass    pass    pass
>>>>>>> i82558a        pass    pass    pass
>>>>>>> i82559b        pass    pass    pass
>>>>>>> i82559c        pass    pass    pass
>>>>>>> i82559er    pass    pass    pass
>>>>>>> i82562        pass    pass    pass
>>>>>>> i82801        pass    pass    pass
>>>>>>> ne2k_pci    pass    pass    fail    <--
>>>>>>> pcnet        pass    pass    pass
>>>>>>> rtl8139        pass    pass    pass
>>>>>>> tulip        pass    pass    fail    <--
>>>>>>> usb-net        pass    pass    pass
>>>>>>> virtio-net-device
>>>>>>>          pass    pass    pass
>>>>>>> virtio-net-pci    pass    pass    pass
>>>>>>> virtio-net-pci-non-transitional
>>>>>>>          pass    pass    pass
>>>>>>>
>>>>>>> usb-xhci    pass    pass    pass
>>>>>>> usb-ehci    pass    pass    pass
>>>>>>> usb-ohci    pass    pass    pass
>>>>>>> usb-uas-xhci    pass    pass    pass
>>>>>>> virtio        pass    pass    pass
>>>>>>> virtio-blk-pci    pass    pass    pass
>>>>>>> virtio-blk-device
>>>>>>>          pass    pass    pass
>>>>>>> nvme        pass    pass    pass
>>>>>>> sdhci        pass    pass    pass
>>>>>>> dc390        pass    pass    fail    <--
>>>>>>> am53c974    pass    pass    fail    <--
>>>>>>> lsi53c895ai    pass    pass    pass
>>>>>>> mptsas1068    pass    pass    pass
>>>>>>> lsi53c810    pass    pass    pass
>>>>>>> megasas        pass    pass    pass
>>>>>>> megasas-gen2    pass    pass    pass
>>>>>>> virtio-scsi-device
>>>>>>>          pass    pass    pass
>>>>>>> virtio-scsi-pci    pass    pass    pass
>>>>>>
>>>>>
>>>>
>>>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-26 16:00       ` Ard Biesheuvel
  2021-07-26 21:16         ` Bjorn Helgaas
@ 2021-07-27  4:45         ` Michael S. Tsirkin
  2021-07-27  5:12           ` Guenter Roeck
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-27  4:45 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> (cc Bjorn)
> 
> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> >
> > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > >> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > >>> Hi all,
> > >>>
> > >>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > >>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > >>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > >>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > >>> affects
> > >>> aarch64, not x86/x86_64.
> > >>>
> > >>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > >>> keep firmware resource map"). Since this commit, PCI device BAR
> > >>> allocation has changed. Taking tulip as example, the kernel reports
> > >>> the following PCI bar assignments when running qemu v5.2.
> > >>>
> > >>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> 
> IIUC, these lines are read back from the BARs
> 
> > >>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > >>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > >>> 0x10000000-0x1000007f]
> > >>>
> 
> ... and this is the assignment created by the kernel.
> 
> > >>> With qemu v6.0, the assignment is reported as follows.
> > >>>
> > >>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>>
> 
> The problem here is that Linux, for legacy reasons, does not support
> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> rejected.
> 
> This might make sense on x86, where legacy I/O ports may exist, but on
> other architectures, this makes no sense.


Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
that trip up existing guests, right?

> 
> > >>> and the controller does not instantiate. The problem disapears after
> > >>> reverting commit 0cf8882fd0.
> > >>>
> > >>> Attached is a summary of test runs with various devices and qemu v5.2
> > >>> as well as qemu v6.0, and the command line I use for efi boots.
> > >>>
> > >>> Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > >>> command line to instantiate PCI devices with io ports, or are such
> > >>> devices
> > >>> simply no longer supported if the system is booted with efi support ?
> > >>>
> > >>> Thanks,
> > >>> Guenter
> > >>
> > >>
> > >> So that commit basically just says don't ignore what efi did.
> > >>
> > >> The issue's thus likely efi.
> > >>
> > >
> > > I don't see the problem with efi boots on x86 and x86_64.
> > > Any idea why that might be the case ?
> > >
> > > Thanks,
> > > Guenter
> > >
> > >> Cc the maintainer. Philippe can you comment pls?
> >
> > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> >
> 
> So a potential workaround would be to use a different I/O resource
> window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> fix Linux instead.
> 
> 
> > >>
> > >>> ---
> > >>> Command line (tulip network interface):
> > >>>
> > >>> CMDLINE="root=/dev/vda console=ttyAMA0"
> > >>> ROOTFS="rootfs.ext2"
> > >>>
> > >>> qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > >>>          -m 512 -cpu cortex-a57 -no-reboot \
> > >>>          -device tulip,netdev=net0 -netdev user,id=net0 \
> > >>>          -bios QEMU_EFI-aarch64.fd \
> > >>>          -snapshot \
> > >>>          -device virtio-blk-device,drive=d0 \
> > >>>          -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > >>>          -nographic -serial stdio -monitor none \
> > >>>          --append "${CMDLINE}"
> > >>>
> > >>> ---
> > >>> Boot tests with various devices known to work in qemu v5.2.
> > >>>
> > >>>         v5.2    v6.0    v6.0
> > >>>         efi    non-efi    efi
> > >>> e1000        pass    pass    pass
> > >>> e1000-82544gc    pass    pass    pass
> > >>> e1000-82545em    pass    pass    pass
> > >>> e1000e        pass    pass    pass
> > >>> i82550        pass    pass    pass
> > >>> i82557a        pass    pass    pass
> > >>> i82557b        pass    pass    pass
> > >>> i82557c        pass    pass    pass
> > >>> i82558a        pass    pass    pass
> > >>> i82559b        pass    pass    pass
> > >>> i82559c        pass    pass    pass
> > >>> i82559er    pass    pass    pass
> > >>> i82562        pass    pass    pass
> > >>> i82801        pass    pass    pass
> > >>> ne2k_pci    pass    pass    fail    <--
> > >>> pcnet        pass    pass    pass
> > >>> rtl8139        pass    pass    pass
> > >>> tulip        pass    pass    fail    <--
> > >>> usb-net        pass    pass    pass
> > >>> virtio-net-device
> > >>>         pass    pass    pass
> > >>> virtio-net-pci    pass    pass    pass
> > >>> virtio-net-pci-non-transitional
> > >>>         pass    pass    pass
> > >>>
> > >>> usb-xhci    pass    pass    pass
> > >>> usb-ehci    pass    pass    pass
> > >>> usb-ohci    pass    pass    pass
> > >>> usb-uas-xhci    pass    pass    pass
> > >>> virtio        pass    pass    pass
> > >>> virtio-blk-pci    pass    pass    pass
> > >>> virtio-blk-device
> > >>>         pass    pass    pass
> > >>> nvme        pass    pass    pass
> > >>> sdhci        pass    pass    pass
> > >>> dc390        pass    pass    fail    <--
> > >>> am53c974    pass    pass    fail    <--
> > >>> lsi53c895ai    pass    pass    pass
> > >>> mptsas1068    pass    pass    pass
> > >>> lsi53c810    pass    pass    pass
> > >>> megasas        pass    pass    pass
> > >>> megasas-gen2    pass    pass    pass
> > >>> virtio-scsi-device
> > >>>         pass    pass    pass
> > >>> virtio-scsi-pci    pass    pass    pass
> > >>
> > >
> >



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  4:45         ` Michael S. Tsirkin
@ 2021-07-27  5:12           ` Guenter Roeck
  2021-07-27  7:04             ` Ard Biesheuvel
  2021-07-27  9:01             ` Michael S. Tsirkin
  0 siblings, 2 replies; 31+ messages in thread
From: Guenter Roeck @ 2021-07-27  5:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé

On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>> (cc Bjorn)
>>
>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>
>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>> affects
>>>>>> aarch64, not x86/x86_64.
>>>>>>
>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>
>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>
>> IIUC, these lines are read back from the BARs
>>
>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>> 0x10000000-0x1000007f]
>>>>>>
>>
>> ... and this is the assignment created by the kernel.
>>
>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>
>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>>
>>
>> The problem here is that Linux, for legacy reasons, does not support
>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>> rejected.
>>
>> This might make sense on x86, where legacy I/O ports may exist, but on
>> other architectures, this makes no sense.
> 
> 
> Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> that trip up existing guests, right?
> 

I think it is difficult to draw a line. Sure, maybe EFI should not create
such mappings, but then maybe qemu should not suddenly start to enforce
those mappings for existing guests either.

For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
qemu. That solves my immediate problem, giving us time to find a solution
that is acceptable for everyone. After all, it doesn't look like anyone
else has noticed the problem, so there is no real urgency.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  5:12           ` Guenter Roeck
@ 2021-07-27  7:04             ` Ard Biesheuvel
  2021-07-27  9:02               ` Michael S. Tsirkin
  2021-07-27  9:30               ` Michael S. Tsirkin
  2021-07-27  9:01             ` Michael S. Tsirkin
  1 sibling, 2 replies; 31+ messages in thread
From: Ard Biesheuvel @ 2021-07-27  7:04 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel, qemu-devel,
	Bjorn Helgaas, Igor Mammedov, Philippe Mathieu-Daudé

On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> >> (cc Bjorn)
> >>
> >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> >>>
> >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> >>>>>> Hi all,
> >>>>>>
> >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> >>>>>> affects
> >>>>>> aarch64, not x86/x86_64.
> >>>>>>
> >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> >>>>>> the following PCI bar assignments when running qemu v5.2.
> >>>>>>
> >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> >>
> >> IIUC, these lines are read back from the BARs
> >>
> >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> >>>>>> 0x10000000-0x1000007f]
> >>>>>>
> >>
> >> ... and this is the assignment created by the kernel.
> >>
> >>>>>> With qemu v6.0, the assignment is reported as follows.
> >>>>>>
> >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> >>>>>>
> >>
> >> The problem here is that Linux, for legacy reasons, does not support
> >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> >> rejected.
> >>
> >> This might make sense on x86, where legacy I/O ports may exist, but on
> >> other architectures, this makes no sense.
> >
> >
> > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > that trip up existing guests, right?
> >
>
> I think it is difficult to draw a line. Sure, maybe EFI should not create
> such mappings, but then maybe qemu should not suddenly start to enforce
> those mappings for existing guests either.
>

EFI creates the mappings primarily for itself, and up until DSM #5
started to be enforced, all PCI resource allocations that existed at
boot were ignored by Linux and recreated from scratch.

Also, the commit in question looks dubious to me. I don't think it is
likely that Linux would fail to create a resource tree. What does
happen is that BARs get moved around, which may cause trouble in some
cases: for instance, we had to add special code to the EFI framebuffer
driver to copy with framebuffer BARs being relocated.

> For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> qemu. That solves my immediate problem, giving us time to find a solution
> that is acceptable for everyone. After all, it doesn't look like anyone
> else has noticed the problem, so there is no real urgency.
>

I would argue that it is better to revert that commit. DSM #5 has a
long history of debate and misinterpretation, and while I think we
ended up with something sane, I don't think we should be using it in
this particular case.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  5:12           ` Guenter Roeck
  2021-07-27  7:04             ` Ard Biesheuvel
@ 2021-07-27  9:01             ` Michael S. Tsirkin
  2021-07-27 10:36               ` Igor Mammedov
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-27  9:01 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Ard Biesheuvel

On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
> On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > (cc Bjorn)
> > > 
> > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > 
> > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > affects
> > > > > > > aarch64, not x86/x86_64.
> > > > > > > 
> > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > 
> > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > 
> > > IIUC, these lines are read back from the BARs
> > > 
> > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > 0x10000000-0x1000007f]
> > > > > > > 
> > > 
> > > ... and this is the assignment created by the kernel.
> > > 
> > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > 
> > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > > 
> > > 
> > > The problem here is that Linux, for legacy reasons, does not support
> > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > rejected.
> > > 
> > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > other architectures, this makes no sense.
> > 
> > 
> > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > that trip up existing guests, right?
> > 
> 
> I think it is difficult to draw a line. Sure, maybe EFI should not create
> such mappings, but then maybe qemu should not suddenly start to enforce
> those mappings for existing guests either.

I would say both. But about QEMU actually I think you have a point here.
Re-reading the spec:

0: No (The operating system shall not ignore the PCI configuration that firmware has done
at boot time. However, the operating system is free to configure the devices in this hierarchy
that have not been configured by the firmware. There may be a reduced level of hot plug
capability support in this hierarchy due to resource constraints. This situation is the same as
the legacy situation where this _DSM is not provided.)
1: Yes (The operating system may ignore the PCI configuration that the firmware has done
at boot time, and reconfigure/rebalance the resources in the hierarchy.)


I think I misread the spec previously, and understood it to mean that
1 means must ignore. In fact 1 gives the most flexibility.
So why are we suddenly telling the guest it must not override
firmware?

The commit log says
    The diffences could result in resource assignment failure.

which is kind of vague ...

Jiahui Cen, Igor, what do you think about it?
I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
at least for now.


> For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> qemu. That solves my immediate problem, giving us time to find a solution
> that is acceptable for everyone. After all, it doesn't look like anyone
> else has noticed the problem, so there is no real urgency.
> 
> Thanks,
> Guenter

Well it's not like we have an army of testers, I think we should
treat each problem report seriously ...

-- 
MST



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  7:04             ` Ard Biesheuvel
@ 2021-07-27  9:02               ` Michael S. Tsirkin
  2021-07-27  9:30               ` Michael S. Tsirkin
  1 sibling, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-27  9:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > >> (cc Bjorn)
> > >>
> > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > >>>
> > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > >>>>>> Hi all,
> > >>>>>>
> > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > >>>>>> affects
> > >>>>>> aarch64, not x86/x86_64.
> > >>>>>>
> > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > >>>>>>
> > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>
> > >> IIUC, these lines are read back from the BARs
> > >>
> > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > >>>>>> 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> ... and this is the assignment created by the kernel.
> > >>
> > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > >>>>>>
> > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> The problem here is that Linux, for legacy reasons, does not support
> > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > >> rejected.
> > >>
> > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > >> other architectures, this makes no sense.
> > >
> > >
> > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > that trip up existing guests, right?
> > >
> >
> > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > such mappings, but then maybe qemu should not suddenly start to enforce
> > those mappings for existing guests either.
> >
> 
> EFI creates the mappings primarily for itself, and up until DSM #5
> started to be enforced, all PCI resource allocations that existed at
> boot were ignored by Linux and recreated from scratch.
> 
> Also, the commit in question looks dubious to me. I don't think it is
> likely that Linux would fail to create a resource tree. What does
> happen is that BARs get moved around, which may cause trouble in some
> cases: for instance, we had to add special code to the EFI framebuffer
> driver to copy with framebuffer BARs being relocated.
> 
> > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > qemu. That solves my immediate problem, giving us time to find a solution
> > that is acceptable for everyone. After all, it doesn't look like anyone
> > else has noticed the problem, so there is no real urgency.
> >
> 
> I would argue that it is better to revert that commit. DSM #5 has a
> long history of debate and misinterpretation, and while I think we
> ended up with something sane, I don't think we should be using it in
> this particular case.

Re-reading it I have to agree. I think I misunderstood the spec and
guest behaviour when I applied it.

-- 
MST



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  7:04             ` Ard Biesheuvel
  2021-07-27  9:02               ` Michael S. Tsirkin
@ 2021-07-27  9:30               ` Michael S. Tsirkin
  2021-07-27  9:50                 ` Ard Biesheuvel
  2021-07-27 11:18                 ` Guenter Roeck
  1 sibling, 2 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-27  9:30 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > >> (cc Bjorn)
> > >>
> > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > >>>
> > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > >>>>>> Hi all,
> > >>>>>>
> > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > >>>>>> affects
> > >>>>>> aarch64, not x86/x86_64.
> > >>>>>>
> > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > >>>>>>
> > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>
> > >> IIUC, these lines are read back from the BARs
> > >>
> > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > >>>>>> 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> ... and this is the assignment created by the kernel.
> > >>
> > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > >>>>>>
> > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> The problem here is that Linux, for legacy reasons, does not support
> > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > >> rejected.
> > >>
> > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > >> other architectures, this makes no sense.
> > >
> > >
> > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > that trip up existing guests, right?
> > >
> >
> > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > such mappings, but then maybe qemu should not suddenly start to enforce
> > those mappings for existing guests either.
> >
> 
> EFI creates the mappings primarily for itself, and up until DSM #5
> started to be enforced, all PCI resource allocations that existed at
> boot were ignored by Linux and recreated from scratch.
> 
> Also, the commit in question looks dubious to me. I don't think it is
> likely that Linux would fail to create a resource tree. What does
> happen is that BARs get moved around, which may cause trouble in some
> cases: for instance, we had to add special code to the EFI framebuffer
> driver to copy with framebuffer BARs being relocated.
> 
> > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > qemu. That solves my immediate problem, giving us time to find a solution
> > that is acceptable for everyone. After all, it doesn't look like anyone
> > else has noticed the problem, so there is no real urgency.
> >
> 
> I would argue that it is better to revert that commit. DSM #5 has a
> long history of debate and misinterpretation, and while I think we
> ended up with something sane, I don't think we should be using it in
> this particular case.

I think revert might make sense, however:

0: No (The operating system shall not ignore the PCI configuration that firmware has done
at boot time. However, the operating system is free to configure the devices in this hierarchy
that have not been configured by the firmware. There may be a reduced level of hot plug
capability support in this hierarchy due to resource constraints. This situation is the same as
the legacy situation where this _DSM is not provided.)

^^^^ does not this imply that reporting a 0 as we currently do
     should be mostly a NOP?


1: Yes (The operating system may ignore the PCI configuration that the firmware has done
at boot time, and reconfigure/rebalance the resources in the hierarchy.)


So I am debating with myself whether this should be a plain revert or
return 1 here:
     /*
      * 0 - The operating system must not ignore the PCI configuration that
      *     firmware has done at boot time.
      */
     aml_append(ifctx1, aml_return(aml_int(0)));
-    aml_append(ifctx, ifctx1);
+    aml_append(ifctx1, aml_return(aml_int(1)));
     aml_append(method, ifctx);



Guenter what happens if we return 1? Do things work well?

-- 
MST



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  9:30               ` Michael S. Tsirkin
@ 2021-07-27  9:50                 ` Ard Biesheuvel
  2021-07-27 10:07                   ` Michael S. Tsirkin
  2021-07-27 11:18                 ` Guenter Roeck
  1 sibling, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2021-07-27  9:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Tue, 27 Jul 2021 at 11:30, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> > On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > >> (cc Bjorn)
> > > >>
> > > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > >>>
> > > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > >>>>>> Hi all,
> > > >>>>>>
> > > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > >>>>>> affects
> > > >>>>>> aarch64, not x86/x86_64.
> > > >>>>>>
> > > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > > >>>>>>
> > > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > >>
> > > >> IIUC, these lines are read back from the BARs
> > > >>
> > > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > >>>>>> 0x10000000-0x1000007f]
> > > >>>>>>
> > > >>
> > > >> ... and this is the assignment created by the kernel.
> > > >>
> > > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > > >>>>>>
> > > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > >>>>>>
> > > >>
> > > >> The problem here is that Linux, for legacy reasons, does not support
> > > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > >> rejected.
> > > >>
> > > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > > >> other architectures, this makes no sense.
> > > >
> > > >
> > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > that trip up existing guests, right?
> > > >
> > >
> > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > those mappings for existing guests either.
> > >
> >
> > EFI creates the mappings primarily for itself, and up until DSM #5
> > started to be enforced, all PCI resource allocations that existed at
> > boot were ignored by Linux and recreated from scratch.
> >
> > Also, the commit in question looks dubious to me. I don't think it is
> > likely that Linux would fail to create a resource tree. What does
> > happen is that BARs get moved around, which may cause trouble in some
> > cases: for instance, we had to add special code to the EFI framebuffer
> > driver to copy with framebuffer BARs being relocated.
> >
> > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > qemu. That solves my immediate problem, giving us time to find a solution
> > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > else has noticed the problem, so there is no real urgency.
> > >
> >
> > I would argue that it is better to revert that commit. DSM #5 has a
> > long history of debate and misinterpretation, and while I think we
> > ended up with something sane, I don't think we should be using it in
> > this particular case.
>
> I think revert might make sense, however:
>
> 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> at boot time. However, the operating system is free to configure the devices in this hierarchy
> that have not been configured by the firmware. There may be a reduced level of hot plug
> capability support in this hierarchy due to resource constraints. This situation is the same as
> the legacy situation where this _DSM is not provided.)
>
> ^^^^ does not this imply that reporting a 0 as we currently do
>      should be mostly a NOP?
>

Not really. The resource allocation strategies are different between
EDK2 and Linux, and as Guenter's testing proves, EDK2 may lay out PCI
resources in a way that interferes with Linux's expectations. The I/O
port 0x0 problem is just one potential issue here: another issue is
resource padding for hotplug, which is important for VMs, not only the
IO/MEM resource allocations, but the bus ranges as well.

>
> 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> at boot time, and reconfigure/rebalance the resources in the hierarchy.)
>
>
> So I am debating with myself whether this should be a plain revert or
> return 1 here:
>      /*
>       * 0 - The operating system must not ignore the PCI configuration that
>       *     firmware has done at boot time.
>       */
>      aml_append(ifctx1, aml_return(aml_int(0)));
> -    aml_append(ifctx, ifctx1);
> +    aml_append(ifctx1, aml_return(aml_int(1)));
>      aml_append(method, ifctx);
>

I agree that returning '1' here is a better choice, as it explicitly
gives the OS license to reassign all resources, which is what we have
been relying on to begin with.

OTOH, I do think we should fix arbitrary zero checks in Linux that
make no sense on !x86

>
>
> Guenter what happens if we return 1? Do things work well?
>
> --
> MST
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  9:50                 ` Ard Biesheuvel
@ 2021-07-27 10:07                   ` Michael S. Tsirkin
  2021-07-27 10:14                     ` Ard Biesheuvel
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-27 10:07 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Tue, Jul 27, 2021 at 11:50:23AM +0200, Ard Biesheuvel wrote:
> On Tue, 27 Jul 2021 at 11:30, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> > > On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > >> (cc Bjorn)
> > > > >>
> > > > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > >>>
> > > > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > >>>>>> Hi all,
> > > > >>>>>>
> > > > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > >>>>>> affects
> > > > >>>>>> aarch64, not x86/x86_64.
> > > > >>>>>>
> > > > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > > > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > > > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > > > >>>>>>
> > > > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > >>
> > > > >> IIUC, these lines are read back from the BARs
> > > > >>
> > > > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > >>>>>> 0x10000000-0x1000007f]
> > > > >>>>>>
> > > > >>
> > > > >> ... and this is the assignment created by the kernel.
> > > > >>
> > > > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > > > >>>>>>
> > > > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > >>>>>>
> > > > >>
> > > > >> The problem here is that Linux, for legacy reasons, does not support
> > > > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > >> rejected.
> > > > >>
> > > > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > > > >> other architectures, this makes no sense.
> > > > >
> > > > >
> > > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > > that trip up existing guests, right?
> > > > >
> > > >
> > > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > > those mappings for existing guests either.
> > > >
> > >
> > > EFI creates the mappings primarily for itself, and up until DSM #5
> > > started to be enforced, all PCI resource allocations that existed at
> > > boot were ignored by Linux and recreated from scratch.
> > >
> > > Also, the commit in question looks dubious to me. I don't think it is
> > > likely that Linux would fail to create a resource tree. What does
> > > happen is that BARs get moved around, which may cause trouble in some
> > > cases: for instance, we had to add special code to the EFI framebuffer
> > > driver to copy with framebuffer BARs being relocated.
> > >
> > > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > > qemu. That solves my immediate problem, giving us time to find a solution
> > > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > > else has noticed the problem, so there is no real urgency.
> > > >
> > >
> > > I would argue that it is better to revert that commit. DSM #5 has a
> > > long history of debate and misinterpretation, and while I think we
> > > ended up with something sane, I don't think we should be using it in
> > > this particular case.
> >
> > I think revert might make sense, however:
> >
> > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > that have not been configured by the firmware. There may be a reduced level of hot plug
> > capability support in this hierarchy due to resource constraints. This situation is the same as
> > the legacy situation where this _DSM is not provided.)
> >
> > ^^^^ does not this imply that reporting a 0 as we currently do
> >      should be mostly a NOP?
> >
> 
> Not really. The resource allocation strategies are different between
> EDK2 and Linux, and as Guenter's testing proves, EDK2 may lay out PCI
> resources in a way that interferes with Linux's expectations. The I/O
> port 0x0 problem is just one potential issue here: another issue is
> resource padding for hotplug, which is important for VMs, not only the
> IO/MEM resource allocations, but the bus ranges as well.

Hmm not sure I understand the answer. The text above seems to say
that 0 should be the same as _DSM 5 is not provided, does it not?
Why did behaviour change when we switched from not providing _DSM 5
to providing but returning 0?


> >
> > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> >
> >
> > So I am debating with myself whether this should be a plain revert or
> > return 1 here:
> >      /*
> >       * 0 - The operating system must not ignore the PCI configuration that
> >       *     firmware has done at boot time.
> >       */
> >      aml_append(ifctx1, aml_return(aml_int(0)));
> > -    aml_append(ifctx, ifctx1);
> > +    aml_append(ifctx1, aml_return(aml_int(1)));
> >      aml_append(method, ifctx);
> >
> 
> I agree that returning '1' here is a better choice, as it explicitly
> gives the OS license to reassign all resources, which is what we have
> been relying on to begin with.
> 
> OTOH, I do think we should fix arbitrary zero checks in Linux that
> make no sense on !x86
> 
> >
> >
> > Guenter what happens if we return 1? Do things work well?
> >
> > --
> > MST
> >



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27 10:07                   ` Michael S. Tsirkin
@ 2021-07-27 10:14                     ` Ard Biesheuvel
  2022-03-18 11:48                       ` Lorenzo Pieralisi
  0 siblings, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2021-07-27 10:14 UTC (permalink / raw)
  To: Michael S. Tsirkin, Lorenzo Pieralisi
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

(+ Lorenzo)

On Tue, 27 Jul 2021 at 12:07, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 27, 2021 at 11:50:23AM +0200, Ard Biesheuvel wrote:
> > On Tue, 27 Jul 2021 at 11:30, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> > > > On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > > >> (cc Bjorn)
> > > > > >>
> > > > > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > > >>>
> > > > > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > >>>>>> Hi all,
> > > > > >>>>>>
> > > > > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > >>>>>> affects
> > > > > >>>>>> aarch64, not x86/x86_64.
> > > > > >>>>>>
> > > > > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > > > > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > > > > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > > > > >>>>>>
> > > > > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > >>
> > > > > >> IIUC, these lines are read back from the BARs
> > > > > >>
> > > > > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > >>>>>> 0x10000000-0x1000007f]
> > > > > >>>>>>
> > > > > >>
> > > > > >> ... and this is the assignment created by the kernel.
> > > > > >>
> > > > > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > > > > >>>>>>
> > > > > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > >>>>>>
> > > > > >>
> > > > > >> The problem here is that Linux, for legacy reasons, does not support
> > > > > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > > >> rejected.
> > > > > >>
> > > > > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > > > > >> other architectures, this makes no sense.
> > > > > >
> > > > > >
> > > > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > > > that trip up existing guests, right?
> > > > > >
> > > > >
> > > > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > > > those mappings for existing guests either.
> > > > >
> > > >
> > > > EFI creates the mappings primarily for itself, and up until DSM #5
> > > > started to be enforced, all PCI resource allocations that existed at
> > > > boot were ignored by Linux and recreated from scratch.
> > > >
> > > > Also, the commit in question looks dubious to me. I don't think it is
> > > > likely that Linux would fail to create a resource tree. What does
> > > > happen is that BARs get moved around, which may cause trouble in some
> > > > cases: for instance, we had to add special code to the EFI framebuffer
> > > > driver to copy with framebuffer BARs being relocated.
> > > >
> > > > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > > > qemu. That solves my immediate problem, giving us time to find a solution
> > > > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > > > else has noticed the problem, so there is no real urgency.
> > > > >
> > > >
> > > > I would argue that it is better to revert that commit. DSM #5 has a
> > > > long history of debate and misinterpretation, and while I think we
> > > > ended up with something sane, I don't think we should be using it in
> > > > this particular case.
> > >
> > > I think revert might make sense, however:
> > >
> > > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > > that have not been configured by the firmware. There may be a reduced level of hot plug
> > > capability support in this hierarchy due to resource constraints. This situation is the same as
> > > the legacy situation where this _DSM is not provided.)
> > >
> > > ^^^^ does not this imply that reporting a 0 as we currently do
> > >      should be mostly a NOP?
> > >
> >
> > Not really. The resource allocation strategies are different between
> > EDK2 and Linux, and as Guenter's testing proves, EDK2 may lay out PCI
> > resources in a way that interferes with Linux's expectations. The I/O
> > port 0x0 problem is just one potential issue here: another issue is
> > resource padding for hotplug, which is important for VMs, not only the
> > IO/MEM resource allocations, but the bus ranges as well.
>
> Hmm not sure I understand the answer. The text above seems to say
> that 0 should be the same as _DSM 5 is not provided, does it not?

That is what the spec says, but it has never been what Linux/arm64
does. Its PCI arch code is based on 32-bit ARM, which uses simple
bootloaders that are completely unaware of the existence of PCI, and
so from the beginning, we have always reassigned all resources (but
not bus numbers IIRC) because that is what ARM did. On arm64, of
course, we often do have rich firmware that initializes PCI, but by
that time, the cat was out of the bag already, and we could not simply
stop reassigning resources without running a substantial regression
risk, even when booting in ACPI mode.

So the default behavior on arm64 is '1' not '0' in terms of DSM #5.

> Why did behaviour change when we switched from not providing _DSM 5
> to providing but returning 0?
>
>
> > >
> > > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> > >
> > >
> > > So I am debating with myself whether this should be a plain revert or
> > > return 1 here:
> > >      /*
> > >       * 0 - The operating system must not ignore the PCI configuration that
> > >       *     firmware has done at boot time.
> > >       */
> > >      aml_append(ifctx1, aml_return(aml_int(0)));
> > > -    aml_append(ifctx, ifctx1);
> > > +    aml_append(ifctx1, aml_return(aml_int(1)));
> > >      aml_append(method, ifctx);
> > >
> >
> > I agree that returning '1' here is a better choice, as it explicitly
> > gives the OS license to reassign all resources, which is what we have
> > been relying on to begin with.
> >
> > OTOH, I do think we should fix arbitrary zero checks in Linux that
> > make no sense on !x86
> >
> > >
> > >
> > > Guenter what happens if we return 1? Do things work well?
> > >
> > > --
> > > MST
> > >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  9:01             ` Michael S. Tsirkin
@ 2021-07-27 10:36               ` Igor Mammedov
  2021-07-27 11:32                 ` Guenter Roeck
  2021-07-28 13:11                 ` Michael S. Tsirkin
  0 siblings, 2 replies; 31+ messages in thread
From: Igor Mammedov @ 2021-07-27 10:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Philippe Mathieu-Daudé,
	Ard Biesheuvel, Guenter Roeck

On Tue, 27 Jul 2021 05:01:23 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
> > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:  
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:  
> > > > (cc Bjorn)
> > > > 
> > > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> > > > > 
> > > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:  
> > > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:  
> > > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:  
> > > > > > > > Hi all,
> > > > > > > > 
> > > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > > affects
> > > > > > > > aarch64, not x86/x86_64.
> > > > > > > > 
> > > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > > 
> > > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]  
> > > > 
> > > > IIUC, these lines are read back from the BARs
> > > >   
> > > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > > 0x10000000-0x1000007f]
> > > > > > > >   
> > > > 
> > > > ... and this is the assignment created by the kernel.
> > > >   
> > > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > > 
> > > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > > >   
> > > > 
> > > > The problem here is that Linux, for legacy reasons, does not support
> > > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > rejected.
> > > > 
> > > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > > other architectures, this makes no sense.  
> > > 
> > > 
> > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > that trip up existing guests, right?
> > >   
> > 
> > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > such mappings, but then maybe qemu should not suddenly start to enforce
> > those mappings for existing guests either.  
> 
> I would say both. But about QEMU actually I think you have a point here.
> Re-reading the spec:
> 
> 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> at boot time. However, the operating system is free to configure the devices in this hierarchy
> that have not been configured by the firmware. There may be a reduced level of hot plug
> capability support in this hierarchy due to resource constraints. This situation is the same as
> the legacy situation where this _DSM is not provided.)
> 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> 
> 
> I think I misread the spec previously, and understood it to mean that
> 1 means must ignore. In fact 1 gives the most flexibility.
> So why are we suddenly telling the guest it must not override
> firmware?
> 
> The commit log says
>     The diffences could result in resource assignment failure.
> 
> which is kind of vague ...
> 
> Jiahui Cen, Igor, what do you think about it?
> I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
> at least for now.
Looking at patch history, it seems consensus was that it's better to
enforce firmware allocations.

Also letting OS do as it pleases might break PCI devices that
don't tolerate reallocation. ex: firmware initializes PCI device
IO/BARs and then fetches ACPI tables, which get patched with
assigned resources.

to me returning 0 seems to be correct choice.
In addition resource hinting also works via firmware allocations,
if we revert the commit it might change those configs.

me wonders if there is a way make enforcement per device.

> > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > qemu. That solves my immediate problem, giving us time to find a solution
> > that is acceptable for everyone. After all, it doesn't look like anyone
> > else has noticed the problem, so there is no real urgency.
> > 
> > Thanks,
> > Guenter  
> 
> Well it's not like we have an army of testers, I think we should
> treat each problem report seriously ...
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  9:30               ` Michael S. Tsirkin
  2021-07-27  9:50                 ` Ard Biesheuvel
@ 2021-07-27 11:18                 ` Guenter Roeck
  1 sibling, 0 replies; 31+ messages in thread
From: Guenter Roeck @ 2021-07-27 11:18 UTC (permalink / raw)
  To: Michael S. Tsirkin, Ard Biesheuvel
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé

On 7/27/21 2:30 AM, Michael S. Tsirkin wrote:
> On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
>> On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
>>>
>>> On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
>>>> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>>>>> (cc Bjorn)
>>>>>
>>>>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>>>
>>>>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>>>>> affects
>>>>>>>>> aarch64, not x86/x86_64.
>>>>>>>>>
>>>>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>>>>
>>>>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>
>>>>> IIUC, these lines are read back from the BARs
>>>>>
>>>>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>>>>> 0x10000000-0x1000007f]
>>>>>>>>>
>>>>>
>>>>> ... and this is the assignment created by the kernel.
>>>>>
>>>>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>>>>
>>>>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>>>>>
>>>>>
>>>>> The problem here is that Linux, for legacy reasons, does not support
>>>>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>>>>> rejected.
>>>>>
>>>>> This might make sense on x86, where legacy I/O ports may exist, but on
>>>>> other architectures, this makes no sense.
>>>>
>>>>
>>>> Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
>>>> that trip up existing guests, right?
>>>>
>>>
>>> I think it is difficult to draw a line. Sure, maybe EFI should not create
>>> such mappings, but then maybe qemu should not suddenly start to enforce
>>> those mappings for existing guests either.
>>>
>>
>> EFI creates the mappings primarily for itself, and up until DSM #5
>> started to be enforced, all PCI resource allocations that existed at
>> boot were ignored by Linux and recreated from scratch.
>>
>> Also, the commit in question looks dubious to me. I don't think it is
>> likely that Linux would fail to create a resource tree. What does
>> happen is that BARs get moved around, which may cause trouble in some
>> cases: for instance, we had to add special code to the EFI framebuffer
>> driver to copy with framebuffer BARs being relocated.
>>
>>> For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
>>> qemu. That solves my immediate problem, giving us time to find a solution
>>> that is acceptable for everyone. After all, it doesn't look like anyone
>>> else has noticed the problem, so there is no real urgency.
>>>
>>
>> I would argue that it is better to revert that commit. DSM #5 has a
>> long history of debate and misinterpretation, and while I think we
>> ended up with something sane, I don't think we should be using it in
>> this particular case.
> 
> I think revert might make sense, however:
> 
> 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> at boot time. However, the operating system is free to configure the devices in this hierarchy
> that have not been configured by the firmware. There may be a reduced level of hot plug
> capability support in this hierarchy due to resource constraints. This situation is the same as
> the legacy situation where this _DSM is not provided.)
> 
> ^^^^ does not this imply that reporting a 0 as we currently do
>       should be mostly a NOP?
> 
> 
> 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> 
> 
> So I am debating with myself whether this should be a plain revert or
> return 1 here:
>       /*
>        * 0 - The operating system must not ignore the PCI configuration that
>        *     firmware has done at boot time.
>        */
>       aml_append(ifctx1, aml_return(aml_int(0)));
> -    aml_append(ifctx, ifctx1);
> +    aml_append(ifctx1, aml_return(aml_int(1)));
>       aml_append(method, ifctx);
> 
> 
> 
> Guenter what happens if we return 1? Do things work well?
> 
Yes.

Guenter


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27 10:36               ` Igor Mammedov
@ 2021-07-27 11:32                 ` Guenter Roeck
  2021-07-28 13:11                 ` Michael S. Tsirkin
  1 sibling, 0 replies; 31+ messages in thread
From: Guenter Roeck @ 2021-07-27 11:32 UTC (permalink / raw)
  To: Igor Mammedov, Michael S. Tsirkin
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Philippe Mathieu-Daudé,
	Ard Biesheuvel

On 7/27/21 3:36 AM, Igor Mammedov wrote:
> On Tue, 27 Jul 2021 05:01:23 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
>>> On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
>>>> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>>>>> (cc Bjorn)
>>>>>
>>>>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>>>
>>>>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>>>>> affects
>>>>>>>>> aarch64, not x86/x86_64.
>>>>>>>>>
>>>>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>>>>
>>>>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>
>>>>> IIUC, these lines are read back from the BARs
>>>>>    
>>>>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>>>>> 0x10000000-0x1000007f]
>>>>>>>>>    
>>>>>
>>>>> ... and this is the assignment created by the kernel.
>>>>>    
>>>>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>>>>
>>>>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>>>>>    
>>>>>
>>>>> The problem here is that Linux, for legacy reasons, does not support
>>>>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>>>>> rejected.
>>>>>
>>>>> This might make sense on x86, where legacy I/O ports may exist, but on
>>>>> other architectures, this makes no sense.
>>>>
>>>>
>>>> Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
>>>> that trip up existing guests, right?
>>>>    
>>>
>>> I think it is difficult to draw a line. Sure, maybe EFI should not create
>>> such mappings, but then maybe qemu should not suddenly start to enforce
>>> those mappings for existing guests either.
>>
>> I would say both. But about QEMU actually I think you have a point here.
>> Re-reading the spec:
>>
>> 0: No (The operating system shall not ignore the PCI configuration that firmware has done
>> at boot time. However, the operating system is free to configure the devices in this hierarchy
>> that have not been configured by the firmware. There may be a reduced level of hot plug
>> capability support in this hierarchy due to resource constraints. This situation is the same as
>> the legacy situation where this _DSM is not provided.)
>> 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
>> at boot time, and reconfigure/rebalance the resources in the hierarchy.)
>>
>>
>> I think I misread the spec previously, and understood it to mean that
>> 1 means must ignore. In fact 1 gives the most flexibility.
>> So why are we suddenly telling the guest it must not override
>> firmware?
>>
>> The commit log says
>>      The diffences could result in resource assignment failure.
>>
>> which is kind of vague ...
>>
>> Jiahui Cen, Igor, what do you think about it?
>> I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
>> at least for now.
> Looking at patch history, it seems consensus was that it's better to
> enforce firmware allocations.
> 
> Also letting OS do as it pleases might break PCI devices that
> don't tolerate reallocation. ex: firmware initializes PCI device
> IO/BARs and then fetches ACPI tables, which get patched with
> assigned resources.
> 

On the other side, _not_ letting the OS do as it pleases _will_ break
PCI devices with don't meet OS requirements.

That makes me curious: There has been a lot of "may", "might", and
"could" associated with commit 0cf8882fd06b. Does anyone happen to
have a specific example of a problem that was actually fixed with
this patch ?

Thanks,
Guenter

> to me returning 0 seems to be correct choice.
> In addition resource hinting also works via firmware allocations,
> if we revert the commit it might change those configs.
> 
> me wonders if there is a way make enforcement per device.
> 
>>> For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
>>> qemu. That solves my immediate problem, giving us time to find a solution
>>> that is acceptable for everyone. After all, it doesn't look like anyone
>>> else has noticed the problem, so there is no real urgency.
>>>
>>> Thanks,
>>> Guenter
>>
>> Well it's not like we have an army of testers, I think we should
>> treat each problem report seriously ...
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27  4:22               ` Guenter Roeck
@ 2021-07-27 14:25                 ` Bjorn Helgaas
  -1 siblings, 0 replies; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-27 14:25 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Ard Biesheuvel, Philippe Mathieu-Daudé,
	Bjorn Helgaas, Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel,
	qemu-devel, Igor Mammedov, linux-pci

On Mon, Jul 26, 2021 at 09:22:19PM -0700, Guenter Roeck wrote:
> On 7/26/21 2:31 PM, Bjorn Helgaas wrote:
> > [+cc linux-pci]
> > 
> > On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > > > > Hi all,
> > > > > > > > 
> > > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > > affects
> > > > > > > > aarch64, not x86/x86_64.
> > > > > > > > 
> > > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > > 
> > > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > 
> > > > IIUC, these lines are read back from the BARs
> > > > 
> > > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > > 0x10000000-0x1000007f]
> > > > > > > > 
> > > > 
> > > > ... and this is the assignment created by the kernel.
> > > > 
> > > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > > 
> > > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > 
> > > > The problem here is that Linux, for legacy reasons, does not support
> > > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > rejected.
> > > > 
> > > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > > other architectures, this makes no sense.
> > > 
> > > I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
> > > arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
> > > changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
> > > really an arch question, so the arm64 folks would have to weigh in.
> > > 
> > > But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
> > > when we assign or reassign resources to a BAR, and if firmware tells
> > > us to preserve the assignments done by firmware, Linux shouldn't be
> > > doing any assignment or reassignment.
> > > 
> > > Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
> > > report any reassignment, so I assume Linux saw the
> > > DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
> > > 
> > > Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
> > > I see that at least ne2k_pci_init_one() [2] seems to assume that.  And
> 
> Correct, and ne2k_pci is known to already fail on architectures where the
> IO address range starts at 0, such as riscv. Not that it helps to fix the
> code - doing so only results in a crash elsewhere when running a riscv
> emulation (when executing outsl, suggesting that there may be a problem
> with that emulation or its use). But that is a different problem.
> 
> > > tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
> > > pci_iomap() [5], which fails if the resource starts at 0.
> > > 
> > > So pci_iomap() is probably already broken on the arches above that
> > > allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
> > > "!start" for *memory* BARs, e.g.,
> > > 
> > > diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> > > index 2d3eb1cb73b8..77455e702a3e 100644
> > > --- a/lib/pci_iomap.c
> > > +++ b/lib/pci_iomap.c
> > > @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
> > >   	resource_size_t len = pci_resource_len(dev, bar);
> > >   	unsigned long flags = pci_resource_flags(dev, bar);
> > > -	if (len <= offset || !start)
> > > +	if (flags & IORESOURCE_MEM && !start)
> > > +		return NULL;
> 
> I am far out of my league here, but what is the purpose of the !start
> check given the PCIBIOS_MIN_MEM define which can also be 0 ? Shouldn't
> the check be against PCIBIOS_MIN_MEM and PCIBIOS_MIN_IO ?

I think the "!start" check is intended to catch uninitialized BARs.

For memory BARs, I think it works pretty well.  Most PCI host bridges
don't do address translation, so "start" is zero if the BAR contains
zero (the power-up default).  It's conceivable a platform could put a
PCI aperture at CPU physical address zero, but unlikely.  It's also
conceivable that "start" could be zero if the bridge translates
addresses and the BAR happens to contain the complement of the
translation, but that also seems unlikely.

I don't think the check works quite as well for I/O BARs.  Zero is a
perfectly legitimate I/O port, though we could argue that we should
never use it, just like we have decided not to use memory address
zero.

In this case, firmware told us to preserve whatever it left in the
BARs, so if it left zero in an I/O BAR, it's basically telling us we
have to use I/O port zero.

> But, anyway, the above change fixes the problem for 'tulip', though
> obviously not for 'ne2k_pci'. 'ne2k_pci' starts working if I remove
> the "!ioaddr" check in ne2k_pci_init_one().
> 
> Thanks,
> Guenter
> 
> > > +	if (len <= offset)
> > >   		return NULL;
> > >   	len -= offset;
> > >   	start += offset;
> > > 
> > > 
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
> > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
> > > [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
> > > [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
> > > [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
> > > 
> > > > > > > > and the controller does not instantiate. The problem disapears after
> > > > > > > > reverting commit 0cf8882fd0.
> > > > > > > > 
> > > > > > > > Attached is a summary of test runs with various devices and qemu v5.2
> > > > > > > > as well as qemu v6.0, and the command line I use for efi boots.
> > > > > > > > 
> > > > > > > > Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > > > > > > > command line to instantiate PCI devices with io ports, or are such
> > > > > > > > devices
> > > > > > > > simply no longer supported if the system is booted with efi support ?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Guenter
> > > > > > > 
> > > > > > > 
> > > > > > > So that commit basically just says don't ignore what efi did.
> > > > > > > 
> > > > > > > The issue's thus likely efi.
> > > > > > > 
> > > > > > 
> > > > > > I don't see the problem with efi boots on x86 and x86_64.
> > > > > > Any idea why that might be the case ?
> > > > > > 
> > > > > > Thanks,
> > > > > > Guenter
> > > > > > 
> > > > > > > Cc the maintainer. Philippe can you comment pls?
> > > > > 
> > > > > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> > > > 
> > > > So a potential workaround would be to use a different I/O resource
> > > > window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> > > > fix Linux instead.
> > > > 
> > > > 
> > > > > > > 
> > > > > > > > ---
> > > > > > > > Command line (tulip network interface):
> > > > > > > > 
> > > > > > > > CMDLINE="root=/dev/vda console=ttyAMA0"
> > > > > > > > ROOTFS="rootfs.ext2"
> > > > > > > > 
> > > > > > > > qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > > > > > > >           -m 512 -cpu cortex-a57 -no-reboot \
> > > > > > > >           -device tulip,netdev=net0 -netdev user,id=net0 \
> > > > > > > >           -bios QEMU_EFI-aarch64.fd \
> > > > > > > >           -snapshot \
> > > > > > > >           -device virtio-blk-device,drive=d0 \
> > > > > > > >           -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > > > > > > >           -nographic -serial stdio -monitor none \
> > > > > > > >           --append "${CMDLINE}"
> > > > > > > > 
> > > > > > > > ---
> > > > > > > > Boot tests with various devices known to work in qemu v5.2.
> > > > > > > > 
> > > > > > > >          v5.2    v6.0    v6.0
> > > > > > > >          efi    non-efi    efi
> > > > > > > > e1000        pass    pass    pass
> > > > > > > > e1000-82544gc    pass    pass    pass
> > > > > > > > e1000-82545em    pass    pass    pass
> > > > > > > > e1000e        pass    pass    pass
> > > > > > > > i82550        pass    pass    pass
> > > > > > > > i82557a        pass    pass    pass
> > > > > > > > i82557b        pass    pass    pass
> > > > > > > > i82557c        pass    pass    pass
> > > > > > > > i82558a        pass    pass    pass
> > > > > > > > i82559b        pass    pass    pass
> > > > > > > > i82559c        pass    pass    pass
> > > > > > > > i82559er    pass    pass    pass
> > > > > > > > i82562        pass    pass    pass
> > > > > > > > i82801        pass    pass    pass
> > > > > > > > ne2k_pci    pass    pass    fail    <--
> > > > > > > > pcnet        pass    pass    pass
> > > > > > > > rtl8139        pass    pass    pass
> > > > > > > > tulip        pass    pass    fail    <--
> > > > > > > > usb-net        pass    pass    pass
> > > > > > > > virtio-net-device
> > > > > > > >          pass    pass    pass
> > > > > > > > virtio-net-pci    pass    pass    pass
> > > > > > > > virtio-net-pci-non-transitional
> > > > > > > >          pass    pass    pass
> > > > > > > > 
> > > > > > > > usb-xhci    pass    pass    pass
> > > > > > > > usb-ehci    pass    pass    pass
> > > > > > > > usb-ohci    pass    pass    pass
> > > > > > > > usb-uas-xhci    pass    pass    pass
> > > > > > > > virtio        pass    pass    pass
> > > > > > > > virtio-blk-pci    pass    pass    pass
> > > > > > > > virtio-blk-device
> > > > > > > >          pass    pass    pass
> > > > > > > > nvme        pass    pass    pass
> > > > > > > > sdhci        pass    pass    pass
> > > > > > > > dc390        pass    pass    fail    <--
> > > > > > > > am53c974    pass    pass    fail    <--
> > > > > > > > lsi53c895ai    pass    pass    pass
> > > > > > > > mptsas1068    pass    pass    pass
> > > > > > > > lsi53c810    pass    pass    pass
> > > > > > > > megasas        pass    pass    pass
> > > > > > > > megasas-gen2    pass    pass    pass
> > > > > > > > virtio-scsi-device
> > > > > > > >          pass    pass    pass
> > > > > > > > virtio-scsi-pci    pass    pass    pass
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
@ 2021-07-27 14:25                 ` Bjorn Helgaas
  0 siblings, 0 replies; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-27 14:25 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jiahui Cen, Michael S. Tsirkin, linux-pci, Ard Biesheuvel,
	qemu-devel, Bjorn Helgaas, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Ard Biesheuvel

On Mon, Jul 26, 2021 at 09:22:19PM -0700, Guenter Roeck wrote:
> On 7/26/21 2:31 PM, Bjorn Helgaas wrote:
> > [+cc linux-pci]
> > 
> > On Mon, Jul 26, 2021 at 04:16:29PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > > > > Hi all,
> > > > > > > > 
> > > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > > affects
> > > > > > > > aarch64, not x86/x86_64.
> > > > > > > > 
> > > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > > 
> > > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > 
> > > > IIUC, these lines are read back from the BARs
> > > > 
> > > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > > 0x10000000-0x1000007f]
> > > > > > > > 
> > > > 
> > > > ... and this is the assignment created by the kernel.
> > > > 
> > > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > > 
> > > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > 
> > > > The problem here is that Linux, for legacy reasons, does not support
> > > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > rejected.
> > > > 
> > > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > > other architectures, this makes no sense.
> > > 
> > > I guess this is the "#define PCIBIOS_MIN_IO 0x1000" in
> > > arm64/include/asm/pci.h.  From a PCI point of view, I'm not opposed to
> > > changing that to 0, as it is on csky, riscv, sh, sparc, um.  But it's
> > > really an arch question, so the arm64 folks would have to weigh in.
> > > 
> > > But I don't think that would fix this.  PCIBIOS_MIN_IO is mainly used
> > > when we assign or reassign resources to a BAR, and if firmware tells
> > > us to preserve the assignments done by firmware, Linux shouldn't be
> > > doing any assignment or reassignment.
> > > 
> > > Linux received 00:01.0 BAR 0 as [io 0x0000-0x007f], and Guenter didn't
> > > report any reassignment, so I assume Linux saw the
> > > DSM_PCI_PRESERVE_BOOT_CONFIG [1] and didn't change anything.
> > > 
> > > Could this be due to drivers assuming that an I/O BAR of 0 is invalid?
> > > I see that at least ne2k_pci_init_one() [2] seems to assume that.  And
> 
> Correct, and ne2k_pci is known to already fail on architectures where the
> IO address range starts at 0, such as riscv. Not that it helps to fix the
> code - doing so only results in a crash elsewhere when running a riscv
> emulation (when executing outsl, suggesting that there may be a problem
> with that emulation or its use). But that is a different problem.
> 
> > > tulip_init_one() [3] and pci_esp_probe_one() (am53c974.c, [4]) use
> > > pci_iomap() [5], which fails if the resource starts at 0.
> > > 
> > > So pci_iomap() is probably already broken on the arches above that
> > > allow I/O BARs to be zero.  Maybe pci_iomap() should only fail on
> > > "!start" for *memory* BARs, e.g.,
> > > 
> > > diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> > > index 2d3eb1cb73b8..77455e702a3e 100644
> > > --- a/lib/pci_iomap.c
> > > +++ b/lib/pci_iomap.c
> > > @@ -34,7 +34,9 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
> > >   	resource_size_t len = pci_resource_len(dev, bar);
> > >   	unsigned long flags = pci_resource_flags(dev, bar);
> > > -	if (len <= offset || !start)
> > > +	if (flags & IORESOURCE_MEM && !start)
> > > +		return NULL;
> 
> I am far out of my league here, but what is the purpose of the !start
> check given the PCIBIOS_MIN_MEM define which can also be 0 ? Shouldn't
> the check be against PCIBIOS_MIN_MEM and PCIBIOS_MIN_IO ?

I think the "!start" check is intended to catch uninitialized BARs.

For memory BARs, I think it works pretty well.  Most PCI host bridges
don't do address translation, so "start" is zero if the BAR contains
zero (the power-up default).  It's conceivable a platform could put a
PCI aperture at CPU physical address zero, but unlikely.  It's also
conceivable that "start" could be zero if the bridge translates
addresses and the BAR happens to contain the complement of the
translation, but that also seems unlikely.

I don't think the check works quite as well for I/O BARs.  Zero is a
perfectly legitimate I/O port, though we could argue that we should
never use it, just like we have decided not to use memory address
zero.

In this case, firmware told us to preserve whatever it left in the
BARs, so if it left zero in an I/O BAR, it's basically telling us we
have to use I/O port zero.

> But, anyway, the above change fixes the problem for 'tulip', though
> obviously not for 'ne2k_pci'. 'ne2k_pci' starts working if I remove
> the "!ioaddr" check in ne2k_pci_init_one().
> 
> Thanks,
> Guenter
> 
> > > +	if (len <= offset)
> > >   		return NULL;
> > >   	len -= offset;
> > >   	start += offset;
> > > 
> > > 
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v5.13#n915
> > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/8390/ne2k-pci.c?id=v5.13#n247
> > > [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/dec/tulip/tulip_core.c?id=v5.13#n1418
> > > [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/am53c974.c?id=v5.13#n431
> > > [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/pci_iomap.c?id=v5.13#n37
> > > 
> > > > > > > > and the controller does not instantiate. The problem disapears after
> > > > > > > > reverting commit 0cf8882fd0.
> > > > > > > > 
> > > > > > > > Attached is a summary of test runs with various devices and qemu v5.2
> > > > > > > > as well as qemu v6.0, and the command line I use for efi boots.
> > > > > > > > 
> > > > > > > > Did commit 0cf8882fd0 introduce a bug, do I now need need some different
> > > > > > > > command line to instantiate PCI devices with io ports, or are such
> > > > > > > > devices
> > > > > > > > simply no longer supported if the system is booted with efi support ?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Guenter
> > > > > > > 
> > > > > > > 
> > > > > > > So that commit basically just says don't ignore what efi did.
> > > > > > > 
> > > > > > > The issue's thus likely efi.
> > > > > > > 
> > > > > > 
> > > > > > I don't see the problem with efi boots on x86 and x86_64.
> > > > > > Any idea why that might be the case ?
> > > > > > 
> > > > > > Thanks,
> > > > > > Guenter
> > > > > > 
> > > > > > > Cc the maintainer. Philippe can you comment pls?
> > > > > 
> > > > > I'll have a look. Cc'ing Ard for EDK2/Aarch64.
> > > > 
> > > > So a potential workaround would be to use a different I/O resource
> > > > window for ArmVirtPkg, that starts at 0x1000. But I would prefer to
> > > > fix Linux instead.
> > > > 
> > > > 
> > > > > > > 
> > > > > > > > ---
> > > > > > > > Command line (tulip network interface):
> > > > > > > > 
> > > > > > > > CMDLINE="root=/dev/vda console=ttyAMA0"
> > > > > > > > ROOTFS="rootfs.ext2"
> > > > > > > > 
> > > > > > > > qemu-system-aarch64 -M virt -kernel arch/arm64/boot/Image -no-reboot \
> > > > > > > >           -m 512 -cpu cortex-a57 -no-reboot \
> > > > > > > >           -device tulip,netdev=net0 -netdev user,id=net0 \
> > > > > > > >           -bios QEMU_EFI-aarch64.fd \
> > > > > > > >           -snapshot \
> > > > > > > >           -device virtio-blk-device,drive=d0 \
> > > > > > > >           -drive file=${ROOTFS},if=none,id=d0,format=raw \
> > > > > > > >           -nographic -serial stdio -monitor none \
> > > > > > > >           --append "${CMDLINE}"
> > > > > > > > 
> > > > > > > > ---
> > > > > > > > Boot tests with various devices known to work in qemu v5.2.
> > > > > > > > 
> > > > > > > >          v5.2    v6.0    v6.0
> > > > > > > >          efi    non-efi    efi
> > > > > > > > e1000        pass    pass    pass
> > > > > > > > e1000-82544gc    pass    pass    pass
> > > > > > > > e1000-82545em    pass    pass    pass
> > > > > > > > e1000e        pass    pass    pass
> > > > > > > > i82550        pass    pass    pass
> > > > > > > > i82557a        pass    pass    pass
> > > > > > > > i82557b        pass    pass    pass
> > > > > > > > i82557c        pass    pass    pass
> > > > > > > > i82558a        pass    pass    pass
> > > > > > > > i82559b        pass    pass    pass
> > > > > > > > i82559c        pass    pass    pass
> > > > > > > > i82559er    pass    pass    pass
> > > > > > > > i82562        pass    pass    pass
> > > > > > > > i82801        pass    pass    pass
> > > > > > > > ne2k_pci    pass    pass    fail    <--
> > > > > > > > pcnet        pass    pass    pass
> > > > > > > > rtl8139        pass    pass    pass
> > > > > > > > tulip        pass    pass    fail    <--
> > > > > > > > usb-net        pass    pass    pass
> > > > > > > > virtio-net-device
> > > > > > > >          pass    pass    pass
> > > > > > > > virtio-net-pci    pass    pass    pass
> > > > > > > > virtio-net-pci-non-transitional
> > > > > > > >          pass    pass    pass
> > > > > > > > 
> > > > > > > > usb-xhci    pass    pass    pass
> > > > > > > > usb-ehci    pass    pass    pass
> > > > > > > > usb-ohci    pass    pass    pass
> > > > > > > > usb-uas-xhci    pass    pass    pass
> > > > > > > > virtio        pass    pass    pass
> > > > > > > > virtio-blk-pci    pass    pass    pass
> > > > > > > > virtio-blk-device
> > > > > > > >          pass    pass    pass
> > > > > > > > nvme        pass    pass    pass
> > > > > > > > sdhci        pass    pass    pass
> > > > > > > > dc390        pass    pass    fail    <--
> > > > > > > > am53c974    pass    pass    fail    <--
> > > > > > > > lsi53c895ai    pass    pass    pass
> > > > > > > > mptsas1068    pass    pass    pass
> > > > > > > > lsi53c810    pass    pass    pass
> > > > > > > > megasas        pass    pass    pass
> > > > > > > > megasas-gen2    pass    pass    pass
> > > > > > > > virtio-scsi-device
> > > > > > > >          pass    pass    pass
> > > > > > > > virtio-scsi-pci    pass    pass    pass
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27 10:36               ` Igor Mammedov
  2021-07-27 11:32                 ` Guenter Roeck
@ 2021-07-28 13:11                 ` Michael S. Tsirkin
  2021-07-28 13:25                   ` Ard Biesheuvel
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-28 13:11 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Philippe Mathieu-Daudé,
	Ard Biesheuvel, Guenter Roeck

On Tue, Jul 27, 2021 at 12:36:03PM +0200, Igor Mammedov wrote:
> On Tue, 27 Jul 2021 05:01:23 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
> > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:  
> > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:  
> > > > > (cc Bjorn)
> > > > > 
> > > > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> > > > > > 
> > > > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:  
> > > > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:  
> > > > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:  
> > > > > > > > > Hi all,
> > > > > > > > > 
> > > > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > > > affects
> > > > > > > > > aarch64, not x86/x86_64.
> > > > > > > > > 
> > > > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > > > 
> > > > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]  
> > > > > 
> > > > > IIUC, these lines are read back from the BARs
> > > > >   
> > > > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > > > 0x10000000-0x1000007f]
> > > > > > > > >   
> > > > > 
> > > > > ... and this is the assignment created by the kernel.
> > > > >   
> > > > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > > > 
> > > > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > > > >   
> > > > > 
> > > > > The problem here is that Linux, for legacy reasons, does not support
> > > > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > > rejected.
> > > > > 
> > > > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > > > other architectures, this makes no sense.  
> > > > 
> > > > 
> > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > that trip up existing guests, right?
> > > >   
> > > 
> > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > those mappings for existing guests either.  
> > 
> > I would say both. But about QEMU actually I think you have a point here.
> > Re-reading the spec:
> > 
> > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > that have not been configured by the firmware. There may be a reduced level of hot plug
> > capability support in this hierarchy due to resource constraints. This situation is the same as
> > the legacy situation where this _DSM is not provided.)
> > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> > 
> > 
> > I think I misread the spec previously, and understood it to mean that
> > 1 means must ignore. In fact 1 gives the most flexibility.
> > So why are we suddenly telling the guest it must not override
> > firmware?
> > 
> > The commit log says
> >     The diffences could result in resource assignment failure.
> > 
> > which is kind of vague ...
> > 
> > Jiahui Cen, Igor, what do you think about it?
> > I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
> > at least for now.
> Looking at patch history, it seems consensus was that it's better to
> enforce firmware allocations.
> 
> Also letting OS do as it pleases might break PCI devices that
> don't tolerate reallocation. ex: firmware initializes PCI device
> IO/BARs and then fetches ACPI tables, which get patched with
> assigned resources.
> 
> to me returning 0 seems to be correct choice.
> In addition resource hinting also works via firmware allocations,
> if we revert the commit it might change those configs.


Well if firmware people now tell us their allocations were never
intended for guest OS use then maybe we should not intervene.

As others noted the original commit was kind of vague:

1. it said "Using _DSM #5 method to inform guest os not to ignore the PCI configuration
that firmware has done at boot time could handle the differences."
which is not what the spec says and not what the patch did -
guest os does not ignore configuration even without this,
it is just allowed to change it.


2. is says could result but does not report whether that happened in the
field.


Given this causes a regression I'm inclined to just revert for now.
We can figure it out for the next release.


> 
> me wonders if there is a way make enforcement per device.

devices shouldn't normally care by the spec.


> > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > qemu. That solves my immediate problem, giving us time to find a solution
> > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > else has noticed the problem, so there is no real urgency.
> > > 
> > > Thanks,
> > > Guenter  
> > 
> > Well it's not like we have an army of testers, I think we should
> > treat each problem report seriously ...
> > 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-28 13:11                 ` Michael S. Tsirkin
@ 2021-07-28 13:25                   ` Ard Biesheuvel
  2021-07-28 14:03                     ` Guenter Roeck
  0 siblings, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2021-07-28 13:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Wed, 28 Jul 2021 at 15:11, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 27, 2021 at 12:36:03PM +0200, Igor Mammedov wrote:
> > On Tue, 27 Jul 2021 05:01:23 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >
> > > On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
> > > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > > > (cc Bjorn)
> > > > > >
> > > > > > On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > > > >
> > > > > > > On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > > > > On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > > > > > On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > > > > > work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > > > > > in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > > > > > (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > > > > > affects
> > > > > > > > > > aarch64, not x86/x86_64.
> > > > > > > > > >
> > > > > > > > > > I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > > > > > keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > > > > > allocation has changed. Taking tulip as example, the kernel reports
> > > > > > > > > > the following PCI bar assignments when running qemu v5.2.
> > > > > > > > > >
> > > > > > > > > > [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > > > [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > > > [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > >
> > > > > > IIUC, these lines are read back from the BARs
> > > > > >
> > > > > > > > > > [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > > > > > [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > > > > > 0x10000000-0x1000007f]
> > > > > > > > > >
> > > > > >
> > > > > > ... and this is the assignment created by the kernel.
> > > > > >
> > > > > > > > > > With qemu v6.0, the assignment is reported as follows.
> > > > > > > > > >
> > > > > > > > > > [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > > > > > [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > > > > > [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > > > > >
> > > > > >
> > > > > > The problem here is that Linux, for legacy reasons, does not support
> > > > > > I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > > > rejected.
> > > > > >
> > > > > > This might make sense on x86, where legacy I/O ports may exist, but on
> > > > > > other architectures, this makes no sense.
> > > > >
> > > > >
> > > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > > that trip up existing guests, right?
> > > > >
> > > >
> > > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > > those mappings for existing guests either.
> > >
> > > I would say both. But about QEMU actually I think you have a point here.
> > > Re-reading the spec:
> > >
> > > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > > that have not been configured by the firmware. There may be a reduced level of hot plug
> > > capability support in this hierarchy due to resource constraints. This situation is the same as
> > > the legacy situation where this _DSM is not provided.)
> > > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> > >
> > >
> > > I think I misread the spec previously, and understood it to mean that
> > > 1 means must ignore. In fact 1 gives the most flexibility.
> > > So why are we suddenly telling the guest it must not override
> > > firmware?
> > >
> > > The commit log says
> > >     The diffences could result in resource assignment failure.
> > >
> > > which is kind of vague ...
> > >
> > > Jiahui Cen, Igor, what do you think about it?
> > > I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
> > > at least for now.
> > Looking at patch history, it seems consensus was that it's better to
> > enforce firmware allocations.
> >
> > Also letting OS do as it pleases might break PCI devices that
> > don't tolerate reallocation. ex: firmware initializes PCI device
> > IO/BARs and then fetches ACPI tables, which get patched with
> > assigned resources.
> >
> > to me returning 0 seems to be correct choice.
> > In addition resource hinting also works via firmware allocations,
> > if we revert the commit it might change those configs.
>
>
> Well if firmware people now tell us their allocations were never
> intended for guest OS use then maybe we should not intervene.
>

DSM #5 was introduced to permit firmware running on x86_64 systems to
boot 32-bit OSes (read Windows) unmodified, while still leaving
enlightened, 64-bit OSes the opportunity to reorganize the BARs if
there is sufficient space in the resource windows, and if the OS runs
in long mode so it can address all of it.

This is why the default-if-absent according to the spec is '0', and I
already explained up-thread why arm64 deviates from this.

But Igor has a point: there are cases where especially bus numbers
should not be touched, as firmware tables consumed by the OS may carry
b/d/f identifiers for things like SMMU pass through, where changing
the bus numbers obviously invalidates this information.

These are exceptional cases, though, and I would argue that these
should be considered individually, rather than setting DSM #5 to 0x0
simply because there might be cases where not doing so could
theoretically break things, given that doing so has proven to break
things.


> As others noted the original commit was kind of vague:
>
> 1. it said "Using _DSM #5 method to inform guest os not to ignore the PCI configuration
> that firmware has done at boot time could handle the differences."
> which is not what the spec says and not what the patch did -
> guest os does not ignore configuration even without this,
> it is just allowed to change it.
>
>
> 2. is says could result but does not report whether that happened in the
> field.
>
>
> Given this causes a regression I'm inclined to just revert for now.
> We can figure it out for the next release.
>

For a revert of commit 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14, feel
free to include

Acked-by: Ard Biesheuvel <ardb@kernel.org>

and please also involve me if any future debates on this subject flare up again.

-- 
Ard.

>
> >
> > me wonders if there is a way make enforcement per device.
>
> devices shouldn't normally care by the spec.
>
>
> > > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > > qemu. That solves my immediate problem, giving us time to find a solution
> > > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > > else has noticed the problem, so there is no real urgency.
> > > >
> > > > Thanks,
> > > > Guenter
> > >
> > > Well it's not like we have an army of testers, I think we should
> > > treat each problem report seriously ...
> > >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-28 13:25                   ` Ard Biesheuvel
@ 2021-07-28 14:03                     ` Guenter Roeck
  2021-07-29  8:08                       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 31+ messages in thread
From: Guenter Roeck @ 2021-07-28 14:03 UTC (permalink / raw)
  To: Ard Biesheuvel, Michael S. Tsirkin
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Bjorn Helgaas,
	Igor Mammedov, Philippe Mathieu-Daudé

On 7/28/21 6:25 AM, Ard Biesheuvel wrote:
> On Wed, 28 Jul 2021 at 15:11, Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Tue, Jul 27, 2021 at 12:36:03PM +0200, Igor Mammedov wrote:
>>> On Tue, 27 Jul 2021 05:01:23 -0400
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>
>>>> On Mon, Jul 26, 2021 at 10:12:38PM -0700, Guenter Roeck wrote:
>>>>> On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
>>>>>> On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
>>>>>>> (cc Bjorn)
>>>>>>>
>>>>>>> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>>>>>
>>>>>>>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
>>>>>>>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
>>>>>>>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
>>>>>>>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
>>>>>>>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
>>>>>>>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
>>>>>>>>>>> affects
>>>>>>>>>>> aarch64, not x86/x86_64.
>>>>>>>>>>>
>>>>>>>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
>>>>>>>>>>> keep firmware resource map"). Since this commit, PCI device BAR
>>>>>>>>>>> allocation has changed. Taking tulip as example, the kernel reports
>>>>>>>>>>> the following PCI bar assignments when running qemu v5.2.
>>>>>>>>>>>
>>>>>>>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>>>
>>>>>>> IIUC, these lines are read back from the BARs
>>>>>>>
>>>>>>>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
>>>>>>>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
>>>>>>>>>>> 0x10000000-0x1000007f]
>>>>>>>>>>>
>>>>>>>
>>>>>>> ... and this is the assignment created by the kernel.
>>>>>>>
>>>>>>>>>>> With qemu v6.0, the assignment is reported as follows.
>>>>>>>>>>>
>>>>>>>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
>>>>>>>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
>>>>>>>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
>>>>>>>>>>>
>>>>>>>
>>>>>>> The problem here is that Linux, for legacy reasons, does not support
>>>>>>> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
>>>>>>> rejected.
>>>>>>>
>>>>>>> This might make sense on x86, where legacy I/O ports may exist, but on
>>>>>>> other architectures, this makes no sense.
>>>>>>
>>>>>>
>>>>>> Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
>>>>>> that trip up existing guests, right?
>>>>>>
>>>>>
>>>>> I think it is difficult to draw a line. Sure, maybe EFI should not create
>>>>> such mappings, but then maybe qemu should not suddenly start to enforce
>>>>> those mappings for existing guests either.
>>>>
>>>> I would say both. But about QEMU actually I think you have a point here.
>>>> Re-reading the spec:
>>>>
>>>> 0: No (The operating system shall not ignore the PCI configuration that firmware has done
>>>> at boot time. However, the operating system is free to configure the devices in this hierarchy
>>>> that have not been configured by the firmware. There may be a reduced level of hot plug
>>>> capability support in this hierarchy due to resource constraints. This situation is the same as
>>>> the legacy situation where this _DSM is not provided.)
>>>> 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
>>>> at boot time, and reconfigure/rebalance the resources in the hierarchy.)
>>>>
>>>>
>>>> I think I misread the spec previously, and understood it to mean that
>>>> 1 means must ignore. In fact 1 gives the most flexibility.
>>>> So why are we suddenly telling the guest it must not override
>>>> firmware?
>>>>
>>>> The commit log says
>>>>      The diffences could result in resource assignment failure.
>>>>
>>>> which is kind of vague ...
>>>>
>>>> Jiahui Cen, Igor, what do you think about it?
>>>> I'm inclined to revert 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14
>>>> at least for now.
>>> Looking at patch history, it seems consensus was that it's better to
>>> enforce firmware allocations.
>>>
>>> Also letting OS do as it pleases might break PCI devices that
>>> don't tolerate reallocation. ex: firmware initializes PCI device
>>> IO/BARs and then fetches ACPI tables, which get patched with
>>> assigned resources.
>>>
>>> to me returning 0 seems to be correct choice.
>>> In addition resource hinting also works via firmware allocations,
>>> if we revert the commit it might change those configs.
>>
>>
>> Well if firmware people now tell us their allocations were never
>> intended for guest OS use then maybe we should not intervene.
>>
> 
> DSM #5 was introduced to permit firmware running on x86_64 systems to
> boot 32-bit OSes (read Windows) unmodified, while still leaving
> enlightened, 64-bit OSes the opportunity to reorganize the BARs if
> there is sufficient space in the resource windows, and if the OS runs
> in long mode so it can address all of it.
> 
> This is why the default-if-absent according to the spec is '0', and I
> already explained up-thread why arm64 deviates from this.
> 
> But Igor has a point: there are cases where especially bus numbers
> should not be touched, as firmware tables consumed by the OS may carry
> b/d/f identifiers for things like SMMU pass through, where changing
> the bus numbers obviously invalidates this information.
> 
> These are exceptional cases, though, and I would argue that these
> should be considered individually, rather than setting DSM #5 to 0x0
> simply because there might be cases where not doing so could
> theoretically break things, given that doing so has proven to break
> things.
> 
> 
>> As others noted the original commit was kind of vague:
>>
>> 1. it said "Using _DSM #5 method to inform guest os not to ignore the PCI configuration
>> that firmware has done at boot time could handle the differences."
>> which is not what the spec says and not what the patch did -
>> guest os does not ignore configuration even without this,
>> it is just allowed to change it.
>>
>>
>> 2. is says could result but does not report whether that happened in the
>> field.
>>
>>
>> Given this causes a regression I'm inclined to just revert for now.
>> We can figure it out for the next release.
>>
> 
> For a revert of commit 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14, feel
> free to include
> 
> Acked-by: Ard Biesheuvel <ardb@kernel.org>
> 

and:

Tested-by: Guenter Roeck <linux@roeck-us.net>

> and please also involve me if any future debates on this subject flare up again.
> 

Same here.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-28 14:03                     ` Guenter Roeck
@ 2021-07-29  8:08                       ` Philippe Mathieu-Daudé
  2021-07-29 14:42                         ` Bjorn Helgaas
  0 siblings, 1 reply; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-29  8:08 UTC (permalink / raw)
  To: Guenter Roeck, Ard Biesheuvel, Michael S. Tsirkin
  Cc: Bjorn Helgaas, Igor Mammedov, Ard Biesheuvel, Jiahui Cen, qemu-devel

On 7/28/21 4:03 PM, Guenter Roeck wrote:
> On 7/28/21 6:25 AM, Ard Biesheuvel wrote:
>> On Wed, 28 Jul 2021 at 15:11, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>
>>> On Tue, Jul 27, 2021 at 12:36:03PM +0200, Igor Mammedov wrote:

>>> As others noted the original commit was kind of vague:
>>>
>>> 1. it said "Using _DSM #5 method to inform guest os not to ignore the
>>> PCI configuration
>>> that firmware has done at boot time could handle the differences."
>>> which is not what the spec says and not what the patch did -
>>> guest os does not ignore configuration even without this,
>>> it is just allowed to change it.
>>>
>>>
>>> 2. is says could result but does not report whether that happened in the
>>> field.
>>>
>>>
>>> Given this causes a regression I'm inclined to just revert for now.
>>> We can figure it out for the next release.
>>>
>>
>> For a revert of commit 0cf8882fd06ba0aeb1e90fa6f23fce85504d7e14, feel
>> free to include
>>
>> Acked-by: Ard Biesheuvel <ardb@kernel.org>
>>
> 
> and:
> 
> Tested-by: Guenter Roeck <linux@roeck-us.net>
> 
>> and please also involve me if any future debates on this subject flare
>> up again.

Thanks all, this thread has been very instructive.

Michael, if describing the issue in the revert is too complex, could you
include a link to this thread in the revert description?
(Message-Id: <20210724185234.GA2265457@roeck-us.net> or
https://www.mail-archive.com/qemu-devel@nongnu.org/msg826392.html)



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-29  8:08                       ` Philippe Mathieu-Daudé
@ 2021-07-29 14:42                         ` Bjorn Helgaas
  2021-07-29 15:59                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2021-07-29 14:42 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Guenter Roeck, Ard Biesheuvel, Michael S. Tsirkin, Igor Mammedov,
	Ard Biesheuvel, qemu-devel, Jiahui Cen

On Thu, Jul 29, 2021 at 3:08 AM Philippe Mathieu-Daudé
<philmd@redhat.com> wrote:

> Michael, if describing the issue in the revert is too complex, could you
> include a link to this thread in the revert description?
> (Message-Id: <20210724185234.GA2265457@roeck-us.net> or
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg826392.html)

Or https://lore.kernel.org/r/20210724185234.GA2265457@roeck-us.net/,
which is a convenient, ad-free, long-term, text-only, tools-friendly
archive maintained by the Linux Foundation.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-29 14:42                         ` Bjorn Helgaas
@ 2021-07-29 15:59                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2021-07-29 15:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiahui Cen, Ard Biesheuvel, qemu-devel, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Ard Biesheuvel, Guenter Roeck

On Thu, Jul 29, 2021 at 09:42:52AM -0500, Bjorn Helgaas wrote:
> On Thu, Jul 29, 2021 at 3:08 AM Philippe Mathieu-Daudé
> <philmd@redhat.com> wrote:
> 
> > Michael, if describing the issue in the revert is too complex, could you
> > include a link to this thread in the revert description?
> > (Message-Id: <20210724185234.GA2265457@roeck-us.net> or
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg826392.html)
> 
> Or https://lore.kernel.org/r/20210724185234.GA2265457@roeck-us.net/,
> which is a convenient, ad-free, long-term, text-only, tools-friendly
> archive maintained by the Linux Foundation.

OK, thanks!

-- 
MST



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: aarch64 efi boot failures with qemu 6.0+
  2021-07-27 10:14                     ` Ard Biesheuvel
@ 2022-03-18 11:48                       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Pieralisi @ 2022-03-18 11:48 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jiahui Cen, Michael S. Tsirkin, Ard Biesheuvel, qemu-devel,
	Bjorn Helgaas, Igor Mammedov, Philippe Mathieu-Daudé,
	Guenter Roeck

On Tue, Jul 27, 2021 at 12:14:48PM +0200, Ard Biesheuvel wrote:
> (+ Lorenzo)
> 
> On Tue, 27 Jul 2021 at 12:07, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 27, 2021 at 11:50:23AM +0200, Ard Biesheuvel wrote:
> > > On Tue, 27 Jul 2021 at 11:30, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> > > > > On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> > > > > >
> > > > > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > > > >> (cc Bjorn)
> > > > > > >>
> > > > > > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > > > >>>
> > > > > > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > > > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > > > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > > > >>>>>> Hi all,
> > > > > > >>>>>>
> > > > > > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > > > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > > > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > > > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > > >>>>>> affects
> > > > > > >>>>>> aarch64, not x86/x86_64.
> > > > > > >>>>>>
> > > > > > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > > > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > > > > > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > > > > > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > > > > > >>>>>>
> > > > > > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > >>
> > > > > > >> IIUC, these lines are read back from the BARs
> > > > > > >>
> > > > > > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > > > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > > > >>>>>> 0x10000000-0x1000007f]
> > > > > > >>>>>>
> > > > > > >>
> > > > > > >> ... and this is the assignment created by the kernel.
> > > > > > >>
> > > > > > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > > > > > >>>>>>
> > > > > > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > > > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > > > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > > > >>>>>>
> > > > > > >>
> > > > > > >> The problem here is that Linux, for legacy reasons, does not support
> > > > > > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > > > >> rejected.
> > > > > > >>
> > > > > > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > > > > > >> other architectures, this makes no sense.
> > > > > > >
> > > > > > >
> > > > > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > > > > that trip up existing guests, right?
> > > > > > >
> > > > > >
> > > > > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > > > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > > > > those mappings for existing guests either.
> > > > > >
> > > > >
> > > > > EFI creates the mappings primarily for itself, and up until DSM #5
> > > > > started to be enforced, all PCI resource allocations that existed at
> > > > > boot were ignored by Linux and recreated from scratch.
> > > > >
> > > > > Also, the commit in question looks dubious to me. I don't think it is
> > > > > likely that Linux would fail to create a resource tree. What does
> > > > > happen is that BARs get moved around, which may cause trouble in some
> > > > > cases: for instance, we had to add special code to the EFI framebuffer
> > > > > driver to copy with framebuffer BARs being relocated.
> > > > >
> > > > > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > > > > qemu. That solves my immediate problem, giving us time to find a solution
> > > > > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > > > > else has noticed the problem, so there is no real urgency.
> > > > > >
> > > > >
> > > > > I would argue that it is better to revert that commit. DSM #5 has a
> > > > > long history of debate and misinterpretation, and while I think we
> > > > > ended up with something sane, I don't think we should be using it in
> > > > > this particular case.
> > > >
> > > > I think revert might make sense, however:
> > > >
> > > > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > > > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > > > that have not been configured by the firmware. There may be a reduced level of hot plug
> > > > capability support in this hierarchy due to resource constraints. This situation is the same as
> > > > the legacy situation where this _DSM is not provided.)
> > > >
> > > > ^^^^ does not this imply that reporting a 0 as we currently do
> > > >      should be mostly a NOP?
> > > >
> > >
> > > Not really. The resource allocation strategies are different between
> > > EDK2 and Linux, and as Guenter's testing proves, EDK2 may lay out PCI
> > > resources in a way that interferes with Linux's expectations. The I/O
> > > port 0x0 problem is just one potential issue here: another issue is
> > > resource padding for hotplug, which is important for VMs, not only the
> > > IO/MEM resource allocations, but the bus ranges as well.
> >
> > Hmm not sure I understand the answer. The text above seems to say
> > that 0 should be the same as _DSM 5 is not provided, does it not?
> 
> That is what the spec says, but it has never been what Linux/arm64
> does. Its PCI arch code is based on 32-bit ARM, which uses simple
> bootloaders that are completely unaware of the existence of PCI, and
> so from the beginning, we have always reassigned all resources (but
> not bus numbers IIRC) because that is what ARM did. On arm64, of
> course, we often do have rich firmware that initializes PCI, but by
> that time, the cat was out of the bag already, and we could not simply
> stop reassigning resources without running a substantial regression
> risk, even when booting in ACPI mode.
> 
> So the default behavior on arm64 is '1' not '0' in terms of DSM #5.

So, to summarize, what happens here is that if the _DSM #5 returns
0, we try to keep the BAR content, which in Linux terms means we
"claim" the resources and succeed, correct ?

In other words: pci_bus_claim_resources(), for IO BARs succeeds,
it is not the claiming that fails but kernel code that tries to
access IO resources at 0x0 and deems it inappropriate on arm64.

The issue then is that the kernel rejects IO port addresses on PCI
starting at 0x0 - are we able to pinpoint where the actual bug
kicks in in this specific case ?

Thanks,
Lorenzo

> 
> > Why did behaviour change when we switched from not providing _DSM 5
> > to providing but returning 0?
> >
> >
> > > >
> > > > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > > > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> > > >
> > > >
> > > > So I am debating with myself whether this should be a plain revert or
> > > > return 1 here:
> > > >      /*
> > > >       * 0 - The operating system must not ignore the PCI configuration that
> > > >       *     firmware has done at boot time.
> > > >       */
> > > >      aml_append(ifctx1, aml_return(aml_int(0)));
> > > > -    aml_append(ifctx, ifctx1);
> > > > +    aml_append(ifctx1, aml_return(aml_int(1)));
> > > >      aml_append(method, ifctx);
> > > >
> > >
> > > I agree that returning '1' here is a better choice, as it explicitly
> > > gives the OS license to reassign all resources, which is what we have
> > > been relying on to begin with.
> > >
> > > OTOH, I do think we should fix arbitrary zero checks in Linux that
> > > make no sense on !x86
> > >
> > > >
> > > >
> > > > Guenter what happens if we return 1? Do things work well?
> > > >
> > > > --
> > > > MST
> > > >
> >


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-03-18 11:51 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-24 18:52 aarch64 efi boot failures with qemu 6.0+ Guenter Roeck
2021-07-25 22:14 ` Michael S. Tsirkin
2021-07-25 22:56   ` Guenter Roeck
2021-07-26  9:08     ` Philippe Mathieu-Daudé
2021-07-26 16:00       ` Ard Biesheuvel
2021-07-26 21:16         ` Bjorn Helgaas
2021-07-26 21:31           ` Bjorn Helgaas
2021-07-26 21:31             ` Bjorn Helgaas
2021-07-27  4:22             ` Guenter Roeck
2021-07-27  4:22               ` Guenter Roeck
2021-07-27 14:25               ` Bjorn Helgaas
2021-07-27 14:25                 ` Bjorn Helgaas
2021-07-27  4:45         ` Michael S. Tsirkin
2021-07-27  5:12           ` Guenter Roeck
2021-07-27  7:04             ` Ard Biesheuvel
2021-07-27  9:02               ` Michael S. Tsirkin
2021-07-27  9:30               ` Michael S. Tsirkin
2021-07-27  9:50                 ` Ard Biesheuvel
2021-07-27 10:07                   ` Michael S. Tsirkin
2021-07-27 10:14                     ` Ard Biesheuvel
2022-03-18 11:48                       ` Lorenzo Pieralisi
2021-07-27 11:18                 ` Guenter Roeck
2021-07-27  9:01             ` Michael S. Tsirkin
2021-07-27 10:36               ` Igor Mammedov
2021-07-27 11:32                 ` Guenter Roeck
2021-07-28 13:11                 ` Michael S. Tsirkin
2021-07-28 13:25                   ` Ard Biesheuvel
2021-07-28 14:03                     ` Guenter Roeck
2021-07-29  8:08                       ` Philippe Mathieu-Daudé
2021-07-29 14:42                         ` Bjorn Helgaas
2021-07-29 15:59                           ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.