qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: "Jiahui Cen" <cenjiahui@huawei.com>,
	"Ard Biesheuvel" <ardb+tianocore@kernel.org>,
	qemu-devel@nongnu.org, "Bjorn Helgaas" <bhelgaas@google.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Guenter Roeck" <linux@roeck-us.net>
Subject: Re: aarch64 efi boot failures with qemu 6.0+
Date: Tue, 27 Jul 2021 06:07:33 -0400
Message-ID: <20210727060550-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAMj1kXHtjZh_n-iBObPTDqdN8oV0DKtpXgRfUApNOYgVeYpCBA@mail.gmail.com>

On Tue, Jul 27, 2021 at 11:50:23AM +0200, Ard Biesheuvel wrote:
> On Tue, 27 Jul 2021 at 11:30, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> > > On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > > > >> (cc Bjorn)
> > > > >>
> > > > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > > > >>>
> > > > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > > > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > > > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > > > >>>>>> Hi all,
> > > > >>>>>>
> > > > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > > > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > > > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > > > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
> > > > > >>>>>> affects aarch64, not x86/x86_64.
> > > > >>>>>>
> > > > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > > > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > > > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > > > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > > > >>>>>>
> > > > >>>>>> [    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > >>>>>> [    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > >>>>>> [    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > >>
> > > > >> IIUC, these lines are read back from the BARs
> > > > >>
> > > > >>>>>> [    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
> > > > >>>>>> [    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem
> > > > >>>>>> 0x10000000-0x1000007f]
> > > > >>>>>>
> > > > >>
> > > > >> ... and this is the assignment created by the kernel.
> > > > >>
> > > > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > > > >>>>>>
> > > > >>>>>> [    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > > > >>>>>> [    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
> > > > >>>>>> [    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > > > >>>>>>
> > > > >>
> > > > >> The problem here is that Linux, for legacy reasons, does not support
> > > > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > > > >> rejected.
> > > > >>
> > > > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > > > >> other architectures, this makes no sense.
> > > > >
> > > > >
> > > > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > > > that trip up existing guests, right?
> > > > >
> > > >
> > > > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > > > such mappings, but then maybe qemu should not suddenly start to enforce
> > > > those mappings for existing guests either.
> > > >
> > >
> > > EFI creates the mappings primarily for itself, and up until DSM #5
> > > started to be enforced, all PCI resource allocations that existed at
> > > boot were ignored by Linux and recreated from scratch.
> > >
> > > Also, the commit in question looks dubious to me. I don't think it is
> > > likely that Linux would fail to create a resource tree. What does
> > > happen is that BARs get moved around, which may cause trouble in some
> > > cases: for instance, we had to add special code to the EFI framebuffer
> > > driver to cope with framebuffer BARs being relocated.
> > >
> > > > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > > > qemu. That solves my immediate problem, giving us time to find a solution
> > > > that is acceptable for everyone. After all, it doesn't look like anyone
> > > > else has noticed the problem, so there is no real urgency.
> > > >
> > >
> > > I would argue that it is better to revert that commit. DSM #5 has a
> > > long history of debate and misinterpretation, and while I think we
> > > ended up with something sane, I don't think we should be using it in
> > > this particular case.
> >
> > I think revert might make sense, however:
> >
> > 0: No (The operating system shall not ignore the PCI configuration that firmware has done
> > at boot time. However, the operating system is free to configure the devices in this hierarchy
> > that have not been configured by the firmware. There may be a reduced level of hot plug
> > capability support in this hierarchy due to resource constraints. This situation is the same as
> > the legacy situation where this _DSM is not provided.)
> >
> > ^^^^ does not this imply that reporting a 0 as we currently do
> >      should be mostly a NOP?
> >
> 
> Not really. The resource allocation strategies are different between
> EDK2 and Linux, and as Guenter's testing proves, EDK2 may lay out PCI
> resources in a way that interferes with Linux's expectations. The I/O
> port 0x0 problem is just one potential issue here: another issue is
> resource padding for hotplug, which is important for VMs, not only the
> IO/MEM resource allocations, but the bus ranges as well.

Hmm, I'm not sure I understand the answer. The text above seems to say
that returning 0 should be the same as not providing _DSM #5 at all,
does it not? So why did behaviour change when we switched from not
providing _DSM #5 to providing it but returning 0?
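
To make the question concrete, here is a minimal sketch of the two
readings. All names are hypothetical and nothing here is taken from the
Linux or QEMU sources. The first function follows the spec text quoted
above (absent is the same as 0); the second follows the behaviour
described in this thread, assuming allocations were recreated from
scratch whenever the _DSM was missing.

```c
#include <stdbool.h>

/* Hypothetical model of what _DSM function #5 can report. */
enum dsm5_result {
    DSM5_ABSENT   = -1, /* firmware does not implement function #5 */
    DSM5_PRESERVE = 0,  /* "0: No", OS shall not ignore firmware config */
    DSM5_REASSIGN = 1   /* "1: Yes", OS may reconfigure/rebalance */
};

/* Reading of the quoted spec text: a missing _DSM behaves like 0. */
bool spec_keeps_firmware_config(enum dsm5_result r)
{
    return r != DSM5_REASSIGN;
}

/*
 * Behaviour as reported in this thread (an assumption, not verified
 * against any OS source): only an explicit 0 pins the firmware layout;
 * without the _DSM the OS reassigns everything.
 */
bool observed_keeps_firmware_config(enum dsm5_result r)
{
    return r == DSM5_PRESERVE;
}
```

The two functions disagree only for DSM5_ABSENT, which is exactly the
case that changed when we started returning 0.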


> >
> > 1: Yes (The operating system may ignore the PCI configuration that the firmware has done
> > at boot time, and reconfigure/rebalance the resources in the hierarchy.)
> >
> >
> > So I am debating with myself whether this should be a plain revert or
> > return 1 here:
> >      /*
> >       * 0 - The operating system must not ignore the PCI configuration that
> >       *     firmware has done at boot time.
> >       */
> > -    aml_append(ifctx1, aml_return(aml_int(0)));
> > +    aml_append(ifctx1, aml_return(aml_int(1)));
> >      aml_append(ifctx, ifctx1);
> >      aml_append(method, ifctx);
> >
> 
> I agree that returning '1' here is a better choice, as it explicitly
> gives the OS license to reassign all resources, which is what we have
> been relying on to begin with.
> 
> OTOH, I do think we should fix arbitrary zero checks in Linux that
> make no sense on !x86
> 
> >
> >
> > Guenter what happens if we return 1? Do things work well?
> >
> > --
> > MST
> >


