All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Mark Brown <broonie@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Mika Westerberg <mika.westerberg@linux.intel.com>,
	kernelci-results@groups.io, bot@kernelci.org,
	gtucker@collabora.com, linux-pci@vger.kernel.org
Subject: Re: next/master bisection: baseline.login on asus-C523NA-A20057-coral
Date: Mon, 4 Apr 2022 10:45:10 +0200	[thread overview]
Message-ID: <76c5de03-a3a4-8444-d7f6-496fa119d830@redhat.com> (raw)
In-Reply-To: <20220330113550.GA1638045@bhelgaas>

Hi,

On 3/30/22 13:35, Bjorn Helgaas wrote:
> On Mon, Mar 28, 2022 at 02:54:42PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 3/24/22 23:19, Mark Brown wrote:
>>> On Thu, Mar 24, 2022 at 09:34:30PM +0100, Hans de Goede wrote:
>>>
>>>> Mark, if one of use writes a test patch, can you get that Asus machine to boot a
>>>> kernel build from next + the test patch ?
>>>
>>> I can't directly unfortunately as the board is in Collabora's lab but
>>> Guillaume (who's already CCed) ought to be able to, and I can generally
>>> prod and try to get that done.
>>
>> Ok, Guillaume, can you try a kernel with commit 5949965ec9340cfc0e65f7d8a576b660b26e2535
>> ("x86/PCI: Preserve host bridge windows completely covered by E820") + the 
>> attached patch added on top a try on the asus-C523NA-A20057-coral machine please
>> and see if that makes it boot again ?
>>
>> Regards,
>>
>> Hans
> 
>> From b8080a6d2d889847900e1408f71d0c01c73f5c94 Mon Sep 17 00:00:00 2001
>> From: Hans de Goede <hdegoede@redhat.com>
>> Date: Mon, 28 Mar 2022 14:47:41 +0200
>> Subject: [PATCH] x86/PCI: Limit "e820 entry fully covers window" check to non
>>  ISA MMIO addresses
>>
>> Commit FIXME ("x86/PCI: Preserve host bridge windows completely
>> covered by E820") added a check to skip e820 table entries which
>> fully cover a PCI bride's memory window when clipping PCI bridge
>> memory windows.
>>
>> This check also caused ISA MMIO windows to not get clipped when
>> fully covered, which is causing some coreboot based Chromebooks
>> to not boot.
>>
>> Modify the fully covered check to not apply to ISA MMIO windows.
> 
> I'd like to include URLs to the kernelci results unless they are
> ephemeral.  There's a lot of valuable information in these:
> 
>   Asus C523NA-A20057-coral with the last good commit:
>   https://lava.collabora.co.uk/scheduler/job/5937945
> 
>   https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html
>   https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-hp-x360-12b-n4000-octopus.html

Ok, I'll include this in any future patches for this.

> 
>> Fixes: FIXME ("x86/PCI: Preserve host bridge windows completely covered by E820")
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>  arch/x86/kernel/resource.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
>> index 6be82e16e5f4..d9ec913619c3 100644
>> --- a/arch/x86/kernel/resource.c
>> +++ b/arch/x86/kernel/resource.c
>> @@ -46,8 +46,12 @@ void remove_e820_regions(struct device *dev, struct resource *avail)
>>  		 * devices.  But if it covers the *entire* resource, it's
>>  		 * more likely just telling us that this is MMIO space, and
>>  		 * that doesn't need to be removed.
>> +		 * Note this *entire* resource covering check is only
>> +		 * intended for 32 bit memory resources for the 16 bit
>> +		 * isa window we always apply the e820 entries.
>>  		 */
>> -		if (e820_start <= avail->start && avail->end <= e820_end) {
>> +		if (avail->start >= ISA_END_ADDRESS &&
> 
> What is the justification for needing to check ISA_END_ADDRESS here?
> The commit log basically says "this makes it work", which isn't very
> satisfying.

I did not have a log with the:

>   acpi PNP0A08:00: clipped [mem 0x000a0000-0x000bffff window] to [mem 0x00100000-0x000bffff window] for e820 entry [mem 0x000a0000-0x000fffff]
>   acpi PNP0A08:00: clipped [mem 0x7b800000-0x7fffffff window] to [mem 0x80000000-0x7fffffff window] for e820 entry [mem 0x7b000000-0x7fffffff]
>   acpi PNP0A08:00: clipped [mem 0x80000000-0xe0000000 window] to [mem 0x80000000-0xcfffffff window] for e820 entry [mem 0xd0000000-0xd0ffffff]

messages. Instead I was looking at this log:

https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html

With the following messages (as I quoted higher up in the email-thread):

"""
 1839 17:54:41.406548  <6>[    0.000000] BIOS-provided physical RAM map:
 1840 17:54:41.413121  <6>[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] type 16
 1841 17:54:41.419712  <6>[    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
 1842 17:54:41.430192  <6>[    0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
 1843 17:54:41.436207  <6>[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000000fffffff] usable
 1844 17:54:41.446353  <6>[    0.000000] BIOS-e820: [mem 0x0000000010000000-0x0000000012150fff] reserved
 1845 17:54:41.453290  <6>[    0.000000] BIOS-e820: [mem 0x0000000012151000-0x000000007a9fcfff] usable
 1846 17:54:41.459966  <6>[    0.000000] BIOS-e820: [mem 0x000000007a9fd000-0x000000007affffff] type 16
 1847 17:54:41.469549  <6>[    0.000000] BIOS-e820: [mem 0x000000007b000000-0x000000007fffffff] reserved
 1848 17:54:41.476685  <6>[    0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000d0ffffff] reserved
 1849 17:54:41.486439  <6>[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
 1850 17:54:41.492994  <6>[    0.000000] BIOS-e820: [mem 0x00000000fed10000-0x00000000fed17fff] reserved
 1851 17:54:41.503008  <6>[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000017fffffff] usable
...
 2030 17:54:42.809183  <6>[    0.313771] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
 2031 17:54:42.819092  <6>[    0.314424] pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]
"""

###

What I find weird here is that this boot with a somewhat earlier kernel has:

 2030 17:54:42.809183  <6>[    0.313771] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
 2031 17:54:42.819092  <6>[    0.314424] pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]

Where as the boot with the clipped messages has:

<6>[    0.313705] acpi PNP0A08:00: ignoring host bridge window [mem 0x00100000-0x000bffff window] (conflicts with PCI mem [mem 0x00000000-0x7fffffffff])
<6>[    0.314702] acpi PNP0A08:00: ignoring host bridge window [mem 0x80000000-0x7fffffff window] (conflicts with PCI mem [mem 0x00000000-0x7fffffffff])
<6>[    0.315747] PCI host bridge to bus 0000:00
<6>[    0.316118] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
<6>[    0.316703] pci_bus 0000:00: root bus resource [io  0x1000-0xffff window]
<6>[    0.317298] pci_bus 0000:00: root bus resource [mem 0x80000000-0xcfffffff window]
<6>[    0.317703] pci_bus 0000:00: root bus resource [bus 00-ff]

So in the boot with the clipped messages we are getting 3 windows from _CRS
where as before we were getting only 2?  I know that we are now applying
the clipping directly when we are parsing the resources. So I guess that
before we somehow also merged the 2 resources which are back to back together
before the "root bus resource" messages get printed. This caused me to just
see the "root bus resource [mem 0x7b800000-0xe0000000 window]" which is
not fully covered which is why I focused on the ISA MMIO window.

> The Asus log of the last good commit shows:
> 
>   PCI: 00:0d.0 [8086/5a92] enabled
>   constrain_resources: PCI: 00:0d.0 10 base d0000000 limit d0ffffff mem (fixed)
>   ...
>   BIOS-e820: [mem 0x000000007b000000-0x000000007fffffff] reserved
>   BIOS-e820: [mem 0x00000000d0000000-0x00000000d0ffffff] reserved
>   BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
>   ...
>   acpi PNP0A08:00: clipped [mem 0x000a0000-0x000bffff window] to [mem 0x00100000-0x000bffff window] for e820 entry [mem 0x000a0000-0x000fffff]
>   acpi PNP0A08:00: clipped [mem 0x7b800000-0x7fffffff window] to [mem 0x80000000-0x7fffffff window] for e820 entry [mem 0x7b000000-0x7fffffff]
>   acpi PNP0A08:00: clipped [mem 0x80000000-0xe0000000 window] to [mem 0x80000000-0xcfffffff window] for e820 entry [mem 0xd0000000-0xd0ffffff]
>   acpi PNP0A08:00: ignoring host bridge window [mem 0x00100000-0x000bffff window] (conflicts with PCI mem [mem 0x00000000-0x7fffffffff])
>   acpi PNP0A08:00: ignoring host bridge window [mem 0x80000000-0x7fffffff window] (conflicts with PCI mem [mem 0x00000000-0x7fffffffff])
> 
> It looks like _CRS gave us [mem 0x80000000-0xe0000000 window], which
> is one byte too big (it should end at 0xdfffffff).

Yeah but that gets clipped off anyways, so that should not matter.

s> 
> From the firmware part of the log, it looks like 00:0d.0 is a hidden
> device that consumes [mem d0000000-0xd0ffffff].  Linux doesn't
> enumerate 00:0d.0, so firmware should have carved that out of the [mem
> 0x80000000-0xe0000000 window] in _CRS.
> 
> We don't have a log with 5949965ec934 ("x86/PCI: Preserve host bridge
> windows completely covered by E820") applied, but I think it would
> show this:
> 
>   acpi PNP0A08:00: resource [mem 0x000a0000-0x000bffff window] fully covered by e820 entry [mem 0x000a0000-0x000fffff]
>   acpi PNP0A08:00: resource [mem 0x7b800000-0x7fffffff window] fully covered by e820 entry [mem 0x7b000000-0x7fffffff]
> 
> instead of clipping those windows.  But none of the devices we
> enumerate appears to be using either of those windows.

Not with a working kernel no, because they are clipped of, but
with the don't clip fully-covered _CRS windows change, the 
[mem 0x7b000000-0x7fffffff] all of a sudden becomes fair game
to assign BARs to.

I agree that we will get a fully-covered msg for that one with
the patch, which would change:

[    0.317298] pci_bus 0000:00: root bus resource [mem 0x80000000-0xcfffffff window]

to:

[    0.317298] pci_bus 0000:00: root bus resource [mem 0x7b800000-0xcfffffff window]

and I believe that likely is our culprit.

So to fix this I guess that we first need to merge back-to-back
windows coming from _CRS into a single window, before calling
remove_e820_regions()

That would pass [mem 0x7b800000-0xe0000000 window] to remove_e820_regions()
in a single call (as I expected from the logs), which should result in
both the top and the bottom still getting clipped as before.

I've been looking around in the code a but I could not quickly find
a helper to do the back-to-back resource merging before calling
remove_e820_regions(), any suggestions for this?

Regards,

Hans







> 
> We do have this:
> 
>   pci 0000:00:18.2: reg 0x10: [mem 0xde000000-0xde000fff 64bit]
>   pci 0000:00:18.2: reg 0x18: [mem 0xc2b31000-0xc2b31fff 64bit]
>   pci 0000:00:18.2: can't claim BAR 0 [mem 0xde000000-0xde000fff 64bit]: no compatible bridge window
>   pci 0000:00:18.2: BAR 0: assigned [mem 0x80000000-0x80000fff 64bit]
> 
> Where the original [mem 0xde000000-0xde000fff 64bit] assignment was
> perfectly legal, but we clipped [mem 0x80000000-0xe0000000 window] to
> [mem 0x80000000-0xcfffffff window] instead of just punching a hole for
> the 00:0d.0 carve-out.
> 
> Maybe 5949965ec934 puts 00:18.2 BAR 0 somewhere that doesn't work,
> or maybe the clipping to [mem 0x00100000-0x000bffff window] or
> [mem 0x80000000-0x7fffffff window] doesn't work as expected?
> They are supposed to be interpreted as "empty", but certainly
> resource_size([0x00100000-0x000bffff]) is != 0.
> 
>> +		    e820_start <= avail->start && avail->end <= e820_end) {
>>  			dev_info(dev, "resource %pR fully covered by e820 entry [mem %#010Lx-%#010Lx]\n",
>>  				 avail, e820_start, e820_end);
>>  			continue;
>> -- 
>> 2.35.1
>>
> 


  reply	other threads:[~2022-04-04  8:45 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <623c13ec.1c69fb81.8cbdb.5a7a@mx.google.com>
2022-03-24 17:52 ` next/master bisection: baseline.login on asus-C523NA-A20057-coral Mark Brown
2022-03-24 20:34   ` Hans de Goede
2022-03-24 22:19     ` Mark Brown
2022-03-28 12:54       ` Hans de Goede
2022-03-29 18:44         ` Guillaume Tucker
2022-04-04 19:44           ` Guillaume Tucker
2022-04-05  8:13             ` Hans de Goede
2022-04-05 17:57             ` Bjorn Helgaas
     [not found]           ` <16E2C910B4947F17.5433@groups.io>
2022-04-04 19:48             ` Guillaume Tucker
2022-03-30 11:35         ` Bjorn Helgaas
2022-04-04  8:45           ` Hans de Goede [this message]
2022-04-06  0:19             ` Bjorn Helgaas
2022-04-11  9:54               ` Hans de Goede
2022-04-11  9:57                 ` Hans de Goede
2022-03-24 23:08     ` Bjorn Helgaas
2022-03-29 22:14   ` Bjorn Helgaas
2022-04-05 23:53   ` Bjorn Helgaas
2022-04-06 18:59     ` Bjorn Helgaas
2022-04-06 19:37       ` Mark Brown
2022-04-06 20:11         ` Guillaume Tucker
2022-04-07 15:17           ` Denys Fedoryshchenko
2022-04-06 20:56         ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=76c5de03-a3a4-8444-d7f6-496fa119d830@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=bot@kernelci.org \
    --cc=broonie@kernel.org \
    --cc=gtucker@collabora.com \
    --cc=helgaas@kernel.org \
    --cc=kernelci-results@groups.io \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.