From: Hans de Goede <hdegoede@redhat.com>
To: Mark Brown <broonie@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: kernelci-results@groups.io, bot@kernelci.org,
	gtucker@collabora.com, linux-pci@vger.kernel.org
Subject: Re: next/master bisection: baseline.login on asus-C523NA-A20057-coral
Date: Thu, 24 Mar 2022 21:34:30 +0100
Message-ID: <4e9fca2f-0af1-3684-6c97-4c35befd5019@redhat.com>
In-Reply-To: <Yjyv03JsetIsTJxN@sirena.org.uk>

Hi Mark,

Thank you for the report.

On 3/24/22 18:52, Mark Brown wrote:
> On Wed, Mar 23, 2022 at 11:47:08PM -0700, KernelCI bot wrote:
> 
> The KernelCI bisection bot has identified commit 5949965ec9340cfc0e
> ("x86/PCI: Preserve host bridge windows completely covered by E820")
> as causing a boot regression in next on asus-C523NA-A20057-coral (a
> Chromebook AIUI).  Unfortunately there's no useful output when starting
> the kernel.  I've left the full report below including links to the web
> dashboard.
> 
> The last successful boot in -next had this log:
> 
>    https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html

So the interesting bits from this log are:

 1839 17:54:41.406548  <6>[    0.000000] BIOS-provided physical RAM map:
 1840 17:54:41.413121  <6>[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] type 16
 1841 17:54:41.419712  <6>[    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
 1842 17:54:41.430192  <6>[    0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
 1843 17:54:41.436207  <6>[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000000fffffff] usable
 1844 17:54:41.446353  <6>[    0.000000] BIOS-e820: [mem 0x0000000010000000-0x0000000012150fff] reserved
 1845 17:54:41.453290  <6>[    0.000000] BIOS-e820: [mem 0x0000000012151000-0x000000007a9fcfff] usable
 1846 17:54:41.459966  <6>[    0.000000] BIOS-e820: [mem 0x000000007a9fd000-0x000000007affffff] type 16
 1847 17:54:41.469549  <6>[    0.000000] BIOS-e820: [mem 0x000000007b000000-0x000000007fffffff] reserved
 1848 17:54:41.476685  <6>[    0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000d0ffffff] reserved
 1849 17:54:41.486439  <6>[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
 1850 17:54:41.492994  <6>[    0.000000] BIOS-e820: [mem 0x00000000fed10000-0x00000000fed17fff] reserved
 1851 17:54:41.503008  <6>[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000017fffffff] usable
...
 2030 17:54:42.809183  <6>[    0.313771] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
 2031 17:54:42.819092  <6>[    0.314424] pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]

The main window [mem 0x7b800000-0xe0000000 window] is not fully covered by any
single e820 entry, so nothing should change for that resource.

But the ISA MMIO window: [mem 0x000a0000-0x000bffff window] is fully covered by:

BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved

So that window will now be left available for assigning resources, where before
it was not.
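
To make the reasoning above concrete, here is a small user-space sketch that applies the same comparison as the new check in remove_e820_regions() to the values from the log. The struct and helper names (range, fully_covered, covered_by_any) are made up for illustration and are not kernel code; only the addresses come from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, as printed in "[mem start-end]". */
struct range {
	uint64_t start;
	uint64_t end;
};

/* BIOS-e820 entries copied from the passing boot log. */
static const struct range log_e820[] = {
	{ 0x0000000000000000ULL, 0x0000000000000fffULL },
	{ 0x0000000000001000ULL, 0x000000000009ffffULL },
	{ 0x00000000000a0000ULL, 0x00000000000fffffULL },
	{ 0x0000000000100000ULL, 0x000000000fffffffULL },
	{ 0x0000000010000000ULL, 0x0000000012150fffULL },
	{ 0x0000000012151000ULL, 0x000000007a9fcfffULL },
	{ 0x000000007a9fd000ULL, 0x000000007affffffULL },
	{ 0x000000007b000000ULL, 0x000000007fffffffULL },
	{ 0x00000000d0000000ULL, 0x00000000d0ffffffULL },
	{ 0x00000000e0000000ULL, 0x00000000efffffffULL },
	{ 0x00000000fed10000ULL, 0x00000000fed17fffULL },
	{ 0x0000000100000000ULL, 0x000000017fffffffULL },
};
static const size_t log_e820_count = sizeof(log_e820) / sizeof(log_e820[0]);

/* Root bus windows copied from the same log. */
static const struct range isa_window  = { 0x000a0000ULL, 0x000bffffULL };
static const struct range main_window = { 0x7b800000ULL, 0xe0000000ULL };

/* Same comparison as the "fully covered" check added by 5949965ec934. */
static bool fully_covered(struct range e820, struct range win)
{
	assert(win.start <= win.end);
	return e820.start <= win.start && win.end <= e820.end;
}

/* True if at least one entry in the table fully covers the window. */
static bool covered_by_any(const struct range *e820, size_t n,
			   struct range win)
{
	for (size_t i = 0; i < n; i++) {
		if (fully_covered(e820[i], win))
			return true;
	}
	return false;
}
```

With these values, covered_by_any() is true for isa_window (because of the 0xa0000-0xfffff reserved entry) and false for main_window, which matches the analysis above.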

So I guess we should try adding a patch to skip the "fully covered" test for ISA MMIO space
and see if that helps?
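
As a rough idea of what such a test patch might decide, here is a user-space sketch of the policy only, not an actual patch. The 0xa0000-0x100000 bounds for the ISA MMIO hole and all names here (SKETCH_ISA_START, SKETCH_ISA_END, in_isa_mmio, preserve_window) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Legacy ISA MMIO hole, 640 KiB up to 1 MiB (assumed bounds for this sketch). */
#define SKETCH_ISA_START 0x000a0000ULL
#define SKETCH_ISA_END   0x00100000ULL	/* exclusive */

/* Inclusive address range, as printed in "[mem start-end]". */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Same comparison as the "fully covered" check added by 5949965ec934. */
static bool fully_covered(struct range e820, struct range win)
{
	assert(win.start <= win.end);
	return e820.start <= win.start && win.end <= e820.end;
}

/* True if the whole window lies inside the assumed ISA MMIO hole. */
static bool in_isa_mmio(struct range win)
{
	return win.start >= SKETCH_ISA_START && win.end < SKETCH_ISA_END;
}

/*
 * Hypothetical policy: return true if the window should be left unclipped
 * for this e820 entry. ISA MMIO windows never qualify, so they would go
 * through the clipping path as they did before the commit.
 */
static bool preserve_window(struct range e820, struct range win)
{
	if (in_isa_mmio(win))
		return false;
	return fully_covered(e820, win);
}
```

Under this sketch the [mem 0x000a0000-0x000bffff window] would be clipped again, while a case like the one in the commit message (e820 0x4bc50000-0xcfffffff enclosing window 0x65400000-0xbfffffff) would still be preserved.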

Bjorn, do you agree?

Mark, if one of us writes a test patch, can you get that Asus machine to boot a
kernel built from next + the test patch?

Regards,

Hans





> 
> I'd also note that the machine hp-x360-12b-n4000-octopus appears to have
> started failing at the same time with similar symptoms, failing log:
> 
>    https://storage.kernelci.org/next/master/next-20220324/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-hp-x360-12b-n4000-octopus.html
> 
> and passing log:
> 
>    https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-hp-x360-12b-n4000-octopus.html
> 
> though we didn't get a bisect for that yet.  That's also a Chromebook.
> 
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>> * This automated bisection report was sent to you on the basis  *
>> * that you may be involved with the breaking commit it has      *
>> * found.  No manual investigation has been done to verify it,   *
>> * and the root cause of the problem may be somewhere else.      *
>> *                                                               *
>> * If you do send a fix, please include this trailer:            *
>> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
>> *                                                               *
>> * Hope this helps!                                              *
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>>
>> next/master bisection: baseline.login on asus-C523NA-A20057-coral
>>
>> Summary:
>>   Start:      f8833a2b2356 Add linux-next specific files for 20220322
>>   Plain log:  https://storage.kernelci.org/next/master/next-20220322/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.txt
>>   HTML log:   https://storage.kernelci.org/next/master/next-20220322/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html
>>   Result:     5949965ec934 x86/PCI: Preserve host bridge windows completely covered by E820
>>
>> Checks:
>>   revert:     PASS
>>   verify:     PASS
>>
>> Parameters:
>>   Tree:       next
>>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>>   Branch:     master
>>   Target:     asus-C523NA-A20057-coral
>>   CPU arch:   x86_64
>>   Lab:        lab-collabora
>>   Compiler:   gcc-10
>>   Config:     x86_64_defconfig+x86-chromebook
>>   Test case:  baseline.login
>>
>> Breaking commit found:
>>
>> -------------------------------------------------------------------------------
>> commit 5949965ec9340cfc0e65f7d8a576b660b26e2535
>> Author: Bjorn Helgaas <bhelgaas@google.com>
>> Date:   Thu Mar 3 18:03:30 2022 -0600
>>
>>     x86/PCI: Preserve host bridge windows completely covered by E820
>>     
>>     Many folks have reported PCI devices not working.  It could affect any
>>     device, but most reports are for Thunderbolt controllers on Lenovo Yoga and
>>     Clevo Barebone laptops and the touchpad on Lenovo IdeaPads.
>>     
>>     In every report, a region in the E820 table entirely encloses a PCI host
>>     bridge window from _CRS, and because of 4dc2287c1805 ("x86: avoid E820
>>     regions when allocating address space"), we ignore the entire window,
>>     preventing us from assigning space to PCI devices.
>>     
>>     For example, the dmesg log [2] from bug report [1] shows:
>>     
>>       BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>       pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>       pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>>     
>>     The efi=debug dmesg log [3] from the same report shows the EFI memory map
>>     entries that created the E820 map:
>>     
>>       efi: mem47: [Reserved |   |WB|WT|WC|UC] range=[0x4bc50000-0x5fffffff]
>>       efi: mem48: [Reserved |   |WB|  |  |UC] range=[0x60000000-0x60ffffff]
>>       efi: mem49: [Reserved |   |  |  |  |  ] range=[0x61000000-0x653fffff]
>>       efi: mem50: [MMIO     |RUN|  |  |  |UC] range=[0x65400000-0xcfffffff]
>>     
>>     4dc2287c1805 ("x86: avoid E820 regions when allocating address space")
>>     works around issues where _CRS contains non-window address space that can't
>>     be used for PCI devices.  It does this by removing E820 regions from host
>>     bridge windows.  But in these reports, the E820 region covers the entire
>>     window, so 4dc2287c1805 makes it completely unusable.
>>     
>>     Per UEFI v2.8, sec 7.2, the EfiMemoryMappedIO type means:
>>     
>>       Used by system firmware to request that a memory-mapped IO region be
>>       mapped by the OS to a virtual address so it can be accessed by EFI
>>       runtime services.
>>     
>>     A host bridge window is definitely a memory-mapped IO region, and EFI
>>     runtime services may need to access it, so I don't think we can argue that
>>     this is a firmware defect.
>>     
>>     Instead, change the 4dc2287c1805 strategy so it only removes E820 regions
>>     when they overlap *part* of a host bridge window on the assumption that a
>>     partial overlap is really register space, not part of the window proper.
>>     
>>     If an E820 region covers the entire window from _CRS, assume the _CRS
>>     window is correct and do nothing.
>>     
>>     [1] https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>     [2] https://bugzilla.redhat.com/attachment.cgi?id=1711424
>>     [3] https://bugzilla.redhat.com/attachment.cgi?id=1861407
>>     
>>     BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>>     BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214259
>>     BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>     BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>>     BugLink: https://bugs.launchpad.net/bugs/1878279
>>     BugLink: https://bugs.launchpad.net/bugs/1931715
>>     BugLink: https://bugs.launchpad.net/bugs/1932069
>>     BugLink: https://bugs.launchpad.net/bugs/1921649
>>     Fixes: 4dc2287c1805 ("x86: avoid E820 regions when allocating address space")
>>     Link: https://lore.kernel.org/r/20220228105259.230903-1-hdegoede@redhat.com
>>     Based-on-patch-by: Hans de Goede <hdegoede@redhat.com>
>>     Link: https://lore.kernel.org/r/20220304035110.988712-4-helgaas@kernel.org
>>     Reported-by: Benoit Grégoire <benoitg@coeus.ca>   # BZ 206459
>>     Reported-by: wse@tuxedocomputers.com              # BZ 214259
>>     Tested-by: Matt Hansen <2lprbe78@duck.com>
>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>     Reviewed-by: Hans de Goede <hdegoede@redhat.com>
>>     Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
>>     Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
>> index 7378ea146976..90203217c359 100644
>> --- a/arch/x86/kernel/resource.c
>> +++ b/arch/x86/kernel/resource.c
>> @@ -39,6 +39,21 @@ void remove_e820_regions(struct device *dev, struct resource *avail)
>>  		e820_start = entry->addr;
>>  		e820_end = entry->addr + entry->size - 1;
>>  
>> +		/*
>> +		 * If an E820 entry covers just part of the resource, we
>> +		 * assume E820 is telling us about something like host
>> +		 * bridge register space that is unavailable for PCI
>> +		 * devices.  But if it covers the *entire* resource, it's
>> +		 * more likely just telling us that this is MMIO space, and
>> +		 * that doesn't need to be removed.
>> +		 */
>> +		if (e820_start <= avail->start && avail->end <= e820_end) {
>> +			dev_info(dev, "resource %pR fully covered by e820 entry [mem %#010Lx-%#010Lx]\n",
>> +				 avail, e820_start, e820_end);
>> +
>> +			continue;
>> +		}
>> +
>>  		resource_clip(avail, e820_start, e820_end);
>>  		if (orig.start != avail->start || orig.end != avail->end) {
>>  			dev_info(dev, "clipped %pR to %pR for e820 entry [mem %#010Lx-%#010Lx]\n",
>> -------------------------------------------------------------------------------
>>
>>
>> Git bisection log:
>>
>> -------------------------------------------------------------------------------
>> git bisect start
>> # good: [5628b8de1228436d47491c662dc521bc138a3d43] Merge tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
>> git bisect good 5628b8de1228436d47491c662dc521bc138a3d43
>> # bad: [f8833a2b23562be2dae91775127c8014c44d8566] Add linux-next specific files for 20220322
>> git bisect bad f8833a2b23562be2dae91775127c8014c44d8566
>> # bad: [d2de72259f3d22054272217eac92e624835bfc3b] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
>> git bisect bad d2de72259f3d22054272217eac92e624835bfc3b
>> # bad: [5920db3e4b50218dcf2101f3d87c3b69a1120981] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid.git
>> git bisect bad 5920db3e4b50218dcf2101f3d87c3b69a1120981
>> # bad: [b579dc07dce4637b7f2a3fb84394ebbd6666a81f] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
>> git bisect bad b579dc07dce4637b7f2a3fb84394ebbd6666a81f
>> # bad: [b5475dd9fab03e6867abad00ecb98e0d3827ad31] Merge branch 'for-next' of git://git.armlinux.org.uk/~rmk/linux-arm.git
>> git bisect bad b5475dd9fab03e6867abad00ecb98e0d3827ad31
>> # good: [7b72f3bb0907319e15765ae9dcf1f15fdd112bcf] Merge remote-tracking branch 'asoc/for-5.17' into asoc-linus
>> git bisect good 7b72f3bb0907319e15765ae9dcf1f15fdd112bcf
>> # bad: [077dc6bc0658177057bfd69ef3a990e6d8d32146] Merge branch 'gpio/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git
>> git bisect bad 077dc6bc0658177057bfd69ef3a990e6d8d32146
>> # good: [fd11727eec0dd95ee1b7d8f9f10ee60678eecc29] crypto: hisilicon/qm - fix memset during queues clearing
>> git bisect good fd11727eec0dd95ee1b7d8f9f10ee60678eecc29
>> # good: [646b907e1559f006c79a752ee3eebe220ceb983d] Merge tag 'asoc-v5.18' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
>> git bisect good 646b907e1559f006c79a752ee3eebe220ceb983d
>> # bad: [f8ed0b7c999405bd12ab9ebb0765e2baa7eb6184] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
>> git bisect bad f8ed0b7c999405bd12ab9ebb0765e2baa7eb6184
>> # good: [9fd75b66b8f68498454d685dc4ba13192ae069b0] ax25: Fix refcount leaks caused by ax25_cb_del()
>> git bisect good 9fd75b66b8f68498454d685dc4ba13192ae069b0
>> # good: [3fd177beee75eb2d7e5b19992e8c90eb1a141432] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
>> git bisect good 3fd177beee75eb2d7e5b19992e8c90eb1a141432
>> # good: [09005bef55291a99b491a47ce676dfb4f40f8edd] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
>> git bisect good 09005bef55291a99b491a47ce676dfb4f40f8edd
>> # good: [d13f73e9108a75209d03217d60462f51092499fe] x86/PCI: Log host bridge window clipping for E820 regions
>> git bisect good d13f73e9108a75209d03217d60462f51092499fe
>> # bad: [5949965ec9340cfc0e65f7d8a576b660b26e2535] x86/PCI: Preserve host bridge windows completely covered by E820
>> git bisect bad 5949965ec9340cfc0e65f7d8a576b660b26e2535
>> # first bad commit: [5949965ec9340cfc0e65f7d8a576b660b26e2535] x86/PCI: Preserve host bridge windows completely covered by E820
>> -------------------------------------------------------------------------------
>>
>>

