From: Hans de Goede <hdegoede@redhat.com>
To: Mark Brown <broonie@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: kernelci-results@groups.io, bot@kernelci.org,
	gtucker@collabora.com, linux-pci@vger.kernel.org
Subject: Re: next/master bisection: baseline.login on asus-C523NA-A20057-coral
Date: Thu, 24 Mar 2022 21:34:30 +0100
Message-ID: <4e9fca2f-0af1-3684-6c97-4c35befd5019@redhat.com>
In-Reply-To: <Yjyv03JsetIsTJxN@sirena.org.uk>

Hi Mark,

Thank you for the report.

On 3/24/22 18:52, Mark Brown wrote:
> On Wed, Mar 23, 2022 at 11:47:08PM -0700, KernelCI bot wrote:
> 
> The KernelCI bisection bot has identified commit 5949965ec9340cfc0e
> ("x86/PCI: Preserve host bridge windows completely covered by E820")
> as causing a boot regression in next on asus-C523NA-A20057-coral (a
> Chromebook AIUI).  Unfortunately there's no useful output when starting
> the kernel.  I've left the full report below including links to the web
> dashboard.
> 
> The last successful boot in -next had this log:
> 
>    https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html

So the interesting bits from this log are:

 1839 17:54:41.406548  <6>[    0.000000] BIOS-provided physical RAM map:
 1840 17:54:41.413121  <6>[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] type 16
 1841 17:54:41.419712  <6>[    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
 1842 17:54:41.430192  <6>[    0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
 1843 17:54:41.436207  <6>[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000000fffffff] usable
 1844 17:54:41.446353  <6>[    0.000000] BIOS-e820: [mem 0x0000000010000000-0x0000000012150fff] reserved
 1845 17:54:41.453290  <6>[    0.000000] BIOS-e820: [mem 0x0000000012151000-0x000000007a9fcfff] usable
 1846 17:54:41.459966  <6>[    0.000000] BIOS-e820: [mem 0x000000007a9fd000-0x000000007affffff] type 16
 1847 17:54:41.469549  <6>[    0.000000] BIOS-e820: [mem 0x000000007b000000-0x000000007fffffff] reserved
 1848 17:54:41.476685  <6>[    0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000d0ffffff] reserved
 1849 17:54:41.486439  <6>[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
 1850 17:54:41.492994  <6>[    0.000000] BIOS-e820: [mem 0x00000000fed10000-0x00000000fed17fff] reserved
 1851 17:54:41.503008  <6>[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000017fffffff] usable
...
 2030 17:54:42.809183  <6>[    0.313771] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
 2031 17:54:42.819092  <6>[    0.314424] pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]

The main [mem 0x7b800000-0xe0000000 window] is not fully covered by any single
e820 entry, so there should be no change for that resource.

But the ISA MMIO window: [mem 0x000a0000-0x000bffff window] is fully covered by:

BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved

So that window will now become available for assigning resources to, where
before it was not.
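
To make the coverage reasoning concrete, here is a minimal userspace sketch
(not kernel code). It applies the same comparison as the check added by
5949965ec934 to the e820 entries and the two root bus windows from the log
above. The names struct range, fully_covered(), coral_e820 and
covered_by_any() are made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, mirroring start/end of struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Same comparison as the "fully covered" test in the bisected commit. */
static bool fully_covered(struct range win, struct range e820)
{
	return e820.start <= win.start && win.end <= e820.end;
}

/* e820 entries copied from the passing boot log quoted above. */
static const struct range coral_e820[] = {
	{ 0x0000000000000000ULL, 0x0000000000000fffULL },
	{ 0x0000000000001000ULL, 0x000000000009ffffULL },
	{ 0x00000000000a0000ULL, 0x00000000000fffffULL },
	{ 0x0000000000100000ULL, 0x000000000fffffffULL },
	{ 0x0000000010000000ULL, 0x0000000012150fffULL },
	{ 0x0000000012151000ULL, 0x000000007a9fcfffULL },
	{ 0x000000007a9fd000ULL, 0x000000007affffffULL },
	{ 0x000000007b000000ULL, 0x000000007fffffffULL },
	{ 0x00000000d0000000ULL, 0x00000000d0ffffffULL },
	{ 0x00000000e0000000ULL, 0x00000000efffffffULL },
	{ 0x00000000fed10000ULL, 0x00000000fed17fffULL },
	{ 0x0000000100000000ULL, 0x000000017fffffffULL },
};

/* True if any single e820 entry encloses the whole window. */
static bool covered_by_any(struct range win)
{
	size_t n = sizeof(coral_e820) / sizeof(coral_e820[0]);

	for (size_t i = 0; i < n; i++)
		if (fully_covered(win, coral_e820[i]))
			return true;
	return false;
}
```

With these inputs, covered_by_any() is true for the ISA MMIO window
{ 0x000a0000, 0x000bffff } and false for the main window
{ 0x7b800000, 0xe0000000 }, which matches the analysis above.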

So I guess we should try adding a patch to skip the "fully covered" test for ISA MMIO space
and see if that helps?
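
As a rough illustration of that idea only (this is not a patch from this
thread and is untested against the kernel), the decision could look like the
userspace sketch below. The constants SKETCH_ISA_START and SKETCH_ISA_END,
struct win, in_isa_mmio() and preserve_window() are hypothetical names; the
ISA range 0xa0000-0xfffff is an assumption taken from the e820 entry quoted
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ISA MMIO range for this sketch: 0xa0000 up to, not including, 0x100000. */
#define SKETCH_ISA_START 0x000a0000ULL
#define SKETCH_ISA_END   0x00100000ULL

/* Inclusive window, mirroring start/end of struct resource. */
struct win {
	uint64_t start;
	uint64_t end;
};

/* True if the window lies entirely inside the assumed ISA MMIO range. */
static bool in_isa_mmio(struct win w)
{
	return w.start >= SKETCH_ISA_START && w.end < SKETCH_ISA_END;
}

/*
 * Returns true if the window should be left untouched (the "fully
 * covered" case), false if it should be clipped against the e820
 * entry as before. Windows in ISA MMIO space never take the "fully
 * covered" shortcut in this sketch.
 */
static bool preserve_window(struct win avail, uint64_t e820_start,
			    uint64_t e820_end)
{
	if (in_isa_mmio(avail))
		return false;

	return e820_start <= avail.start && avail.end <= e820_end;
}
```

Under these assumptions the ISA window [mem 0x000a0000-0x000bffff] would be
clipped as it was before the commit, while a window such as
[mem 0x65400000-0xbfffffff] enclosed by the e820 entry
[mem 0x4bc50000-0xcfffffff] from the commit message would still be preserved.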

Bjorn do you agree?

Mark, if one of us writes a test patch, can you get that Asus machine to boot a
kernel built from next + the test patch?

Regards,

Hans





> 
> I'd also note that the machine hp-x360-12b-n4000-octopus appears to have
> started failing at the same time with similar symptoms, failing log:
> 
>    https://storage.kernelci.org/next/master/next-20220324/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-hp-x360-12b-n4000-octopus.html
> 
> and passing log:
> 
>    https://storage.kernelci.org/next/master/next-20220310/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-hp-x360-12b-n4000-octopus.html
> 
> though we didn't get a bisect for that yet.  That's also a Chromebook.
> 
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>> * This automated bisection report was sent to you on the basis  *
>> * that you may be involved with the breaking commit it has      *
>> * found.  No manual investigation has been done to verify it,   *
>> * and the root cause of the problem may be somewhere else.      *
>> *                                                               *
>> * If you do send a fix, please include this trailer:            *
>> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
>> *                                                               *
>> * Hope this helps!                                              *
>> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>>
>> next/master bisection: baseline.login on asus-C523NA-A20057-coral
>>
>> Summary:
>>   Start:      f8833a2b2356 Add linux-next specific files for 20220322
>>   Plain log:  https://storage.kernelci.org/next/master/next-20220322/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.txt
>>   HTML log:   https://storage.kernelci.org/next/master/next-20220322/x86_64/x86_64_defconfig+x86-chromebook/gcc-10/lab-collabora/baseline-asus-C523NA-A20057-coral.html
>>   Result:     5949965ec934 x86/PCI: Preserve host bridge windows completely covered by E820
>>
>> Checks:
>>   revert:     PASS
>>   verify:     PASS
>>
>> Parameters:
>>   Tree:       next
>>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>>   Branch:     master
>>   Target:     asus-C523NA-A20057-coral
>>   CPU arch:   x86_64
>>   Lab:        lab-collabora
>>   Compiler:   gcc-10
>>   Config:     x86_64_defconfig+x86-chromebook
>>   Test case:  baseline.login
>>
>> Breaking commit found:
>>
>> -------------------------------------------------------------------------------
>> commit 5949965ec9340cfc0e65f7d8a576b660b26e2535
>> Author: Bjorn Helgaas <bhelgaas@google.com>
>> Date:   Thu Mar 3 18:03:30 2022 -0600
>>
>>     x86/PCI: Preserve host bridge windows completely covered by E820
>>     
>>     Many folks have reported PCI devices not working.  It could affect any
>>     device, but most reports are for Thunderbolt controllers on Lenovo Yoga and
>>     Clevo Barebone laptops and the touchpad on Lenovo IdeaPads.
>>     
>>     In every report, a region in the E820 table entirely encloses a PCI host
>>     bridge window from _CRS, and because of 4dc2287c1805 ("x86: avoid E820
>>     regions when allocating address space"), we ignore the entire window,
>>     preventing us from assigning space to PCI devices.
>>     
>>     For example, the dmesg log [2] from bug report [1] shows:
>>     
>>       BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>       pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>       pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>>     
>>     The efi=debug dmesg log [3] from the same report shows the EFI memory map
>>     entries that created the E820 map:
>>     
>>       efi: mem47: [Reserved |   |WB|WT|WC|UC] range=[0x4bc50000-0x5fffffff]
>>       efi: mem48: [Reserved |   |WB|  |  |UC] range=[0x60000000-0x60ffffff]
>>       efi: mem49: [Reserved |   |  |  |  |  ] range=[0x61000000-0x653fffff]
>>       efi: mem50: [MMIO     |RUN|  |  |  |UC] range=[0x65400000-0xcfffffff]
>>     
>>     4dc2287c1805 ("x86: avoid E820 regions when allocating address space")
>>     works around issues where _CRS contains non-window address space that can't
>>     be used for PCI devices.  It does this by removing E820 regions from host
>>     bridge windows.  But in these reports, the E820 region covers the entire
>>     window, so 4dc2287c1805 makes it completely unusable.
>>     
>>     Per UEFI v2.8, sec 7.2, the EfiMemoryMappedIO type means:
>>     
>>       Used by system firmware to request that a memory-mapped IO region be
>>       mapped by the OS to a virtual address so it can be accessed by EFI
>>       runtime services.
>>     
>>     A host bridge window is definitely a memory-mapped IO region, and EFI
>>     runtime services may need to access it, so I don't think we can argue that
>>     this is a firmware defect.
>>     
>>     Instead, change the 4dc2287c1805 strategy so it only removes E820 regions
>>     when they overlap *part* of a host bridge window on the assumption that a
>>     partial overlap is really register space, not part of the window proper.
>>     
>>     If an E820 region covers the entire window from _CRS, assume the _CRS
>>     window is correct and do nothing.
>>     
>>     [1] https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>     [2] https://bugzilla.redhat.com/attachment.cgi?id=1711424
>>     [3] https://bugzilla.redhat.com/attachment.cgi?id=1861407
>>     
>>     BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>>     BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214259
>>     BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>     BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>>     BugLink: https://bugs.launchpad.net/bugs/1878279
>>     BugLink: https://bugs.launchpad.net/bugs/1931715
>>     BugLink: https://bugs.launchpad.net/bugs/1932069
>>     BugLink: https://bugs.launchpad.net/bugs/1921649
>>     Fixes: 4dc2287c1805 ("x86: avoid E820 regions when allocating address space")
>>     Link: https://lore.kernel.org/r/20220228105259.230903-1-hdegoede@redhat.com
>>     Based-on-patch-by: Hans de Goede <hdegoede@redhat.com>
>>     Link: https://lore.kernel.org/r/20220304035110.988712-4-helgaas@kernel.org
>>     Reported-by: Benoit Grégoire <benoitg@coeus.ca>   # BZ 206459
>>     Reported-by: wse@tuxedocomputers.com              # BZ 214259
>>     Tested-by: Matt Hansen <2lprbe78@duck.com>
>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>     Reviewed-by: Hans de Goede <hdegoede@redhat.com>
>>     Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
>>     Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
>> index 7378ea146976..90203217c359 100644
>> --- a/arch/x86/kernel/resource.c
>> +++ b/arch/x86/kernel/resource.c
>> @@ -39,6 +39,21 @@ void remove_e820_regions(struct device *dev, struct resource *avail)
>>  		e820_start = entry->addr;
>>  		e820_end = entry->addr + entry->size - 1;
>>  
>> +		/*
>> +		 * If an E820 entry covers just part of the resource, we
>> +		 * assume E820 is telling us about something like host
>> +		 * bridge register space that is unavailable for PCI
>> +		 * devices.  But if it covers the *entire* resource, it's
>> +		 * more likely just telling us that this is MMIO space, and
>> +		 * that doesn't need to be removed.
>> +		 */
>> +		if (e820_start <= avail->start && avail->end <= e820_end) {
>> +			dev_info(dev, "resource %pR fully covered by e820 entry [mem %#010Lx-%#010Lx]\n",
>> +				 avail, e820_start, e820_end);
>> +
>> +			continue;
>> +		}
>> +
>>  		resource_clip(avail, e820_start, e820_end);
>>  		if (orig.start != avail->start || orig.end != avail->end) {
>>  			dev_info(dev, "clipped %pR to %pR for e820 entry [mem %#010Lx-%#010Lx]\n",
>> -------------------------------------------------------------------------------
>>
>>
>> Git bisection log:
>>
>> -------------------------------------------------------------------------------
>> git bisect start
>> # good: [5628b8de1228436d47491c662dc521bc138a3d43] Merge tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
>> git bisect good 5628b8de1228436d47491c662dc521bc138a3d43
>> # bad: [f8833a2b23562be2dae91775127c8014c44d8566] Add linux-next specific files for 20220322
>> git bisect bad f8833a2b23562be2dae91775127c8014c44d8566
>> # bad: [d2de72259f3d22054272217eac92e624835bfc3b] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
>> git bisect bad d2de72259f3d22054272217eac92e624835bfc3b
>> # bad: [5920db3e4b50218dcf2101f3d87c3b69a1120981] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid.git
>> git bisect bad 5920db3e4b50218dcf2101f3d87c3b69a1120981
>> # bad: [b579dc07dce4637b7f2a3fb84394ebbd6666a81f] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
>> git bisect bad b579dc07dce4637b7f2a3fb84394ebbd6666a81f
>> # bad: [b5475dd9fab03e6867abad00ecb98e0d3827ad31] Merge branch 'for-next' of git://git.armlinux.org.uk/~rmk/linux-arm.git
>> git bisect bad b5475dd9fab03e6867abad00ecb98e0d3827ad31
>> # good: [7b72f3bb0907319e15765ae9dcf1f15fdd112bcf] Merge remote-tracking branch 'asoc/for-5.17' into asoc-linus
>> git bisect good 7b72f3bb0907319e15765ae9dcf1f15fdd112bcf
>> # bad: [077dc6bc0658177057bfd69ef3a990e6d8d32146] Merge branch 'gpio/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git
>> git bisect bad 077dc6bc0658177057bfd69ef3a990e6d8d32146
>> # good: [fd11727eec0dd95ee1b7d8f9f10ee60678eecc29] crypto: hisilicon/qm - fix memset during queues clearing
>> git bisect good fd11727eec0dd95ee1b7d8f9f10ee60678eecc29
>> # good: [646b907e1559f006c79a752ee3eebe220ceb983d] Merge tag 'asoc-v5.18' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
>> git bisect good 646b907e1559f006c79a752ee3eebe220ceb983d
>> # bad: [f8ed0b7c999405bd12ab9ebb0765e2baa7eb6184] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
>> git bisect bad f8ed0b7c999405bd12ab9ebb0765e2baa7eb6184
>> # good: [9fd75b66b8f68498454d685dc4ba13192ae069b0] ax25: Fix refcount leaks caused by ax25_cb_del()
>> git bisect good 9fd75b66b8f68498454d685dc4ba13192ae069b0
>> # good: [3fd177beee75eb2d7e5b19992e8c90eb1a141432] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
>> git bisect good 3fd177beee75eb2d7e5b19992e8c90eb1a141432
>> # good: [09005bef55291a99b491a47ce676dfb4f40f8edd] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
>> git bisect good 09005bef55291a99b491a47ce676dfb4f40f8edd
>> # good: [d13f73e9108a75209d03217d60462f51092499fe] x86/PCI: Log host bridge window clipping for E820 regions
>> git bisect good d13f73e9108a75209d03217d60462f51092499fe
>> # bad: [5949965ec9340cfc0e65f7d8a576b660b26e2535] x86/PCI: Preserve host bridge windows completely covered by E820
>> git bisect bad 5949965ec9340cfc0e65f7d8a576b660b26e2535
>> # first bad commit: [5949965ec9340cfc0e65f7d8a576b660b26e2535] x86/PCI: Preserve host bridge windows completely covered by E820
>> -------------------------------------------------------------------------------
>>
>>
>> -=-=-=-=-=-=-=-=-=-=-=-
>> Groups.io Links: You receive all messages sent to this group.
>> View/Reply Online (#25006): https://groups.io/g/kernelci-results/message/25006
>> Mute This Topic: https://groups.io/mt/89994186/1131744
>> Group Owner: kernelci-results+owner@groups.io
>> Unsubscribe: https://groups.io/g/kernelci-results/unsub [broonie@kernel.org]
>> -=-=-=-=-=-=-=-=-=-=-=-
>>
>>

