All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>,
	"Mika Westerberg" <mika.westerberg@linux.intel.com>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Myron Stowe" <myron.stowe@redhat.com>,
	"Juha-Pekka Heikkila" <juhapekka.heikkila@gmail.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>,
	linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	"Benoit Grégoire" <benoitg@coeus.ca>,
	"Hui Wang" <hui.wang@canonical.com>,
	stable@vger.kernel.org,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH v5 1/2] x86/PCI: Ignore E820 reservations for bridge windows on newer systems
Date: Fri, 29 Oct 2021 10:10:05 +0200	[thread overview]
Message-ID: <82035130-d810-9f0b-259e-61280de1d81f@redhat.com> (raw)
In-Reply-To: <75d1ef5a-13d9-9a67-0139-90b27b084c84@redhat.com>

Hi Bjorn,

On 10/22/21 11:53, Hans de Goede wrote:
> Hi Bjorn,
> 
> On 10/22/21 03:20, Bjorn Helgaas wrote:
>> On Thu, Oct 21, 2021 at 07:15:57PM +0200, Hans de Goede wrote:
>>> On 10/20/21 23:14, Bjorn Helgaas wrote:
>>>> On Wed, Oct 20, 2021 at 12:23:26PM +0200, Hans de Goede wrote:
>>>>> On 10/19/21 23:52, Bjorn Helgaas wrote:
>>>>>> On Thu, Oct 14, 2021 at 08:39:42PM +0200, Hans de Goede wrote:
>>>>>>> Some BIOS-es contain a bug where they add addresses which map to system
>>>>>>> RAM in the PCI host bridge window returned by the ACPI _CRS method, see
>>>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>>>>>>> space").
>>>>>>>
>>>>>>> To work around this bug Linux excludes E820 reserved addresses when
>>>>>>> allocating addresses from the PCI host bridge window since 2010.
>>>>>>> ...
>>>>
>>>>>> I haven't seen anybody else eager to merge this, so I guess I'll stick
>>>>>> my neck out here.
>>>>>>
>>>>>> I applied this to my for-linus branch for v5.15.
>>>>>
>>>>> Thank you, and sorry about the build-errors which the lkp
>>>>> kernel-test-robot found.
>>>>>
>>>>> I've just send out a patch which fixes these build-errors
>>>>> (verified with both .config-s from the lkp reports).
>>>>> Feel free to squash this into the original patch (or keep
>>>>> them separate, whatever works for you).
>>>>
>>>> Thanks, I squashed the fix in.
>>>>
>>>> HOWEVER, I think it would be fairly risky to push this into v5.15.
>>>> We would be relying on the assumption that current machines have all
>>>> fixed the BIOS defect that 4dc2287c1805 addressed, and we have little
>>>> evidence for that.
>>>
>>> It is a 10 year old BIOS defect, so hopefully anything from 2018
>>> or later will not have it.
>>
>> We can hope.  AFAIK, Windows allocates space top-down, while Linux
>> allocates bottom-up, so I think it's quite possible these defects
>> would never be discovered or fixed.  In any event, I don't think we
>> have much evidence either way.
> 
> Ack.
> 
>>>> I'm not sure there's significant benefit to having this in v5.15.
>>>> Yes, the mainline v5.15 kernel would work on the affected machines,
>>>> but I suspect most people with those machines are running distro
>>>> kernels, not mainline kernels.
>>>
>>> Fedora and Arch do follow mainline pretty closely and a lot of
>>> users are affected by this (see the large number of BugLinks in
>>> the commit).
>>>
>>> I completely understand why you are reluctant to push this out, but
>>> your argument about most distros not running mainline kernels also
>>> applies to chances of people where this may cause a regression
>>> running mainline kernels also being quite small.
>>
>> True.
>>
>>>> This issue has been around a long time, so it's not like a regression
>>>> that we just introduced.  If we fixed these machines and regressed
>>>> *other* machines, we'd be worse off than we are now.
>>>
>>> If we break one machine model and fix a whole bunch of other machines
>>> then in my book that is a win. Ideally we would not break anything,
>>> but we can only find out if we actually break anything if we ship
>>> the change.
>>
>> I'm definitely not going to try the "fix many, break one" argument on
>> Linus.  Of course we want to fix systems, but IMO it's far better to
>> leave a system broken than it is to break one that used to work.
> 
> Right, what I meant to say with "a win" is a step in the right direction,
> we definitely must address any regressions coming from this change as
> soon as we learn about them.
> 
>>>> In the meantime, here's another possibility for working around this.
>>>> What if we discarded remove_e820_regions() completely, but aligned the
>>>> problem _CRS windows a little more?  The 4dc2287c1805 case was this:
>>>>
>>>>   BIOS-e820: 00000000bfe4dc00 - 00000000c0000000 (reserved)
>>>>   pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xdfffffff]
>>>>
>>>> where the _CRS window was of size 0x20100000, i.e., 512M + 1M.  At
>>>> least in this particular case, we could avoid the problem by throwing
>>>> away that first 1M and aligning the window to a nice 3G boundary.
>>>> Maybe it would be worth giving up a small fraction (less than 0.2% in
>>>> this case) of questionable windows like this?
>>>
>>> The PCI BAR allocation code tries to fall back to the BIOS assigned
>>> resource if the allocation fails. That BIOS assigned resource might
>>> fall outside of the host bridge window after we round the address.
>>>
>>> My initial gut instinct here is that this has a bigger chance
>>> of breaking things then my change.
>>>
>>> In the beginning of the thread you said that ideally we would
>>> completely stop using the E820 reservations for PCI host bridge
>>> windows. Because in hindsight messing with the windows on all
>>> machines just to work around a clear BIOS bug in some was not a
>>> good idea.
>>>
>>> This address-rounding/-aligning you now suggest, is again
>>> messing with the windows on all machines just to work around
>>> a clear BIOS bug in some. At least that is how I see this.
>>
>> That's true.  I assume Red Hat has a bunch of machines and hopefully
>> an archive of dmesg logs from them.  Those logs should contain good
>> E820 and _CRS information, so with a little scripting, maybe we could
>> get some idea of what's out there.
> 
> We do have a (large-ish) test-lab, but that contains almost exclusively
> servers, where as the original problem was on Dell Precision laptops.
> 
> Also I'm not sure if I can get aggregate data from the lab's machines.
> I can reserve time on any model we have to debug specific problems,
> but that is targeting one specific model. I'll ask around about this.

So I had another idea to get us a whole bunch of dmesg outputs and that
is to use the database collected by linux-hardware.org . The dmesg
were already individually accessible by selecting a specific model machine,
but I asked them if they could do a dump and I just got an email that a
dmesg dump is now available here:

https://github.com/linuxhw/Dmesg

Note be careful with the size of the repository - it will take ~3 gigabytes
of network traffic and ~20 gigabytes of space on the drive to checkout it.

So if you want dmesg outputs to grep through for e820 / host-bridge-window
info, here you go.

Regards,

Hans


  reply	other threads:[~2021-10-29  8:10 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bfbac749-7434-1497-039b-3b8bc4dc5499@redhat.com>
2021-10-20 21:14 ` [PATCH v5 1/2] x86/PCI: Ignore E820 reservations for bridge windows on newer systems Bjorn Helgaas
2021-10-21 17:15   ` Hans de Goede
2021-10-22  1:20     ` Bjorn Helgaas
2021-10-22  9:53       ` Hans de Goede
2021-10-29  8:10         ` Hans de Goede [this message]
2021-11-06 10:15   ` Hans de Goede
2021-11-09 22:07     ` Bjorn Helgaas
2021-11-10  8:45       ` Hans de Goede
2021-11-10 13:05         ` Hans de Goede
2021-11-10 21:51           ` Thomas Backlund
2021-12-07 16:52           ` Hans de Goede
2021-12-15 16:01             ` Bjorn Helgaas
2021-12-15 16:33               ` Hans de Goede
2021-10-14 18:39 [PATCH v5 0/2] " Hans de Goede
2021-10-14 18:39 ` [PATCH v5 1/2] " Hans de Goede
2021-10-15 20:03   ` Bjorn Helgaas
2021-10-15 20:15     ` Hans de Goede
2021-10-19 21:52   ` Bjorn Helgaas
2021-10-19 21:55     ` Bjorn Helgaas
2021-10-21 11:30     ` Hans de Goede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=82035130-d810-9f0b-259e-61280de1d81f@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=benoitg@coeus.ca \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=helgaas@kernel.org \
    --cc=hpa@zytor.com \
    --cc=hui.wang@canonical.com \
    --cc=juhapekka.heikkila@gmail.com \
    --cc=kw@linux.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    --cc=mingo@redhat.com \
    --cc=myron.stowe@redhat.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.