All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Len Brown <lenb@kernel.org>, Bjorn Helgaas <bhelgaas@google.com>,
	linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [RFC 1/1] PCI/ACPI: Make acpi_pci_root_validate_resources() reject IOMEM resources which start at address 0
Date: Thu, 1 Jul 2021 16:21:33 +0200	[thread overview]
Message-ID: <7d80f639-0768-850b-5313-3bdedf0a5579@redhat.com> (raw)
In-Reply-To: <c55ee3ef-f15e-d043-4cdf-35c1026089ec@redhat.com>

Hi Bjorn,

On 6/21/21 1:49 PM, Hans de Goede wrote:

<snip>

>> But maybe we aren't smart enough when trying to allocate space.
>> Places like __pci_bus_size_bridges() and __pci_assign_resource() are
>> full of assumptions about what PCI BARs can go where, depending on
>> 64bit-ness, prefetchability, etc.
>>
>> Maybe instrumenting those allocation paths would give some insight.
>> Possibly we should go ahead and merge some permanent pci_dbg() stuff
>> there too.
> 
> I agree that this seems to be the most likely issue. I've build
> a Fedora kernel pkg with some extra debugging added for the reporter
> to be test.  Since I'm reliant on a debug cycle where I provide
> a kernel and then the reporter comes back with a new debug,
> and then rince-repeat it might be a while before I get back to you
> on this.  Hopefully when I do get back I will have figured out
> what the problem is.
> 
> (and in case it was not clear yet the RFC patch which started this
> discussion clearly is bogus and can be considered dropped).

Ok, I believe I've figured out what the root cause of things
failing is here.

I've added a couple of pr_info-s to kernel/resource.c __find_resource()
like this:

                pr_info("__find_resource() trying 0x%llx - 0x%llx\n", tmp.start, tmp.end);
                if (tmp.end < tmp.start)
                        goto next;

                resource_clip(&tmp, constraint->min, constraint->max);
                arch_remove_reservations(&tmp);
                pr_info("__find_resource() after clip + arch-reserv 0x%llx - 0x%llx\n", tmp.start, tmp.end);

And the following (relevant) messages get logged by these 2 pr_info-s
when trying to allocate a resource for the unassigned IOMEM BAR
of the I2C controller on the troublesome laptops:

[    1.262423] resource: allocate_resource root [mem 0x65400000-0xbfffffff window] new [mem size 0x00001000 64bit] size 0x1000 min 0x65400000 max 0xbfffffff align 0x1000
[    1.262426] resource: __find_resource() trying 0x65400000 - 0x6fffffff
[    1.262428] resource: __find_resource() after clip + arch-reserv 0xd0000000 - 0x6fffffff
...
[    1.262543] resource: __find_resource() trying 0x8122d000 - 0x8122efff
[    1.262544] resource: __find_resource() after clip + arch-reserv 0xd0000000 - 0x8122efff
etc.

Notice that the regions start at 0xd0000000 rather then their original start
address after these 2 calls:

                resource_clip(&tmp, constraint->min, constraint->max);
                arch_remove_reservations(&tmp);

I believe that this is caused by the arch_remove_reservations call,
which applies e820 reservations and the e820 table has the following:

[    0.000000] BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved

So the e820 table is basically reserving the entire main PCI bus iomem
window, which runs from 0x65400000-0xbfffffff :

[    0.609979] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.609983] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.609986] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.609988] pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
[    0.609991] pci_bus 0000:00: root bus resource [bus 00-fe]

One way of fixing this would be to check CRS provided PCI root bridge windows
vs e820 reservations and remove (carve out) the PCI root bridge iomem windows
from any e820 reservations. Probably with some warning about the firmware
being buggy.   I'm not sure if this won't cause issues else where though.

So before I spend time on trying to code something like this, any other
ideas / suggestions how to fix this ?

Regards,

Hans


      parent reply	other threads:[~2021-07-01 14:21 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-15 10:25 [RFC 0/1] " Hans de Goede
2021-06-15 10:25 ` [RFC 1/1] " Hans de Goede
2021-06-15 10:59   ` Rafael J. Wysocki
2021-06-15 11:33     ` Hans de Goede
2021-06-15 11:52       ` Rafael J. Wysocki
2021-06-15 20:23   ` Bjorn Helgaas
2021-06-16 18:43     ` Hans de Goede
2021-06-16 22:57       ` Bjorn Helgaas
2021-06-21 11:49         ` Hans de Goede
2021-06-21 12:37           ` Bjorn Helgaas
2021-07-01 14:21           ` Hans de Goede [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7d80f639-0768-850b-5313-3bdedf0a5579@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --subject='Re: [RFC 1/1] PCI/ACPI: Make acpi_pci_root_validate_resources() reject IOMEM resources which start at address 0' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.