* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-10-30 15:05 [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge Mika Westerberg
@ 2019-11-06 12:28 ` Rafael J. Wysocki
2019-11-06 23:24 ` Bjorn Helgaas
2019-11-12 0:19 ` Bjorn Helgaas
2 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-11-06 12:28 UTC (permalink / raw)
To: Mika Westerberg, Bjorn Helgaas
Cc: Len Brown, Valerio Passini, linux-acpi, linux-pci
On Wednesday, October 30, 2019 4:05:45 PM CET Mika Westerberg wrote:
> Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
> PCI: Don't scan bridges managed by native hotplug") prevents some recent
> LG and HP laptops from booting with endless loop of:
>
> [ 26.237796] ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
> [ 26.238699] ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
> [ 26.239306] ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
> ...
>
> What seems to happen is that during boot, after the initial PCI
> enumeration when EC is enabled the platform triggers ACPI Notify() to
> one of the root ports. The root port itself looks like this:
>
> [ 0.723757] pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
> [ 0.723765] pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
> [ 0.723773] pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
>
> The BIOS has configured the root port so that it does not have I/O
> bridge window.
>
> Now when the ACPI Notify() is triggered ACPI hotplug handler calls
> acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system
> is using native PCIe hotplug) and pci_assign_unassigned_bridge_resources()
> to allocate resources.
>
> The device connected to the root port is a PCIe switch (Thunderbolt
> controller) with two hotplug downstream ports. Because of the hotplug
> ports __pci_bus_size_bridges() tries to add "additional I/O" of 256
> bytes to each (DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k
> as that's the minimum I/O window size so each hotplug port gets 4k I/O
> window and the same happens for the root port (which is also hotplug
> port). This means 3 * 4k = 12k I/O window.
>
> Because of this pci_assign_unassigned_bridge_resources() ends up opening
> a I/O bridge window for the root port at first available I/O address
> which seems to be in range 0x1000 - 0x3fff. Normally this range is used
> for ACPI stuff such as GPE bits (below is part of /proc/ioports):
>
> 1800-1803 : ACPI PM1a_EVT_BLK
> 1804-1805 : ACPI PM1a_CNT_BLK
> 1808-180b : ACPI PM_TMR
> 1810-1815 : ACPI CPU throttle
> 1850-1850 : ACPI PM2_CNT_BLK
> 1854-1857 : pnp 00:05
> 1860-187f : ACPI GPE0_BLK
>
> However, when the ACPI Notify() happened this range was not yet reserved
> for ACPI/PNP (that happens later) so PCI gets it. It then starts writing
> to this range and accidentally stomps over GPE bits among other things
> causing the endless stream of messages about missing GPE handler.
>
> This problem does not happen if "pci=hpiosize=0" is passed in the kernel
> command line. The reason is that then the kernel does not try to
> allocate the additional 256 bytes for each hotplug port.
>
> Fix this by allocating resources directly below the non-hotplug bridges
> where a new device may appear as a result of ACPI Notify(). This avoids
> the hotplug bridges and prevents opening the additional I/O window.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
> Fixes: 84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
> Cc: stable@vger.kernel.org
> Reported-by: Valerio Passini <passini.valerio@gmail.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
That should work:
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn, if you want me to take this, please let me know.
> ---
> I was able to reproduce this without access to the affected system by
> forcing ACPI core to send Notify() to the TBT root port like this:
>
> void acpi_notify_rp(void)
> {
> struct acpi_device *adev;
> acpi_handle handle;
>
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB.PCI0.RP17", &handle)))
> return;
>
> if (acpi_bus_get_device(handle, &adev))
> return;
>
> dev_info(&adev->dev, "queueing hotplug\n");
> acpiphp_hotplug_notify(adev, ACPI_NOTIFY_BUS_CHECK);
> }
>
> and calling it from acpi_init() directly after acpi_ec_init().
>
> drivers/pci/hotplug/acpiphp_glue.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index e4c46637f32f..b3869951c0eb 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -449,8 +449,15 @@ static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
>
> /* Scan non-hotplug bridges that need to be reconfigured */
> for_each_pci_bridge(dev, bus) {
> - if (!hotplug_is_native(dev))
> - max = pci_scan_bridge(bus, dev, max, 1);
> + if (hotplug_is_native(dev))
> + continue;
> +
> + max = pci_scan_bridge(bus, dev, max, 1);
> + if (dev->subordinate) {
> + pcibios_resource_survey_bus(dev->subordinate);
> + pci_bus_size_bridges(dev->subordinate);
> + pci_bus_assign_resources(dev->subordinate);
> + }
> }
> }
>
> @@ -480,7 +487,6 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
> if (PCI_SLOT(dev->devfn) == slot->device)
> acpiphp_native_scan_bridge(dev);
> }
> - pci_assign_unassigned_bridge_resources(bus->self);
> } else {
> LIST_HEAD(add_list);
> int max, pass;
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-10-30 15:05 [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge Mika Westerberg
2019-11-06 12:28 ` Rafael J. Wysocki
@ 2019-11-06 23:24 ` Bjorn Helgaas
2019-11-07 1:05 ` Benjamin Herrenschmidt
2019-11-07 9:03 ` Mika Westerberg
2019-11-12 0:19 ` Bjorn Helgaas
2 siblings, 2 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2019-11-06 23:24 UTC (permalink / raw)
To: Mika Westerberg
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
[+cc Ben]
On Wed, Oct 30, 2019 at 06:05:45PM +0300, Mika Westerberg wrote:
> Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
> PCI: Don't scan bridges managed by native hotplug") prevents some recent
> LG and HP laptops from booting with endless loop of:
>
> [ 26.237796] ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
> [ 26.238699] ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
> [ 26.239306] ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
> ...
>
> What seems to happen is that during boot, after the initial PCI
> enumeration when EC is enabled the platform triggers ACPI Notify() to
> one of the root ports. The root port itself looks like this:
>
> [ 0.723757] pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
> [ 0.723765] pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
> [ 0.723773] pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
>
> The BIOS has configured the root port so that it does not have I/O
> bridge window.
>
> Now when the ACPI Notify() is triggered ACPI hotplug handler calls
> acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system
> is using native PCIe hotplug) and pci_assign_unassigned_bridge_resources()
> to allocate resources.
>
> The device connected to the root port is a PCIe switch (Thunderbolt
> controller) with two hotplug downstream ports. Because of the hotplug
> ports __pci_bus_size_bridges() tries to add "additional I/O" of 256
> bytes to each (DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k
> as that's the minimum I/O window size so each hotplug port gets 4k I/O
> window and the same happens for the root port (which is also hotplug
> port). This means 3 * 4k = 12k I/O window.
>
> Because of this pci_assign_unassigned_bridge_resources() ends up opening
> a I/O bridge window for the root port at first available I/O address
> which seems to be in range 0x1000 - 0x3fff. Normally this range is used
> for ACPI stuff such as GPE bits (below is part of /proc/ioports):
>
> 1800-1803 : ACPI PM1a_EVT_BLK
> 1804-1805 : ACPI PM1a_CNT_BLK
> 1808-180b : ACPI PM_TMR
> 1810-1815 : ACPI CPU throttle
> 1850-1850 : ACPI PM2_CNT_BLK
> 1854-1857 : pnp 00:05
> 1860-187f : ACPI GPE0_BLK
Nice debugging work!
> However, when the ACPI Notify() happened this range was not yet reserved
> for ACPI/PNP (that happens later) so PCI gets it.
I think this is really the underlying problem. IMO those ACPI/PNP
resources ought to be reserved before we start assigning resources to
PCI devices. That would be a huge problem to fix, though.
> It then starts writing
> to this range and accidentally stomps over GPE bits among other things
> causing the endless stream of messages about missing GPE handler.
>
> This problem does not happen if "pci=hpiosize=0" is passed in the kernel
> command line. The reason is that then the kernel does not try to
> allocate the additional 256 bytes for each hotplug port.
>
> Fix this by allocating resources directly below the non-hotplug bridges
> where a new device may appear as a result of ACPI Notify(). This avoids
> the hotplug bridges and prevents opening the additional I/O window.
Looking at [1], here's the topology I see:
00:1b.0: Root Port to [bus 02-3a] # pciehp
02:00.0: Switch Upstream Port to [bus 03-3a] # thunderbolt
03:00.0: Switch Downstream Port to [bus 04] # thunderbolt
03:01.0: Switch Downstream Port to [bus 05-39] # thunderbolt, pciehp
03:02.0: Switch Downstream Port to [bus 3a] # thunderbolt
04:00.0: reg 0x10: [mem 0xda000000-0xda03ffff]
04:00.0: reg 0x14: [mem 0xda040000-0xda040fff]
3a:00.0: reg 0x10: [mem 0xd9f00000-0xd9f0ffff]
In this topology, I guess the non-hotplug bridges would be 03:00.0
and 03:02.0? And we skip 03:01.0 because hotplug_is_native() is true
for that?
What would happen if a device below one of the non-hotplug bridges,
e.g., 3a:00.0, had an I/O BAR? Would this patch still work?
[1] https://bugzilla.kernel.org/show_bug.cgi?id=203617#c49
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
> Fixes: 84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
> Cc: stable@vger.kernel.org
> Reported-by: Valerio Passini <passini.valerio@gmail.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---
> I was able to reproduce this without access to the affected system by
> forcing ACPI core to send Notify() to the TBT root port like this:
>
> void acpi_notify_rp(void)
> {
> struct acpi_device *adev;
> acpi_handle handle;
>
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB.PCI0.RP17", &handle)))
> return;
>
> if (acpi_bus_get_device(handle, &adev))
> return;
>
> dev_info(&adev->dev, "queueing hotplug\n");
> acpiphp_hotplug_notify(adev, ACPI_NOTIFY_BUS_CHECK);
> }
>
> and calling it from acpi_init() directly after acpi_ec_init().
>
> drivers/pci/hotplug/acpiphp_glue.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index e4c46637f32f..b3869951c0eb 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -449,8 +449,15 @@ static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
>
> /* Scan non-hotplug bridges that need to be reconfigured */
> for_each_pci_bridge(dev, bus) {
> - if (!hotplug_is_native(dev))
> - max = pci_scan_bridge(bus, dev, max, 1);
> + if (hotplug_is_native(dev))
> + continue;
> +
> + max = pci_scan_bridge(bus, dev, max, 1);
> + if (dev->subordinate) {
> + pcibios_resource_survey_bus(dev->subordinate);
> + pci_bus_size_bridges(dev->subordinate);
> + pci_bus_assign_resources(dev->subordinate);
> + }
> }
> }
>
> @@ -480,7 +487,6 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
> if (PCI_SLOT(dev->devfn) == slot->device)
> acpiphp_native_scan_bridge(dev);
> }
> - pci_assign_unassigned_bridge_resources(bus->self);
> } else {
> LIST_HEAD(add_list);
> int max, pass;
> --
> 2.23.0
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-06 23:24 ` Bjorn Helgaas
@ 2019-11-07 1:05 ` Benjamin Herrenschmidt
2019-11-07 9:03 ` Mika Westerberg
1 sibling, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2019-11-07 1:05 UTC (permalink / raw)
To: Bjorn Helgaas, Mika Westerberg
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi, linux-pci
On Wed, 2019-11-06 at 17:24 -0600, Bjorn Helgaas wrote:
> [+cc Ben]
Note: I'm travelling this week, I'll try to have a look in the next few
days but can't promise.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-06 23:24 ` Bjorn Helgaas
2019-11-07 1:05 ` Benjamin Herrenschmidt
@ 2019-11-07 9:03 ` Mika Westerberg
2019-11-07 13:52 ` Bjorn Helgaas
1 sibling, 1 reply; 10+ messages in thread
From: Mika Westerberg @ 2019-11-07 9:03 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
On Wed, Nov 06, 2019 at 05:24:05PM -0600, Bjorn Helgaas wrote:
> [+cc Ben]
>
> On Wed, Oct 30, 2019 at 06:05:45PM +0300, Mika Westerberg wrote:
> > Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
> > PCI: Don't scan bridges managed by native hotplug") prevents some recent
> > LG and HP laptops from booting with endless loop of:
> >
> > [ 26.237796] ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
> > [ 26.238699] ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
> > [ 26.239306] ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
> > ...
> >
> > What seems to happen is that during boot, after the initial PCI
> > enumeration when EC is enabled the platform triggers ACPI Notify() to
> > one of the root ports. The root port itself looks like this:
> >
> > [ 0.723757] pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
> > [ 0.723765] pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
> > [ 0.723773] pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
> >
> > The BIOS has configured the root port so that it does not have I/O
> > bridge window.
> >
> > Now when the ACPI Notify() is triggered ACPI hotplug handler calls
> > acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system
> > is using native PCIe hotplug) and pci_assign_unassigned_bridge_resources()
> > to allocate resources.
> >
> > The device connected to the root port is a PCIe switch (Thunderbolt
> > controller) with two hotplug downstream ports. Because of the hotplug
> > ports __pci_bus_size_bridges() tries to add "additional I/O" of 256
> > bytes to each (DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k
> > as that's the minimum I/O window size so each hotplug port gets 4k I/O
> > window and the same happens for the root port (which is also hotplug
> > port). This means 3 * 4k = 12k I/O window.
> >
> > Because of this pci_assign_unassigned_bridge_resources() ends up opening
> > a I/O bridge window for the root port at first available I/O address
> > which seems to be in range 0x1000 - 0x3fff. Normally this range is used
> > for ACPI stuff such as GPE bits (below is part of /proc/ioports):
> >
> > 1800-1803 : ACPI PM1a_EVT_BLK
> > 1804-1805 : ACPI PM1a_CNT_BLK
> > 1808-180b : ACPI PM_TMR
> > 1810-1815 : ACPI CPU throttle
> > 1850-1850 : ACPI PM2_CNT_BLK
> > 1854-1857 : pnp 00:05
> > 1860-187f : ACPI GPE0_BLK
>
> Nice debugging work!
>
> > However, when the ACPI Notify() happened this range was not yet reserved
> > for ACPI/PNP (that happens later) so PCI gets it.
>
> I think this is really the underlying problem. IMO those ACPI/PNP
> resources ought to be reserved before we start assigning resources to
> PCI devices. That would be a huge problem to fix, though.
I agree, changing the order would probably open another can of worms.
> > It then starts writing
> > to this range and accidentally stomps over GPE bits among other things
> > causing the endless stream of messages about missing GPE handler.
> >
> > This problem does not happen if "pci=hpiosize=0" is passed in the kernel
> > command line. The reason is that then the kernel does not try to
> > allocate the additional 256 bytes for each hotplug port.
> >
> > Fix this by allocating resources directly below the non-hotplug bridges
> > where a new device may appear as a result of ACPI Notify(). This avoids
> > the hotplug bridges and prevents opening the additional I/O window.
>
> Looking at [1], here's the topology I see:
>
> 00:1b.0: Root Port to [bus 02-3a] # pciehp
> 02:00.0: Switch Upstream Port to [bus 03-3a] # thunderbolt
> 03:00.0: Switch Downstream Port to [bus 04] # thunderbolt
> 03:01.0: Switch Downstream Port to [bus 05-39] # thunderbolt, pciehp
> 03:02.0: Switch Downstream Port to [bus 3a] # thunderbolt
> 04:00.0: reg 0x10: [mem 0xda000000-0xda03ffff]
> 04:00.0: reg 0x14: [mem 0xda040000-0xda040fff]
> 3a:00.0: reg 0x10: [mem 0xd9f00000-0xd9f0ffff]
>
> In this topology, I guess the non-hotplug bridges would be 03:00.0
> and 03:02.0? And we skip 03:01.0 because hotplug_is_native() is true
> for that?
Yes.
> What would happen if a device below one of the non-hotplug bridges,
> e.g., 3a:00.0, had an I/O BAR? Would this patch still work?
I think it would still work because now we call pci_bus_size_bridges()
only for non-hotplug bridge which do not have I/O window open so
pbus_size_io() fails to find the "free" I/O resource on that bus and the
kernel then fails to assign that I/O resource for the device.
This is based on reading the code - I cannot verify this on real
hardware, though.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-07 9:03 ` Mika Westerberg
@ 2019-11-07 13:52 ` Bjorn Helgaas
2019-11-07 14:02 ` Mika Westerberg
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2019-11-07 13:52 UTC (permalink / raw)
To: Mika Westerberg
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
On Thu, Nov 07, 2019 at 11:03:33AM +0200, Mika Westerberg wrote:
> On Wed, Nov 06, 2019 at 05:24:05PM -0600, Bjorn Helgaas wrote:
> > On Wed, Oct 30, 2019 at 06:05:45PM +0300, Mika Westerberg wrote:
> > > Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
> > > PCI: Don't scan bridges managed by native hotplug") prevents some recent
> > > LG and HP laptops from booting with endless loop of:
> > >
> > > [ 26.237796] ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
> > > [ 26.238699] ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
> > > [ 26.239306] ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
> > > ...
> > >
> > > What seems to happen is that during boot, after the initial PCI
> > > enumeration when EC is enabled the platform triggers ACPI Notify() to
> > > one of the root ports. The root port itself looks like this:
> > >
> > > [ 0.723757] pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
> > > [ 0.723765] pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
> > > [ 0.723773] pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
> > >
> > > The BIOS has configured the root port so that it does not have I/O
> > > bridge window.
> > >
> > > Now when the ACPI Notify() is triggered ACPI hotplug handler calls
> > > acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system
> > > is using native PCIe hotplug) and pci_assign_unassigned_bridge_resources()
> > > to allocate resources.
> > >
> > > The device connected to the root port is a PCIe switch (Thunderbolt
> > > controller) with two hotplug downstream ports. Because of the hotplug
> > > ports __pci_bus_size_bridges() tries to add "additional I/O" of 256
> > > bytes to each (DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k
> > > as that's the minimum I/O window size so each hotplug port gets 4k I/O
> > > window and the same happens for the root port (which is also hotplug
> > > port). This means 3 * 4k = 12k I/O window.
> > >
> > > Because of this pci_assign_unassigned_bridge_resources() ends up opening
> > > a I/O bridge window for the root port at first available I/O address
> > > which seems to be in range 0x1000 - 0x3fff. Normally this range is used
> > > for ACPI stuff such as GPE bits (below is part of /proc/ioports):
> > >
> > > 1800-1803 : ACPI PM1a_EVT_BLK
> > > 1804-1805 : ACPI PM1a_CNT_BLK
> > > 1808-180b : ACPI PM_TMR
> > > 1810-1815 : ACPI CPU throttle
> > > 1850-1850 : ACPI PM2_CNT_BLK
> > > 1854-1857 : pnp 00:05
> > > 1860-187f : ACPI GPE0_BLK
> >
> > Nice debugging work!
> >
> > > However, when the ACPI Notify() happened this range was not yet reserved
> > > for ACPI/PNP (that happens later) so PCI gets it.
> >
> > I think this is really the underlying problem. IMO those ACPI/PNP
> > resources ought to be reserved before we start assigning resources to
> > PCI devices. That would be a huge problem to fix, though.
>
> I agree, changing the order would probably open another can of worms.
Just for the record, I think it's even worse than just reserving them
in the wrong order. IIRC, we don't reserve ACPI/PNP resources *at
all* except for PNP0C01 and PNP0C02 (drivers/pnp/system.c) and
whatever individual drivers reserve. This is like pretending that
devices don't respond to their address space until a driver claims
them. But I'm not suggesting opening this can of worms for *this*
problem.
> > > It then starts writing
> > > to this range and accidentally stomps over GPE bits among other things
> > > causing the endless stream of messages about missing GPE handler.
> > >
> > > This problem does not happen if "pci=hpiosize=0" is passed in the kernel
> > > command line. The reason is that then the kernel does not try to
> > > allocate the additional 256 bytes for each hotplug port.
> > >
> > > Fix this by allocating resources directly below the non-hotplug bridges
> > > where a new device may appear as a result of ACPI Notify(). This avoids
> > > the hotplug bridges and prevents opening the additional I/O window.
> >
> > Looking at [1], here's the topology I see:
> >
> > 00:1b.0: Root Port to [bus 02-3a] # pciehp
> > 02:00.0: Switch Upstream Port to [bus 03-3a] # thunderbolt
> > 03:00.0: Switch Downstream Port to [bus 04] # thunderbolt
> > 03:01.0: Switch Downstream Port to [bus 05-39] # thunderbolt, pciehp
> > 03:02.0: Switch Downstream Port to [bus 3a] # thunderbolt
> > 04:00.0: reg 0x10: [mem 0xda000000-0xda03ffff]
> > 04:00.0: reg 0x14: [mem 0xda040000-0xda040fff]
> > 3a:00.0: reg 0x10: [mem 0xd9f00000-0xd9f0ffff]
> >
> > In this topology, I guess the non-hotplug bridges would be 03:00.0
> > and 03:02.0? And we skip 03:01.0 because hotplug_is_native() is true
> > for that?
>
> Yes.
>
> > What would happen if a device below one of the non-hotplug bridges,
> > e.g., 3a:00.0, had an I/O BAR? Would this patch still work?
>
> I think it would still work because now we call pci_bus_size_bridges()
> only for non-hotplug bridge which do not have I/O window open so
> pbus_size_io() fails to find the "free" I/O resource on that bus and the
> kernel then fails to assign that I/O resource for the device.
Not sure I understand; are you saying that we wouldn't have the EC/GPE
issue, but we'd be unable to use a device below 3a:00.0 that happened
to have an I/O BAR? That doesn't sound optimal because there is I/O
space available that could be routed to 3a:00.0
Bjorn
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-07 13:52 ` Bjorn Helgaas
@ 2019-11-07 14:02 ` Mika Westerberg
2019-11-07 14:40 ` Bjorn Helgaas
0 siblings, 1 reply; 10+ messages in thread
From: Mika Westerberg @ 2019-11-07 14:02 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
On Thu, Nov 07, 2019 at 07:52:46AM -0600, Bjorn Helgaas wrote:
> > > What would happen if a device below one of the non-hotplug bridges,
> > > e.g., 3a:00.0, had an I/O BAR? Would this patch still work?
> >
> > I think it would still work because now we call pci_bus_size_bridges()
> > only for non-hotplug bridge which do not have I/O window open so
> > pbus_size_io() fails to find the "free" I/O resource on that bus and the
> > kernel then fails to assign that I/O resource for the device.
>
> Not sure I understand; are you saying that we wouldn't have the EC/GPE
> issue, but we'd be unable to use a device below 3a:00.0 that happened
> to have an I/O BAR?
Yes.
> That doesn't sound optimal because there is I/O space available that
> could be routed to 3a:00.0
If the none of the upstream bridges up to the PCIe root port does not
have I/O window open, I don't think we can do much about it. Unless I'm
missing something of course.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-07 14:02 ` Mika Westerberg
@ 2019-11-07 14:40 ` Bjorn Helgaas
2019-11-07 14:52 ` Mika Westerberg
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2019-11-07 14:40 UTC (permalink / raw)
To: Mika Westerberg
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
On Thu, Nov 07, 2019 at 04:02:59PM +0200, Mika Westerberg wrote:
> On Thu, Nov 07, 2019 at 07:52:46AM -0600, Bjorn Helgaas wrote:
> > > > What would happen if a device below one of the non-hotplug
> > > > bridges, e.g., 3a:00.0, had an I/O BAR? Would this patch
> > > > still work?
> > >
> > > I think it would still work because now we call
> > > pci_bus_size_bridges() only for non-hotplug bridge which do not
> > > have I/O window open so pbus_size_io() fails to find the "free"
> > > I/O resource on that bus and the kernel then fails to assign
> > > that I/O resource for the device.
> >
> > Not sure I understand; are you saying that we wouldn't have the
> > EC/GPE issue, but we'd be unable to use a device below 3a:00.0
> > that happened to have an I/O BAR?
>
> Yes.
>
> > That doesn't sound optimal because there is I/O space available
> > that could be routed to 3a:00.0
>
> If the none of the upstream bridges up to the PCIe root port does
> not have I/O window open, I don't think we can do much about it.
> Unless I'm missing something of course.
The path to this hypothetical 3a:00.0 device is:
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
00:1b.0: Root Port to [bus 02-3a]
02:00.0: Switch Upstream Port to [bus 03-3a]
03:02.0: Switch Downstream Port to [bus 3a]
3a:00.0: reg 0x10: [io 0x????-0x????]
None of the bridges (00:1b.0, 02:00.0, 03:02.0) currently has an open
I/O window, but there's space available on bus 00 and windows *could*
be opened.
I guess it comes down to that ordering problem: this Notify() and
acpiphp_native_scan_bridge() happens before pnp/system.c reserves
things, so we don't yet know what space is actually available.
If firmware had configured I/O windows for these bridges, 3a:00.0
would probably work. But it doesn't seem right that we would depend
on that firmware configuration.
Bjorn
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-11-07 14:40 ` Bjorn Helgaas
@ 2019-11-07 14:52 ` Mika Westerberg
0 siblings, 0 replies; 10+ messages in thread
From: Mika Westerberg @ 2019-11-07 14:52 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi,
linux-pci, Benjamin Herrenschmidt
On Thu, Nov 07, 2019 at 08:40:55AM -0600, Bjorn Helgaas wrote:
> On Thu, Nov 07, 2019 at 04:02:59PM +0200, Mika Westerberg wrote:
> > On Thu, Nov 07, 2019 at 07:52:46AM -0600, Bjorn Helgaas wrote:
> > > > > What would happen if a device below one of the non-hotplug
> > > > > bridges, e.g., 3a:00.0, had an I/O BAR? Would this patch
> > > > > still work?
> > > >
> > > > I think it would still work because now we call
> > > > pci_bus_size_bridges() only for non-hotplug bridge which do not
> > > > have I/O window open so pbus_size_io() fails to find the "free"
> > > > I/O resource on that bus and the kernel then fails to assign
> > > > that I/O resource for the device.
> > >
> > > Not sure I understand; are you saying that we wouldn't have the
> > > EC/GPE issue, but we'd be unable to use a device below 3a:00.0
> > > that happened to have an I/O BAR?
> >
> > Yes.
> >
> > > That doesn't sound optimal because there is I/O space available
> > > that could be routed to 3a:00.0
> >
> > If the none of the upstream bridges up to the PCIe root port does
> > not have I/O window open, I don't think we can do much about it.
> > Unless I'm missing something of course.
>
> The path to this hypothetical 3a:00.0 device is:
>
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
> pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
> 00:1b.0: Root Port to [bus 02-3a]
> 02:00.0: Switch Upstream Port to [bus 03-3a]
> 03:02.0: Switch Downstream Port to [bus 3a]
> 3a:00.0: reg 0x10: [io 0x????-0x????]
>
> None of the bridges (00:1b.0, 02:00.0, 03:02.0) currently has an open
> I/O window, but there's space available on bus 00 and windows *could*
> be opened.
Right.
> I guess it comes down to that ordering problem: this Notify() and
> acpiphp_native_scan_bridge() happens before pnp/system.c reserves
> things, so we don't yet know what space is actually available.
Yes, exactly. The some parts of the range 0x0d00-0xffff belongs to
PNP/ACPI so if we open the window now it might stomp over some crucial
ACPI resources.
> If firmware had configured I/O windows for these bridges, 3a:00.0
> would probably work. But it doesn't seem right that we would depend
> on that firmware configuration.
Well, if firmware has configured the topology in such way I think we
should at least try to trust it is intentional. Even if the (PCIe)
device might have an I/O BAR it should work without it (only legacy
endpoints are allowed to create I/O requests, althought many non-legacy
seem to include I/O BAR).
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
2019-10-30 15:05 [PATCH] ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge Mika Westerberg
2019-11-06 12:28 ` Rafael J. Wysocki
2019-11-06 23:24 ` Bjorn Helgaas
@ 2019-11-12 0:19 ` Bjorn Helgaas
2 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2019-11-12 0:19 UTC (permalink / raw)
To: Mika Westerberg
Cc: Rafael J. Wysocki, Len Brown, Valerio Passini, linux-acpi, linux-pci
On Wed, Oct 30, 2019 at 06:05:45PM +0300, Mika Westerberg wrote:
> Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
> PCI: Don't scan bridges managed by native hotplug") prevents some recent
> LG and HP laptops from booting with endless loop of:
>
> [ 26.237796] ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
> [ 26.238699] ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
> [ 26.239306] ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
> ...
>
> What seems to happen is that during boot, after the initial PCI
> enumeration when EC is enabled the platform triggers ACPI Notify() to
> one of the root ports. The root port itself looks like this:
>
> [ 0.723757] pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
> [ 0.723765] pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
> [ 0.723773] pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
>
> The BIOS has configured the root port so that it does not have I/O
> bridge window.
>
> Now when the ACPI Notify() is triggered ACPI hotplug handler calls
> acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system
> is using native PCIe hotplug) and pci_assign_unassigned_bridge_resources()
> to allocate resources.
>
> The device connected to the root port is a PCIe switch (Thunderbolt
> controller) with two hotplug downstream ports. Because of the hotplug
> ports __pci_bus_size_bridges() tries to add "additional I/O" of 256
> bytes to each (DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k
> as that's the minimum I/O window size so each hotplug port gets 4k I/O
> window and the same happens for the root port (which is also hotplug
> port). This means 3 * 4k = 12k I/O window.
>
> Because of this pci_assign_unassigned_bridge_resources() ends up opening
> a I/O bridge window for the root port at first available I/O address
> which seems to be in range 0x1000 - 0x3fff. Normally this range is used
> for ACPI stuff such as GPE bits (below is part of /proc/ioports):
>
> 1800-1803 : ACPI PM1a_EVT_BLK
> 1804-1805 : ACPI PM1a_CNT_BLK
> 1808-180b : ACPI PM_TMR
> 1810-1815 : ACPI CPU throttle
> 1850-1850 : ACPI PM2_CNT_BLK
> 1854-1857 : pnp 00:05
> 1860-187f : ACPI GPE0_BLK
>
> However, when the ACPI Notify() happened this range was not yet reserved
> for ACPI/PNP (that happens later) so PCI gets it. It then starts writing
> to this range and accidentally stomps over GPE bits among other things
> causing the endless stream of messages about missing GPE handler.
>
> This problem does not happen if "pci=hpiosize=0" is passed in the kernel
> command line. The reason is that then the kernel does not try to
> allocate the additional 256 bytes for each hotplug port.
>
> Fix this by allocating resources directly below the non-hotplug bridges
> where a new device may appear as a result of ACPI Notify(). This avoids
> the hotplug bridges and prevents opening the additional I/O window.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
> Fixes: 84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
> Cc: stable@vger.kernel.org
> Reported-by: Valerio Passini <passini.valerio@gmail.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Applied with Rafael's reviewed-by to pci/hotplug for v5.5, thanks!
> ---
> I was able to reproduce this without access to the affected system by
> forcing ACPI core to send Notify() to the TBT root port like this:
>
> void acpi_notify_rp(void)
> {
> struct acpi_device *adev;
> acpi_handle handle;
>
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB.PCI0.RP17", &handle)))
> return;
>
> if (acpi_bus_get_device(handle, &adev))
> return;
>
> dev_info(&adev->dev, "queueing hotplug\n");
> acpiphp_hotplug_notify(adev, ACPI_NOTIFY_BUS_CHECK);
> }
>
> and calling it from acpi_init() directly after acpi_ec_init().
>
> drivers/pci/hotplug/acpiphp_glue.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index e4c46637f32f..b3869951c0eb 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -449,8 +449,15 @@ static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
>
> /* Scan non-hotplug bridges that need to be reconfigured */
> for_each_pci_bridge(dev, bus) {
> - if (!hotplug_is_native(dev))
> - max = pci_scan_bridge(bus, dev, max, 1);
> + if (hotplug_is_native(dev))
> + continue;
> +
> + max = pci_scan_bridge(bus, dev, max, 1);
> + if (dev->subordinate) {
> + pcibios_resource_survey_bus(dev->subordinate);
> + pci_bus_size_bridges(dev->subordinate);
> + pci_bus_assign_resources(dev->subordinate);
> + }
> }
> }
>
> @@ -480,7 +487,6 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
> if (PCI_SLOT(dev->devfn) == slot->device)
> acpiphp_native_scan_bridge(dev);
> }
> - pci_assign_unassigned_bridge_resources(bus->self);
> } else {
> LIST_HEAD(add_list);
> int max, pass;
> --
> 2.23.0
>
^ permalink raw reply [flat|nested] 10+ messages in thread