From: Bjorn Helgaas <helgaas@kernel.org>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Len Brown <lenb@kernel.org>,
	Mario.Limonciello@dell.com,
	Michael Jamet <michael.jamet@intel.com>,
	Yehezkel Bernat <YehezkelShB@gmail.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Lukas Wunner <lukas@wunner.de>,
	linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v5 2/9] PCI: Take bridge window alignment into account when distributing resources
Date: Wed, 25 Apr 2018 17:38:54 -0500
Message-ID: <20180425223853.GA225403@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20180416103453.46232-3-mika.westerberg@linux.intel.com>

On Mon, Apr 16, 2018 at 01:34:46PM +0300, Mika Westerberg wrote:
> When hot-adding a PCIe switch, the way we currently distribute resources
> does not always work well because devices connected to the switch might
> need to have their MMIO resources aligned to something other than the
> default 1 MB boundary. For example, the Intel Gigabit ET2 quad port server
> adapter includes a PCIe switch leading to 4 x GbE NIC devices that want
> to have their MMIO resources aligned to a 2 MB boundary instead.
> 
> The current resource distribution code does not take this alignment into
> account and might try to add too many resources for the extension
> hotplug bridge(s). The resulting bridge window is too big which makes
> the resource assignment operation fail, and we are left with a bridge
> window with minimal amount (1 MB) of MMIO space.
> 
> Here is what happens when an Intel Gigabit ET2 quad port server adapter
> is hot-added:
> 
>   pci 0000:39:00.0: BAR 14: assigned [mem 0x53300000-0x6a0fffff]
>                                           ^^^^^^^^^^
>   pci 0000:3a:01.0: BAR 14: assigned [mem 0x53400000-0x547fffff]
>                                           ^^^^^^^^^^
> The above shows that the downstream bridge (3a:01.0) window is aligned
> to 2 MB instead of 1 MB as is the upstream bridge (39:00.0) window. The
> remaining MMIO space (0x15a00000) is assigned to the hotplug bridge
> (3a:04.0) but it fails:
> 
>   pci 0000:3a:04.0: BAR 14: no space for [mem size 0x15a00000]
>   pci 0000:3a:04.0: BAR 14: failed to assign [mem size 0x15a00000]
> 
> The MMIO resource is calculated as follows:
> 
>   start = 0x54800000
>   end = 0x54800000 + 0x15a00000 - 1 = 0x6a1fffff
> 
> This results in bridge window [mem 0x54800000-0x6a1fffff], which ends
> after the upstream bridge window [mem 0x53300000-0x6a0fffff] explaining
> the above failure. Because of this Linux falls back to the default
> allocation of 1 MB as can be seen from 'lspci' output:
> 
>  39:00.0 Memory behind bridge: 53300000-6a0fffff [size=366M]
>    3a:01.0 Memory behind bridge: 53400000-547fffff [size=20M]
>    3a:04.0 Memory behind bridge: 53300000-533fffff [size=1M]
> 
> The hotplug bridge 3a:04.0 only occupies 1 MB MMIO window which is
> clearly not enough for extending the PCIe topology later if more devices
> are to be hot-added.
> 
> Fix this by substracting properly aligned non-hotplug downstream bridge
> window size from the remaining resources used for extension. After this
> change the resource allocation looks like:
> 
>   39:00.0 Memory behind bridge: 53300000-6a0fffff [size=366M]
>     3a:01.0 Memory behind bridge: 53400000-547fffff [size=20M]
>     3a:04.0 Memory behind bridge: 54800000-6a0fffff [size=345M]
> 
> This matches the expectation. All the extra MMIO resource space (345 MB)
> is allocated to the extension hotplug bridge (3a:04.0).

Sorry, I've spent a lot of time trying to trace through this code, and
I'm still hopelessly confused.  Can you post the complete "lspci -vv"
output and the dmesg log (including the hot-add event) somewhere and
include a URL to it?

I think I understand the problem you're solving:

  - You have 366M, 1M-aligned, available for things on bus 3a
  - You assign 20M, 2M-aligned to 3a:01.0
  - This leaves 346M for other things on bus 3a, but it's not all
    contiguous because the 20M is in the middle.
  - The remaining 346M might be 1M on one side and 345M on the other
    (and there are many other possibilities, e.g., 3M + 343M, 5M +
    341M, ..., 345M + 1M).
  - The current code tries to assign all 346M to 3a:04.0, which
    fails because that space is not contiguous, so it falls back to
    allocating 1M, which works but is insufficient for future
    hot-adds.

Obviously this patch makes *this* situation work: it assigns 345M to
3a:04.0 and (I assume) leaves the 1M unused.  But I haven't been able
to convince myself that this patch works *in general*.

For example, what if we assigned the 20M from the end of the 366M
window instead of the beginning, so the 345M piece is below the 20M
and there's 1M left above it?  That is legal and should work, but I
suspect this patch would ignore the 345M piece and again assign 1M to
3a:04.0.

Or what if there are several hotplug bridges on bus 3a?  This example
has two, but there could be many more.

Or what if there are normal bridges as well as hotplug bridges on bus
3a?  Or if they're in arbitrary orders?

> Fixes: 1a5767725cec ("PCI: Distribute available resources to hotplug-capable bridges")
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: stable@vger.kernel.org

Given my confusion about this, I doubt this satisfies the stable
kernel "obviously correct" rule.

s/substracting/subtracting/ above

> ---
>  drivers/pci/setup-bus.c | 41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 072784f55ea5..eb3059fb7f63 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1878,6 +1878,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  	resource_size_t available_mmio, resource_size_t available_mmio_pref)
>  {
>  	resource_size_t remaining_io, remaining_mmio, remaining_mmio_pref;
> +	resource_size_t io_start, mmio_start, mmio_pref_start;
>  	unsigned int normal_bridges = 0, hotplug_bridges = 0;
>  	struct resource *io_res, *mmio_res, *mmio_pref_res;
>  	struct pci_dev *dev, *bridge = bus->self;
> @@ -1942,11 +1943,16 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			remaining_mmio_pref -= resource_size(res);
>  	}
>  
> +	io_start = io_res->start;
> +	mmio_start = mmio_res->start;
> +	mmio_pref_start = mmio_pref_res->start;
> +
>  	/*
>  	 * Go over devices on this bus and distribute the remaining
>  	 * resource space between hotplug bridges.
>  	 */
>  	for_each_pci_bridge(dev, bus) {
> +		resource_size_t align;
>  		struct pci_bus *b;
>  
>  		b = dev->subordinate;
> @@ -1964,7 +1970,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  				available_io, available_mmio,
>  				available_mmio_pref);
>  		} else if (dev->is_hotplug_bridge) {
> -			resource_size_t align, io, mmio, mmio_pref;
> +			resource_size_t io, mmio, mmio_pref;
>  
>  			/*
>  			 * Distribute available extra resources equally
> @@ -1977,11 +1983,13 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			io = div64_ul(available_io, hotplug_bridges);
>  			io = min(ALIGN(io, align), remaining_io);
>  			remaining_io -= io;
> +			io_start += io;
>  
>  			align = pci_resource_alignment(bridge, mmio_res);
>  			mmio = div64_ul(available_mmio, hotplug_bridges);
>  			mmio = min(ALIGN(mmio, align), remaining_mmio);
>  			remaining_mmio -= mmio;
> +			mmio_start += mmio;
>  
>  			align = pci_resource_alignment(bridge, mmio_pref_res);
>  			mmio_pref = div64_ul(available_mmio_pref,
> @@ -1989,9 +1997,40 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			mmio_pref = min(ALIGN(mmio_pref, align),
>  					remaining_mmio_pref);
>  			remaining_mmio_pref -= mmio_pref;
> +			mmio_pref_start += mmio_pref;
>  
>  			pci_bus_distribute_available_resources(b, add_list, io,
>  							       mmio, mmio_pref);
> +		} else {
> +			/*
> +			 * For normal bridges, track start of the parent
> +			 * bridge window to make sure we align the
> +			 * remaining space which is distributed to the
> +			 * hotplug bridges properly.
> +			 */
> +			resource_size_t aligned;
> +			struct resource *res;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 0];
> +			io_start += resource_size(res);
> +			aligned = ALIGN(io_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > io_start)
> +				remaining_io -= aligned - io_start;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 1];
> +			mmio_start += resource_size(res);
> +			aligned = ALIGN(mmio_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > mmio_start)
> +				remaining_mmio -= aligned - mmio_start;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 2];
> +			mmio_pref_start += resource_size(res);
> +			aligned = ALIGN(mmio_pref_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > mmio_pref_start)
> +				remaining_mmio_pref -= aligned - mmio_pref_start;
>  		}
>  	}
>  }
> -- 
> 2.16.3
> 

