linux-pci.vger.kernel.org archive mirror
* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
       [not found] <SL2P216MB01874DFDDBDE49B935A9B1B380E50@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM>
@ 2019-06-19 16:21 ` Logan Gunthorpe
  2019-06-20  0:44   ` Nicholas Johnson
  2019-06-27  7:50   ` Nicholas Johnson
  0 siblings, 2 replies; 21+ messages in thread
From: Logan Gunthorpe @ 2019-06-19 16:21 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: benh, Bjorn Helgaas, linux-pci

*(cc'd back Bjorn and the list)

On 2019-06-19 8:00 a.m., Nicholas Johnson wrote:
> Hi Ben and Logan,
> 
> It looks like my git send-email has not been working correctly since I
> started trying to get these patches accepted. I may have remedied this
> now, but I have seen that Logan tried to find these patches and failed.
> So as a courtesy until I post PATCH v7 (hopefully correctly, this time),
> I am forwarding you the patches. I hope you like them. I would love to 
> know of any concerns or questions you may have, and / or what happens if 
> you test them. Thanks and all the best!
> 
> ----- Forwarded message from Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> -----
> 
> Date: Thu, 23 May 2019 06:29:27 +0800
> From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> To: linux-kernel@vger.kernel.org
> Cc: linux-pci@vger.kernel.org, bhelgaas@google.com, mika.westerberg@linux.intel.com, corbet@lwn.net, Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> Subject: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window
> X-Mailer: git-send-email 2.19.1
> 
> Background
> ==========================================================================
> 
> Solve bug report:
> https://bugzilla.kernel.org/show_bug.cgi?id=203243

This is all kinds of confusing... the bug report just seems to be a copy
of the patch set. The description of the actual symptoms of the problem
appears to be missing from all of it.

> Currently, the kernel can sometimes assign the MMIO_PREF window
> additional size into the MMIO window, resulting in double the MMIO
> additional size, even if the MMIO_PREF window was successful.
> 
> This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
> fails. In the next pass, because MMIO_PREF is already assigned, the
> attempt to assign MMIO_PREF returns an error code instead of success
> (nothing more to do, already allocated).
> 
> Example of problem (more context can be found in the bug report URL):
> 
> Mainline kernel:
> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
> pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
> 
> Patched kernel:
> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
> pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
> 
> This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
> with the same configuration, with an Ubuntu mainline kernel and a kernel
> patched with this patch series.
> 
> This patch is vital for the next patch in the series. The next patch
> allows the user to specify MMIO and MMIO_PREF independently. If the
> MMIO_PREF is set to be very large, this bug will end up more than
> doubling the MMIO size. The bug results in the MMIO_PREF being added to
> the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
> With a large MMIO_PREF, without this patch, the MMIO window will likely
> fail to be assigned altogether due to lack of 32-bit address space.
> 
> Patch notes
> ==========================================================================
> 
> Change find_free_bus_resource() to not skip assigned resources with
> non-null parent.
> 
> Add checks in pbus_size_io() and pbus_size_mem() to return success if
> resource returned from find_free_bus_resource() is already allocated.
> 
> This avoids pbus_size_io() and pbus_size_mem() returning error code to
> __pci_bus_size_bridges() when a resource has been successfully assigned
> in a previous pass. This fixes the existing behaviour where space for a
> resource could be reserved multiple times in different parent bridge
> windows. This also greatly reduces the number of failed BAR messages in
> dmesg when Linux assigns resources.

This patch looks like it addresses the same bug that I tracked down
earlier but solved in a slightly different way. See this patch[1] which
is still under review. Can you maybe test it and see if it solves the
same problem?

Thanks,

Logan

[1]
https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-19 16:21 ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Logan Gunthorpe
@ 2019-06-20  0:44   ` Nicholas Johnson
  2019-06-20  0:49     ` Logan Gunthorpe
  2019-06-20 13:43     ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Bjorn Helgaas
  2019-06-27  7:50   ` Nicholas Johnson
  1 sibling, 2 replies; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-20  0:44 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: benh, Bjorn Helgaas, linux-pci

On Wed, Jun 19, 2019 at 10:21:21AM -0600, Logan Gunthorpe wrote:
> *(cc'd back Bjorn and the list)
> 
> On 2019-06-19 8:00 a.m., Nicholas Johnson wrote:
> > Hi Ben and Logan,
> > 
> > It looks like my git send-email has been not working correctly since I
> > started trying to get these patches accepted. I may have remedied this
> > now, but I have seen that Logan tried to find these patches and failed.
> > So as a courtesy until I post PATCH v7 (hopefully correctly, this time),
> > I am forwarding you the patches. I hope you like them. I would love to 
> > know of any concerns or questions you may have, and / or what happens if 
> > you test them. Thanks and all the best!
> > 
> > ----- Forwarded message from Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> -----
> > 
> > Date: Thu, 23 May 2019 06:29:27 +0800
> > From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> > To: linux-kernel@vger.kernel.org
> > Cc: linux-pci@vger.kernel.org, bhelgaas@google.com, mika.westerberg@linux.intel.com, corbet@lwn.net, Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> > Subject: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window
> > X-Mailer: git-send-email 2.19.1
> > 
> > Background
> > ==========================================================================
> > 
> > Solve bug report:
> > https://bugzilla.kernel.org/show_bug.cgi?id=203243
> 
> This is all kinds of confusing... the bug report just seems to be a copy
> of the patch set. The description of the actual symptoms of the problem
> appears to be missing from all of it.
I believe everything to be there, but I can take another look and add 
more details. It is possible I lost track of what I had written where.

There are common elements which I borrowed from the patchset or 
vice-versa, like the pin diagram for using the Thunderbolt add-in card 
for testing.

> 
> > Currently, the kernel can sometimes assign the MMIO_PREF window
> > additional size into the MMIO window, resulting in double the MMIO
> > additional size, even if the MMIO_PREF window was successful.
> > 
> > This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
> > fails. In the next pass, because MMIO_PREF is already assigned, the
> > attempt to assign MMIO_PREF returns an error code instead of success
> > (nothing more to do, already allocated).
> > 
> > Example of problem (more context can be found in the bug report URL):
> > 
> > Mainline kernel:
> > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
> > pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
> > 
> > Patched kernel:
> > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
> > pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
> > 
> > This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
> > with the same configuration, with a Ubuntu mainline kernel and a kernel
> > patched with this patch series.
> > 
> > This patch is vital for the next patch in the series. The next patch
> > allows the user to specify MMIO and MMIO_PREF independently. If the
> > MMIO_PREF is set to be very large, this bug will end up more than
> > doubling the MMIO size. The bug results in the MMIO_PREF being added to
> > the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
> > With a large MMIO_PREF, without this patch, the MMIO window will likely
> > fail to be assigned altogether due to lack of 32-bit address space.
> > 
> > Patch notes
> > ==========================================================================
> > 
> > Change find_free_bus_resource() to not skip assigned resources with
> > non-null parent.
> > 
> > Add checks in pbus_size_io() and pbus_size_mem() to return success if
> > resource returned from find_free_bus_resource() is already allocated.
> > 
> > This avoids pbus_size_io() and pbus_size_mem() returning error code to
> > __pci_bus_size_bridges() when a resource has been successfully assigned
> > in a previous pass. This fixes the existing behaviour where space for a
> > resource could be reserved multiple times in different parent bridge
> > windows. This also greatly reduces the number of failed BAR messages in
> > dmesg when Linux assigns resources.
> 
> This patch looks like the same bug that I tracked down earlier but I
> solved in a slightly different way. See this patch[1] which is still
> under review. Can you maybe test it and see if it solves the same problem?

I read [1] and it is definitely the same bug, without a doubt. This is 
fantastic because it means I have somebody to back me up on this. I will 
test the patch as soon as I can - perhaps after work today.

My initial thoughts on patch [1] are that restricting 64-bit BARs to 
64-bit windows might break assigning 64-bit BARs on bridges without the 
optional prefetchable window. My patch should not have that issue - but 
after I have tested [1], it might turn out to be fine.

Correct me if my assumptions about windows are wrong. My understanding 
cannot be perfect. As far as I know, 64-bit BARs should 
always be prefetchable, but I own the Aquantia AQC-107S NIC and it has 
three 64-bit non-pref BARs. It happens that they are assigned into the 
32-bit window. I will see if [1] patch prevents that from happening or 
not.

Cheers

> 
> Thanks,
> 
> Logan
> 
> [1]
> https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u


* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-20  0:44   ` Nicholas Johnson
@ 2019-06-20  0:49     ` Logan Gunthorpe
  2019-06-23  5:01       ` Nicholas Johnson
  2019-06-20 13:43     ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Bjorn Helgaas
  1 sibling, 1 reply; 21+ messages in thread
From: Logan Gunthorpe @ 2019-06-20  0:49 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: benh, Bjorn Helgaas, linux-pci



On 2019-06-19 6:44 p.m., Nicholas Johnson wrote:
>> This is all kinds of confusing... the bug report just seems to be a copy
>> of the patch set. The description of the actual symptoms of the problem
>> appears to be missing from all of it.
> I believe everything to be there, but I can take another look and add 
> more details. It is possible I lost track of what I had written where.
> 
> There are common elements which I borrowed from the patchset or 
> vice-versa, like the pin diagram for using the Thunderbolt add-in card 
> for testing.

What's missing are symptoms of the bug or what you are actually seeing
with what hardware. The closest thing to that is the bug's title. But
it's not clear what the problem is with having a double size MMIO window.

The pin diagram and stuff is just noise to me because I don't have that
hardware and am not going to buy it just to try to figure out if there
is a bug there or not.

>>
>>> Currently, the kernel can sometimes assign the MMIO_PREF window
>>> additional size into the MMIO window, resulting in double the MMIO
>>> additional size, even if the MMIO_PREF window was successful.
>>>
>>> This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
>>> fails. In the next pass, because MMIO_PREF is already assigned, the
>>> attempt to assign MMIO_PREF returns an error code instead of success
>>> (nothing more to do, already allocated).
>>>
>>> Example of problem (more context can be found in the bug report URL):
>>>
>>> Mainline kernel:
>>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
>>> pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
>>>
>>> Patched kernel:
>>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
>>> pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
>>>
>>> This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
>>> with the same configuration, with a Ubuntu mainline kernel and a kernel
>>> patched with this patch series.
>>>
>>> This patch is vital for the next patch in the series. The next patch
>>> allows the user to specify MMIO and MMIO_PREF independently. If the
>>> MMIO_PREF is set to be very large, this bug will end up more than
>>> doubling the MMIO size. The bug results in the MMIO_PREF being added to
>>> the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
>>> With a large MMIO_PREF, without this patch, the MMIO window will likely
>>> fail to be assigned altogether due to lack of 32-bit address space.
>>>
>>> Patch notes
>>> ==========================================================================
>>>
>>> Change find_free_bus_resource() to not skip assigned resources with
>>> non-null parent.
>>>
>>> Add checks in pbus_size_io() and pbus_size_mem() to return success if
>>> resource returned from find_free_bus_resource() is already allocated.
>>>
>>> This avoids pbus_size_io() and pbus_size_mem() returning error code to
>>> __pci_bus_size_bridges() when a resource has been successfully assigned
>>> in a previous pass. This fixes the existing behaviour where space for a
>>> resource could be reserved multiple times in different parent bridge
>>> windows. This also greatly reduces the number of failed BAR messages in
>>> dmesg when Linux assigns resources.
>>
>> This patch looks like the same bug that I tracked down earlier but I
>> solved in a slightly different way. See this patch[1] which is still
>> under review. Can you maybe test it and see if it solves the same problem?
> 
> I read [1] and it is definitely the same bug, without a doubt. This is 
> fantastic because it means I have somebody to back me up on this. I will 
> test the patch as soon as I can - perhaps after work today.
> 
> My initial thoughts of [1] patch are that restricting 64-bit BARs to 
> 64-bit windows might break assigning 64-bit BARs on bridges without the 
> optional prefetchable window. My patch should not have that issue - but 
> after I have tested [1], it might turn out to be fine.
> 
> Correct me if I am wrong about assumptions about windows. My 
> understanding cannot be perfect. As far as I know, 64-bit BARs should 
> always be prefetchable, but I own the Aquantia AQC-107S NIC and it has 
> three 64-bit non-pref BARs. It happens that they are assigned into the 
> 32-bit window. I will see if [1] patch prevents that from happening or 
> not.

As best as I can tell, the patches should have identical functionality.
My patch ignores the error returned by pbus_size_mem(); your patch
prevents the function from returning an error in the same case.

Logan


* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-20  0:44   ` Nicholas Johnson
  2019-06-20  0:49     ` Logan Gunthorpe
@ 2019-06-20 13:43     ` Bjorn Helgaas
  2019-06-20 23:24       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 21+ messages in thread
From: Bjorn Helgaas @ 2019-06-20 13:43 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: Logan Gunthorpe, benh, linux-pci

On Thu, Jun 20, 2019 at 12:44:11AM +0000, Nicholas Johnson wrote:

> Correct me if I am wrong about assumptions about windows. My
> understanding cannot be perfect. As far as I know, 64-bit BARs
> should always be prefetchable, 

There's no requirement that a 64-bit BAR be prefetchable.

  - BARs of PCIe Functions must be prefetchable unless they have read
    side effects or can't tolerate write merging (PCIe r5.0, sec
    7.5.1.2.1).

  - BARs of PCIe Functions other than Legacy Endpoints must be 64-bit
    if they are prefetchable (sec 7.5.1.2.1).

  - Bridge non-prefetchable memory windows are limited to 32-bit
    (7.5.1.3.8).

  - There's some ambiguity in the spec about bridge prefetchable
    memory windows.  Current specs claim 64-bit addresses must be
    supported (sec 7.5.1.3.9), but also say the upper 32 bits are
    optional (sec 7.5.1.3.10).  Both 32- and 64-bit versions
    definitely exist.

> but I own the Aquantia AQC-107S NIC and it has three 64-bit non-pref
> BARs. It happens that they are assigned into the 32-bit window.

This is as it should be.  Non-prefetchable windows are 32 bits, and
in general non-prefetchable BARs must be placed there.

There is some wiggle room in pure PCIe systems because PCIe reads
always contain an explicit length, so in some cases it is safe to
put a non-prefetchable BAR in a prefetchable window (see the
implementation note in sec 7.5.1.2.1).  But I don't think Linux
currently implements this.


* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-20 13:43     ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Bjorn Helgaas
@ 2019-06-20 23:24       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-20 23:24 UTC (permalink / raw)
  To: Bjorn Helgaas, Nicholas Johnson; +Cc: Logan Gunthorpe, linux-pci

On Thu, 2019-06-20 at 08:43 -0500, Bjorn Helgaas wrote:
> This is as it should be.  Non-prefetchable windows are 32 bits, and
> in general non-prefetchable BARs must be placed there.
> 
> There is some wiggle room in pure PCIe systems because PCIe reads
> always contain an explicit length, so in some cases it is safe to
> put a non-prefetchable BAR in a prefetchable window (see the
> implementation note in sec 7.5.1.2.1).  But I don't think Linux
> currently implements this.

We don't, we probably should, but seeing our current allocation code, I
dread the end result ...

We would need a host bridge flag to indicate it's safe (no byte merging
at the PHB). I know most host bridge implementations don't
differentiate prefetchable from non-prefetchable outbound windows, so
we should be fine, and the other side effects are generally attributes
of the mapping done in the MMU and thus depend on the device BAR
attribute, not the bridge windows along the path.

I'm not 100% sure how/if x86 throws a wrench into this with MTRRs
(could a BIOS set up one of these things to cover a bridge/switch
prefetchable window? That would be a bad idea, but bad ideas are what
BIOS vendors often come up with).

Cheers,
Ben.




* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-20  0:49     ` Logan Gunthorpe
@ 2019-06-23  5:01       ` Nicholas Johnson
  2019-06-24  9:13         ` Multitude of resource assignment functions Benjamin Herrenschmidt
  0 siblings, 1 reply; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-23  5:01 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: benh, Bjorn Helgaas, linux-pci

Bjorn, please weigh in on this; see below.

On Wed, Jun 19, 2019 at 06:49:45PM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2019-06-19 6:44 p.m., Nicholas Johnson wrote:
> >> This is all kinds of confusing... the bug report just seems to be a copy
> >> of the patch set. The description of the actual symptoms of the problem
> >> appears to be missing from all of it.
> > I believe everything to be there, but I can take another look and add 
> > more details. It is possible I lost track of what I had written where.
> > 
> > There are common elements which I borrowed from the patchset or 
> > vice-versa, like the pin diagram for using the Thunderbolt add-in card 
> > for testing.
> 
> What's missing are symptoms of the bug or what you are actually seeing
> with what hardware. The closest thing to that is the bug's title. But
> it's not clear what the problem is with having a double size MMIO window.
> 
> The pin diagram and stuff is just noise to me because I don't have that
> hardware and am not going to buy it just to try to figure out if there
> is a bug there or not.
> 
> >>
> >>> Currently, the kernel can sometimes assign the MMIO_PREF window
> >>> additional size into the MMIO window, resulting in double the MMIO
> >>> additional size, even if the MMIO_PREF window was successful.
> >>>
> >>> This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
> >>> fails. In the next pass, because MMIO_PREF is already assigned, the
> >>> attempt to assign MMIO_PREF returns an error code instead of success
> >>> (nothing more to do, already allocated).
> >>>
> >>> Example of problem (more context can be found in the bug report URL):
> >>>
> >>> Mainline kernel:
> >>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
> >>> pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
> >>>
> >>> Patched kernel:
> >>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
> >>> pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
> >>>
> >>> This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
> >>> with the same configuration, with a Ubuntu mainline kernel and a kernel
> >>> patched with this patch series.
> >>>
> >>> This patch is vital for the next patch in the series. The next patch
> >>> allows the user to specify MMIO and MMIO_PREF independently. If the
> >>> MMIO_PREF is set to be very large, this bug will end up more than
> >>> doubling the MMIO size. The bug results in the MMIO_PREF being added to
> >>> the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
> >>> With a large MMIO_PREF, without this patch, the MMIO window will likely
> >>> fail to be assigned altogether due to lack of 32-bit address space.
> >>>
> >>> Patch notes
> >>> ==========================================================================
> >>>
> >>> Change find_free_bus_resource() to not skip assigned resources with
> >>> non-null parent.
> >>>
> >>> Add checks in pbus_size_io() and pbus_size_mem() to return success if
> >>> resource returned from find_free_bus_resource() is already allocated.
> >>>
> >>> This avoids pbus_size_io() and pbus_size_mem() returning error code to
> >>> __pci_bus_size_bridges() when a resource has been successfully assigned
> >>> in a previous pass. This fixes the existing behaviour where space for a
> >>> resource could be reserved multiple times in different parent bridge
> >>> windows. This also greatly reduces the number of failed BAR messages in
> >>> dmesg when Linux assigns resources.
> >>
> >> This patch looks like the same bug that I tracked down earlier but I
> >> solved in a slightly different way. See this patch[1] which is still
> >> under review. Can you maybe test it and see if it solves the same problem?
> > 
> > I read [1] and it is definitely the same bug, without a doubt. This is 
> > fantastic because it means I have somebody to back me up on this. I will 
> > test the patch as soon as I can - perhaps after work today.
> > 
> > My initial thoughts of [1] patch are that restricting 64-bit BARs to 
> > 64-bit windows might break assigning 64-bit BARs on bridges without the 
> > optional prefetchable window. My patch should not have that issue - but 
> > after I have tested [1], it might turn out to be fine.
> > 
> > Correct me if I am wrong about assumptions about windows. My 
> > understanding cannot be perfect. As far as I know, 64-bit BARs should 
> > always be prefetchable, but I own the Aquantia AQC-107S NIC and it has 
> > three 64-bit non-pref BARs. It happens that they are assigned into the 
> > 32-bit window. I will see if [1] patch prevents that from happening or 
> > not.
> 
> As best as I can tell the patches should have identical functionality.
> My patch ignores the error returned by pbus_size_mem() your patch forces
> the function from returning an error inside it for the same case.
> 
> Logan
I finally tested this (not rigorously) and it appears to work the same 
as my patch and did not do anything strange. It solves the problem that 
it set out to fix. If you want to add my tested-by then that is fine.

I still slightly prefer my patch because it corrects the return codes 
instead of ignoring them. However, both patches have merits.

Bjorn, please advise on what you think is better and / or easier to sign 
off on.

In my PATCH v7, should I consider moving this to the end of the series 
so that any changes do not impact anything else? That way, I can remove 
the patch if we take Logan's version instead.

Also, if you decide to take Logan's patch, which of the following do I 
do?

a) drop the patch from my series and leave it at that (in this case I 
hope I can have a co-reported-by in Logan's patch when it gets accepted)

b) merge Logan's patch into my series, giving credit

c) something else


If you decide to take mine then we will need to discuss Logan's concerns 
about my documentation and I will need to update some more information 
into the bug report (or link both bug reports into the notes).

Cheers


* Multitude of resource assignment functions
  2019-06-23  5:01       ` Nicholas Johnson
@ 2019-06-24  9:13         ` Benjamin Herrenschmidt
  2019-06-24 16:45           ` Logan Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24  9:13 UTC (permalink / raw)
  To: Nicholas Johnson, Logan Gunthorpe; +Cc: Bjorn Helgaas, linux-pci

So I'm staring at these three mostly at this point:

void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
void pci_assign_unassigned_bus_resources(struct pci_bus *bus)

Now we have 3 functions that fundamentally have the same purpose,
assign what was left unassigned down a PCI hierarchy, but are going
about it in quite a different manner.

Now to make things worse, there's little consistency in which one gets
called where. We have PCI controllers calling the first one sometimes,
the last one sometimes, or doing the manual:

	pci_bus_size_bridges(bus);
	pci_bus_assign_resources(bus);

Or variants with pci_bus_size_bridges sometimes missing etc...

Now I've consolidated a lot of that and removed all of those "manual"
cases in my work-in-progress branch, but I'd like to clarify and
possibly remove the 3 ones above.

Let's start with the last one, pci_assign_unassigned_bus_resources, as
it's the easiest to remove from users in drivers/pci/controller/* (and
replace with pci_assign_unassigned_root_bus_resources typically).

This leaves it used in a couple of corner cases, most of them I think
I can kill, and .... sysfs 'rescan'.

The interesting thing about that function is that it tries to avoid
resizing the bridge of the bus passed as an argument; it will only
resize subordinate bridges. From the changelog it was created for
hotplug bridges, but almost nothing uses it (some powerpc stuff I can
probably kill) ... and sysfs rescan.

I wonder what the remaining purpose of it is. sysfs rescan could
probably be cleaned up to use the first two... Also, why avoid resizing
the bridge itself?

That leads to the difference between
pci_assign_unassigned_root_bus_resources()
and pci_assign_unassigned_bridge_resources().

The names are misleading. The former isn't just about the root bus
resources. It's about the entire tree underneath the root bus.

The main differences that I can tell are:

 - pci_assign_unassigned_root_bus_resources() may or may not try to
realloc, depending on a combination of command line args, config
option, presence of IOV devices etc... while
pci_assign_unassigned_bridge_resources() always will

 - pci_assign_unassigned_bridge_resources() will call
pci_bridge_distribute_available_resources() to distribute resource to
child hotplug bridges, while pci_assign_unassigned_root_bus_resources()
won't.

Now, are we 100% confident we want to keep those discrepancies?

It feels like the former function is intended for boot time resource
allocation, and the latter for hotplug, but I can't make sense of why
the resources of a device behind a hotplug bridge should be allocated
differently depending on whether that device was plugged at boot or
plugged later.

Also, why not distribute available resources at boot between top-level
hotplug bridges?

I'm not even going into the question of why the resource
sizing/assignment code is so obscure/cryptic/incomprehensible, that's
another kettle of fish, but I'd like to at least clarify the usage
patterns a bit better.

Cheers,
Ben.





* Re: Multitude of resource assignment functions
  2019-06-24  9:13         ` Multitude of resource assignment functions Benjamin Herrenschmidt
@ 2019-06-24 16:45           ` Logan Gunthorpe
  2019-06-27  7:40             ` Nicholas Johnson
  0 siblings, 1 reply; 21+ messages in thread
From: Logan Gunthorpe @ 2019-06-24 16:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Nicholas Johnson; +Cc: Bjorn Helgaas, linux-pci



On 2019-06-24 3:13 a.m., Benjamin Herrenschmidt wrote:
> So I'm staring at these three mostly at this point:
> 
> void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
> void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> 
> Now we have 3 functions that fundamentally have the same purpose,
> assign what was left unassigned down a PCI hierarchy, but are going
> about it in quite a different manner.
> 
> Now to make things worse, there's little consistency in which one gets
> called where. We have PCI controllers calling the first one sometimes,
> the last one sometimes, or doing the manual:
> 
> 	pci_bus_size_bridges(bus);
> 	pci_bus_assign_resources(bus);
> 
> Or variants with pci_bus_size_bridges sometimes missing etc...

I suspect there isn't much rhyme or reason to it. None of this is well
documented so developers writing the controller drivers probably didn't
have a good idea of what the correct thing to do was, and just stuck
with the first thing that worked.

> Now I've consolidated a lot of that and removed all of those "manual"
> cases in my work-in-progress branch, but I'd like to clarify and
> possibly remove the 3 ones above.
> 
> Let's start with the last one, pci_assign_unassigned_bus_resources, as
> it's the easiest to remove from users in drivers/pci/controller/* (and
> replace with pci_assign_unassigned_root_bus_resources typically).
> 
> This leaves it used in a couple of corner cases, most of them I think
> I can kill, and .... sysfs 'rescan'.
> 
> The interesting thing about that function is that it tries to avoid
> resizing the bridge of the bus passed as an argument, it will only
> resize subordinate bridges. From the changelog it was created for
> hotplug bridges, but almost none uses it (some powerpc stuff I can
> probably kill) ... and sysfs rescan.
> 
> I wonder what's the remaining purpose of it. sysfs rescan could
> probably be cleaned up to use the two first... Also why avoid resizing
> the bridge itself ?
> 
> That leads to the difference between
> pci_assign_unassigned_root_bus_resources()
> and pci_assign_unassigned_bridge_resources().
> 
> The names are misleading. The former isn't just about the root bus
> resources. It's about the entire tree underneath the root bus.
> 
> The main difference that I can tell are:
> 
>  - pci_assign_unassigned_root_bus_resources() may or may not try to
> realloc, depending on a combination of command line args, config
> option, presence of IOV devices etc... while
> pci_assign_unassigned_bridge_resources() always will
> 
>  - pci_assign_unassigned_bridge_resources() will call
> pci_bridge_distribute_available_resources() to distribute resource to
> child hotplug bridges, while pci_assign_unassigned_root_bus_resources()
> won't.
>
> Now, are we 100% confident we want to keep those discrepancies ?
> 
> It feels like the former function is intended for boot time resource
> allocation, and the latter for hotplug, but I can't make sense of why
> the resources of a device behind a hotplug bridge should be allocated
> differently depending on whether that device was plugged at boot or
> plugged later.

I don't really know, but I kind of assumed reallocing any time but early
in boot would be dangerous. It involves un-assigning a bunch of
resources without any real check to see if a driver is using them or
not. If they were being used by a driver (which is typical) and they
were reassigned, everything would break.

I mean, in theory the code could/should be the same for both paths and
it could just make a single, better decision on whether to realloc or
not. But that's going to be challenging to get there.
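If the two paths were ever unified, the realloc choice might funnel through one shared helper. A rough userspace sketch of that idea (all names, flags, and the decision criteria are invented here for illustration, not actual kernel API):

```c
#include <stdbool.h>

enum realloc_policy { REALLOC_AUTO, REALLOC_ON, REALLOC_OFF };

/*
 * Hypothetical single decision point that both the boot-time and
 * hotplug paths could share, instead of hard-coding different
 * behaviour into pci_assign_unassigned_root_bus_resources() and
 * pci_assign_unassigned_bridge_resources().
 */
bool should_realloc(bool early_boot, bool has_sriov,
		    enum realloc_policy cmdline)
{
	if (cmdline == REALLOC_ON)
		return true;
	if (cmdline == REALLOC_OFF)
		return false;
	/*
	 * AUTO: reallocating once drivers are bound is dangerous,
	 * because in-use BARs would move underneath them; only realloc
	 * early in boot, or when SR-IOV resources force the issue.
	 */
	return early_boot || has_sriov;
}
```

The point is only that the policy becomes one auditable function rather than behaviour implied by which entry point a controller driver happened to call.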

> Also why not distribute available resources at boot between top level
> hotplug bridges ?
>
> I'm not even going into the question of why the resource
> sizing/assignment code is so obscure/cryptic/incomprehensible, that's
> another kettle of fish, but I'd like to at least clarify the usage
> patterns a bit better.

I got the impression the code was designed to generally let the firmware
set things up -- it just fixed things up if the firmware messed it up
somehow. My guess would be it evolved out of a bunch of hacks designed
to fix broken BIOSes into something new platforms used to do full
enumeration (because it happened to work).

Logan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Multitude of resource assignment functions
  2019-06-24 16:45           ` Logan Gunthorpe
@ 2019-06-27  7:40             ` Nicholas Johnson
  2019-06-27  8:48               ` Benjamin Herrenschmidt
  2019-06-27 16:35               ` Logan Gunthorpe
  0 siblings, 2 replies; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-27  7:40 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: Benjamin Herrenschmidt, Bjorn Helgaas, linux-pci

On Mon, Jun 24, 2019 at 10:45:17AM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2019-06-24 3:13 a.m., Benjamin Herrenschmidt wrote:
> > So I'm staring at these three mostly at this point:
> > 
> > void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
> > void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> > void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> > 
> > Now we have 3 functions that fundamentally have the same purpose,
> > assign what was left unassigned down a PCI hierarchy, but are going
> > about it in quite a different manner.
> > 
> > Now to make things worse, there's little consistency in which one gets
> > called where. We have PCI controllers calling the first one sometimes,
> > the last one sometimes, or doing the manual:
> > 
> > 	pci_bus_size_bridges(bus);
> > 	pci_bus_assign_resources(bus);
> > 
> > Or variants with pci_bus_size_bridges sometimes missing etc...
> 
> I suspect there isn't much rhyme or reason to it. None of this is well
> documented so developers writing the controller drivers probably didn't
> have a good idea of what the correct thing to do was, and just stuck
> with the first thing that worked.
> 
> > Now I've consolidated a lot of that and removed all of those "manual"
> > cases in my work-in-progress branch, but I'd like to clarify and
> > possibly remove the 3 ones above.
> > 
> > Let's start with the last one, pci_assign_unassigned_bus_resources, as
> > it's the easiest to remove from users in drivers/pci/controller/* (and
> > replace with pci_assign_unassigned_root_bus_resources typically).
> > 
> > This leaves it used in a couple of corner cases, most of them I think
> > I can kill, and .... sysfs 'rescan'.
> > 
> > The interesting thing about that function is that it tries to avoid
> > resizing the bridge of the bus passed as an argument, it will only
> > resize subordinate bridges. From the changelog it was created for
> > hotplug bridges, but almost none uses it (some powerpc stuff I can
> > probably kill) ... and sysfs rescan.
> > 
> > I wonder what's the remaining purpose of it. sysfs rescan could
> > probably be cleaned up to use the two first... Also why avoid resizing
> > the bridge itself ?
> > 
> > That leads to the difference between
> > pci_assign_unassigned_root_bus_resources()
> > and pci_assign_unassigned_bridge_resources().
> > 
> > The names are misleading. The former isn't just about the root bus
> > resources. It's about the entire tree underneath the root bus.
> > 
> > The main difference that I can tell are:
> > 
> >  - pci_assign_unassigned_root_bus_resources() may or may not try to
> > realloc, depending on a combination of command line args, config
> > option, presence of IOV devices etc... while
> > pci_assign_unassigned_bridge_resources() always will
> > 
> >  - pci_assign_unassigned_bridge_resources() will call
> > pci_bridge_distribute_available_resources() to distribute resource to
> > child hotplug bridges, while pci_assign_unassigned_root_bus_resources()
> > won't.
> >
> > Now, are we 100% confident we want to keep those discrepancies ?
> > 
> > It feels like the former function is intended for boot time resource
> > allocation, and the latter for hotplug, but I can't make sense of why
> > the resources of a device behind a hotplug bridge should be allocated
> > differently depending on whether that device was plugged at boot or
> > plugged later.
> 
> I don't really know, but I kind of assumed reallocing any time but early
> in boot would be dangerous. It involves un-assigning a bunch of
> resources without any real check to see if a driver is using them or
> not. If they were being used by a driver (which is typical) and they
> were reassigned, everything would break.
> 
> I mean, in theory the code could/should be the same for both paths and
> it could just make a single, better decision on whether to realloc or
> not. But that's going to be challenging to get there.
> 
> > Also why not distribute available resources at boot between top level
> > hotplug bridges ?
> >
> > I'm not even going into the question of why the resource
> > sizing/assignment code is so obscure/cryptic/incomprehensible, that's
> > another kettle of fish, but I'd like to at least clarify the usage
> > patterns a bit better.
> I got the impression the code was designed to generally let the firmware
> set things up -- it just fixed things up if the firmware messed it up
> somehow. My guess would be it evolved out of a bunch of hacks designed
> to fix broken bioses into something new platforms used to do full
> enumeration (because it happened to work).

Unfortunately, the operating system is designed to let the firmware do 
things. In my mind, ACPI should not need to exist, and the operating 
system should start with a clean state with PCI and re-enumerate 
everything at boot time. The PCI allocation is so broken and 
inconsistent (as you have noted) because it tries to combine the two, 
when firmware enumeration and native enumeration should be mutually 
exclusive. I have attempted to re-write large chunks of probe.c, pci.c 
and setup-bus.c to completely disregard firmware enumeration and clean 
everything up. Unfortunately, I get stuck in probe.c with the double 
recursive loop which assigns bus numbers - I cannot figure out how to 
re-write it successfully. Plus, I feel like nobody will be ready for 
such a drastic change - I am having trouble selling minor changes that 
fix actual use cases, as opposed to code reworking.

My next proposal might be a kernel parameter for PCI to set various 
levels of disregard for firmware, from none to complete, which can be 
added to incrementally to do more and more (rather than all in one patch 
series). This could supersede pci=realloc. The realloc option is so 
broken because once the system has loaded drivers, it becomes next to 
impossible to free and reallocate a resource to fit in another device 
without upsetting existing devices. The realloc option is only useful 
in early boot because nothing is yet assigned, so it works. However, 
the same effect can be achieved by releasing all the resources on the 
root port before anything happens. I think it was 
pci_assign_unassigned_resources(), and I did verify this experimentally. 
This switch could be part of such a new kernel parameter to ignore 
firmware influence on PCI.
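A tiered parameter could map a cmdline string to an escalating level of firmware disregard, for instance (purely illustrative; the parameter name and level names here are made up, not an existing kernel option):

```c
#include <string.h>

/*
 * Illustrative parser for a hypothetical "pci=fwlevel=" parameter:
 *   0 = honour firmware assignments (today's default behaviour),
 *   1 = release and reassign everything below hotplug bridges,
 *   2 = ignore firmware completely and re-enumerate from scratch.
 * Returns -1 for an unrecognised value.
 */
int parse_fwlevel(const char *arg)
{
	if (strcmp(arg, "none") == 0)
		return 0;
	if (strcmp(arg, "bridges") == 0)
		return 1;
	if (strcmp(arg, "all") == 0)
		return 2;
	return -1;
}
```

Each level could then be wired up incrementally in separate series, with level 2 subsuming what pci=realloc tries to do today.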

I hope that somehow we can transition to ignoring the firmware - because 
firmware and native enumeration need to be mutually exclusive, and we 
need native enumeration for PCI hotplug. If anybody has any ideas how, I 
would love to hear them.

Nicholas

> 
> Logan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-19 16:21 ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Logan Gunthorpe
  2019-06-20  0:44   ` Nicholas Johnson
@ 2019-06-27  7:50   ` Nicholas Johnson
  2019-06-27 16:54     ` Logan Gunthorpe
  1 sibling, 1 reply; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-27  7:50 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: benh, Bjorn Helgaas, linux-pci

On Wed, Jun 19, 2019 at 10:21:21AM -0600, Logan Gunthorpe wrote:
> *(cc'd back Bjorn and the list)
> 
> On 2019-06-19 8:00 a.m., Nicholas Johnson wrote:
> > Hi Ben and Logan,
> > 
> > It looks like my git send-email has been not working correctly since I
> > started trying to get these patches accepted. I may have remedied this
> > now, but I have seen that Logan tried to find these patches and failed.
> > So as a courtesy until I post PATCH v7 (hopefully correctly, this time),
> > I am forwarding you the patches. I hope you like them. I would love to 
> > know of any concerns or questions you may have, and / or what happens if 
> > you test them. Thanks and all the best!
> > 
> > ----- Forwarded message from Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> -----
> > 
> > Date: Thu, 23 May 2019 06:29:27 +0800
> > From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> > To: linux-kernel@vger.kernel.org
> > Cc: linux-pci@vger.kernel.org, bhelgaas@google.com, mika.westerberg@linux.intel.com, corbet@lwn.net, Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
> > Subject: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window
> > X-Mailer: git-send-email 2.19.1
> > 
> > Background
> > ==========================================================================
> > 
> > Solve bug report:
> > https://bugzilla.kernel.org/show_bug.cgi?id=203243
> 
> This is all kinds of confusing... the bug report just seems to be a copy
> of the patch set. The description of the actual symptoms of the problem
> appears to be missing from all of it.
> 
> > Currently, the kernel can sometimes assign the MMIO_PREF window
> > additional size into the MMIO window, resulting in double the MMIO
> > additional size, even if the MMIO_PREF window was successful.
> > 
> > This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
> > fails. In the next pass, because MMIO_PREF is already assigned, the
> > attempt to assign MMIO_PREF returns an error code instead of success
> > (nothing more to do, already allocated).
> > 
> > Example of problem (more context can be found in the bug report URL):
> > 
> > Mainline kernel:
> > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
> > pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
> > 
> > Patched kernel:
> > pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
> > pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
> > 
> > This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
> > with the same configuration, with a Ubuntu mainline kernel and a kernel
> > patched with this patch series.
> > 
> > This patch is vital for the next patch in the series. The next patch
> > allows the user to specify MMIO and MMIO_PREF independently. If the
> > MMIO_PREF is set to be very large, this bug will end up more than
> > doubling the MMIO size. The bug results in the MMIO_PREF being added to
> > the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
> > With a large MMIO_PREF, without this patch, the MMIO window will likely
> > fail to be assigned altogether due to lack of 32-bit address space.
> > 
> > Patch notes
> > ==========================================================================
> > 
> > Change find_free_bus_resource() to not skip assigned resources with
> > non-null parent.
> > 
> > Add checks in pbus_size_io() and pbus_size_mem() to return success if
> > resource returned from find_free_bus_resource() is already allocated.
> > 
> > This avoids pbus_size_io() and pbus_size_mem() returning error code to
> > __pci_bus_size_bridges() when a resource has been successfully assigned
> > in a previous pass. This fixes the existing behaviour where space for a
> > resource could be reserved multiple times in different parent bridge
> > windows. This also greatly reduces the number of failed BAR messages in
> > dmesg when Linux assigns resources.
> 
> This patch looks like the same bug that I tracked down earlier but I
> solved in a slightly different way. See this patch[1] which is still
> under review. Can you maybe test it and see if it solves the same problem?
> 
> Thanks,
> 
> Logan
> 
> [1]
> https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u
[1] says Reported-by: Kit Chow, but I cannot find the bug report on 
bugzilla.kernel.org - should I be linking the bug reports into my 
version of this patch in case it is accepted?

Bjorn never replied to my queries about which should be accepted and 
what I should do either way. For now I am moving my version of this 
patch to the end of my series so that it can easily be knocked off if 
Bjorn prefers your patch.
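For what it is worth, the arithmetic of the bug in the quoted patch description can be modelled in a few lines (a userspace toy model, not the actual setup-bus.c code; 128 stands for hpmemsize=128M):

```c
#include <stdbool.h>

/*
 * Toy model of the two-pass bridge sizing described above. In pass 1
 * the MMIO_PREF window is assigned but the MMIO window fails. In pass
 * 2 the already-assigned MMIO_PREF window is sized again: if that
 * sizing call reports an error ("nothing to do, already allocated"),
 * the caller folds the prefetchable additional size into the MMIO
 * window as well, doubling it.
 */
int final_mmio_size(int hpmemsize, bool pref_sizing_reports_success)
{
	int mmio = hpmemsize;		/* MMIO additional size */

	if (!pref_sizing_reports_success)
		mmio += hpmemsize;	/* MMIO_PREF counted a second time */

	return mmio;
}
```

With hpmemsize=128M this reproduces the 256M versus 128M BAR 14 windows shown in the quoted dmesg excerpts.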

Cheers

Nicholas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Multitude of resource assignment functions
  2019-06-27  7:40             ` Nicholas Johnson
@ 2019-06-27  8:48               ` Benjamin Herrenschmidt
  2019-06-30  2:40                 ` Nicholas Johnson
  2019-06-27 16:35               ` Logan Gunthorpe
  1 sibling, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-27  8:48 UTC (permalink / raw)
  To: Nicholas Johnson, Logan Gunthorpe; +Cc: Bjorn Helgaas, linux-pci

On Thu, 2019-06-27 at 07:40 +0000, Nicholas Johnson wrote:
> Unfortunately, the operating system is designed to let the firmware do 
> things. In my mind, ACPI should not need to exist, and the operating 
> system should start with a clean state with PCI and re-enumerate 
> everything at boot time. The PCI allocation is so broken and 
> inconsistent (as you have noted) because it tries to combine the two, 
> when firmware enumeration and native enumeration should be mutually 
> exclusive. I have attempted to re-write large chunks of probe.c, pci.c 
> and setup-bus.c to completely disregard firmware enumeration and clean 
> everything up. Unfortunately, I get stuck in probe.c with the double 
> recursive loop which assigns bus numbers - I cannot figure out how to 
> re-write it successfully. Plus, I feel like nobody will be ready for 
> such a drastic change - I am having trouble selling minor changes that 
> fix actual use cases, as opposed to code reworking.

Well... so a lot of platforms are happy to do a full re-assignment,
though they use the current code today, which leads to rather
substandard results when it comes to hotplug bridges.

All the embedded platforms today are like that, and so is all of ARM64,
though the latter will somewhat change; all DT-based ARM64 will probably
remain that way.

> My next proposal might be a kernel parameter for PCI to set various 
> levels of disregard for firmware

Well, at least ACPI has this _DSM #5 thingy that can tell us that we
are allowed to disregard firmware for selected bits and pieces
(hopefully that tends to be whole hierarchies but I don't know how well
it's used in practice).

> , from none to complete, which can be 
> added to incrementally to do more and more (rather than all in one patch 
> series).

So there are a number of reasons to honor what the firmware did.

First, today (but that's fixable), we suck at setting up reasonable
space for hotplug by default.

But there are more insidious ones. There are platforms where you can't
move things (typically virtualized platforms with specific hypervisors,
such as IBM pseries).

There are platforms where the *runtime* firmware (SMM or equivalent or
even ACPI AML bits) will be poking at some system devices and those
really must not be moved. (In fact there's a theoretical problem with
such devices becoming temporarily inaccessible during BAR sizing today
but we mostly get lucky).

There are other "interesting" cases, like EFI giving us the framebuffer
address to use if we don't have a native driver... which happens to be
off a PCI BAR somewhere. Now we *could* probably try to special case
that and detect when we move that BAR but today we'll probably break if
we move it.

x86 historically has other nasty "hidden" devices. There are historical
cases of devices that break if they move after initial setup, etc...
Most of these things are ancient but we have to ensure we keep today's
policy for old platforms at least.

>  This can supercede pci=realloc. The realloc command is so 
> broken because once the system has loaded drivers, it becomes next to 
> impossible to free and reallocate a resource to fit another device in - 
> because it will upset existing devices. The realloc command is only 
> useful in early boot because nothing is yet assigned, so it works. 
> However, the same effect can be achieved by releasing all the resources 
> on the root port before anything happens. I think it was 
> pci_assign_unassigned_resources(), and I did verify this experimentally. 
> This switch could be part of such a new kernel parameter to ignore 
> firmware influence on PCI.

We should see what ACPI gives us in _DSM #5 on x86 these days.. if it's
meaningful on enough machines we could use that as an indication that a
given tree can be reallocated.

> I hope that somehow we can transition to ignoring the firmware - because 
> firmware and native enumeration need to be mutually exclusive, and we 
> need native enumeration for PCI hotplug. If anybody has any ideas how, I 
> would love to hear.

We'll probably have to live with an "in-between" forever on x86 and
maybe arm64, but with some luck, the static devices will only be the
on-board stuff, and we can go wild below bridges...

BTW: I'd like us to discuss that f2f at Plumbers in a miniconf if
enough of us can go.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Multitude of resource assignment functions
  2019-06-27  7:40             ` Nicholas Johnson
  2019-06-27  8:48               ` Benjamin Herrenschmidt
@ 2019-06-27 16:35               ` Logan Gunthorpe
  2019-06-27 20:26                 ` Benjamin Herrenschmidt
  2019-06-30  2:57                 ` Nicholas Johnson
  1 sibling, 2 replies; 21+ messages in thread
From: Logan Gunthorpe @ 2019-06-27 16:35 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: Benjamin Herrenschmidt, Bjorn Helgaas, linux-pci



On 2019-06-27 1:40 a.m., Nicholas Johnson wrote:
> On Mon, Jun 24, 2019 at 10:45:17AM -0600, Logan Gunthorpe wrote:
>>
>>
>> On 2019-06-24 3:13 a.m., Benjamin Herrenschmidt wrote:
>>> So I'm staring at these three mostly at this point:
>>>
>>> void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
>>> void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>>> void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
>>>
>>> Now we have 3 functions that fundamentally have the same purpose,
>>> assign what was left unassigned down a PCI hierarchy, but are going
>>> about it in quite a different manner.
>>>
>>> Now to make things worse, there's little consistency in which one gets
>>> called where. We have PCI controllers calling the first one sometimes,
>>> the last one sometimes, or doing the manual:
>>>
>>> 	pci_bus_size_bridges(bus);
>>> 	pci_bus_assign_resources(bus);
>>>
>>> Or variants with pci_bus_size_bridges sometimes missing etc...
>>
>> I suspect there isn't much rhyme or reason to it. None of this is well
>> documented so developers writing the controller drivers probably didn't
>> have a good idea of what the correct thing to do was, and just stuck
>> with the first thing that worked.
>>
>>> Now I've consolidated a lot of that and removed all of those "manual"
>>> cases in my work-in-progress branch, but I'd like to clarify and
>>> possibly remove the 3 ones above.
>>>
>>> Let's start with the last one, pci_assign_unassigned_bus_resources, as
>>> it's the easiest to remove from users in drivers/pci/controller/* (and
>>> replace with pci_assign_unassigned_root_bus_resources typically).
>>>
>>> This leaves it used in a couple of corner cases, most of them I think
>>> I can kill, and .... sysfs 'rescan'.
>>>
>>> The interesting thing about that function is that it tries to avoid
>>> resizing the bridge of the bus passed as an argument, it will only
>>> resize subordinate bridges. From the changelog it was created for
>>> hotplug bridges, but almost none uses it (some powerpc stuff I can
>>> probably kill) ... and sysfs rescan.
>>>
>>> I wonder what's the remaining purpose of it. sysfs rescan could
>>> probably be cleaned up to use the two first... Also why avoid resizing
>>> the bridge itself ?
>>>
>>> That leads to the difference between
>>> pci_assign_unassigned_root_bus_resources()
>>> and pci_assign_unassigned_bridge_resources().
>>>
>>> The names are misleading. The former isn't just about the root bus
>>> resources. It's about the entire tree underneath the root bus.
>>>
>>> The main difference that I can tell are:
>>>
>>>  - pci_assign_unassigned_root_bus_resources() may or may not try to
>>> realloc, depending on a combination of command line args, config
>>> option, presence of IOV devices etc... while
>>> pci_assign_unassigned_bridge_resources() always will
>>>
>>>  - pci_assign_unassigned_bridge_resources() will call
>>> pci_bridge_distribute_available_resources() to distribute resource to
>>> child hotplug bridges, while pci_assign_unassigned_root_bus_resources()
>>> won't.
>>>
>>> Now, are we 100% confident we want to keep those discrepancies ?
>>>
>>> It feels like the former function is intended for boot time resource
>>> allocation, and the latter for hotplug, but I can't make sense of why
>>> the resources of a device behind a hotplug bridge should be allocated
>>> differently depending on whether that device was plugged at boot or
>>> plugged later.
>>
>> I don't really know, but I kind of assumed reallocing any time but early
>> in boot would be dangerous. It involves un-assigning a bunch of
>> resources without any real check to see if a driver is using them or
>> not. If they were being used by a driver (which is typical) and they
>> were reassigned, everything would break.
>>
>> I mean, in theory the code could/should be the same for both paths and
>> it could just make a single, better decision on whether to realloc or
>> not. But that's going to be challenging to get there.
>>
>>> Also why not distribute available resources at boot between top level
>>> hotplug bridges ?
>>>
>>> I'm not even going into the question of why the resource
>>> sizing/assignment code is so obscure/cryptic/incomprehensible, that's
>>> another kettle of fish, but I'd like to at least clarify the usage
>>> patterns a bit better.
>> I got the impression the code was designed to generally let the firmware
>> set things up -- it just fixed things up if the firmware messed it up
>> somehow. My guess would be it evolved out of a bunch of hacks designed
>> to fix broken bioses into something new platforms used to do full
>> enumeration (because it happened to work).
> Unfortunately, the operating system is designed to let the firmware do 
> things. In my mind, ACPI should not need to exist, and the operating 
> system should start with a clean state with PCI and re-enumerate 
> everything at boot time. The PCI allocation is so broken and 
> inconsistent (as you have noted) because it tries to combine the two, 
> when firmware enumeration and native enumeration should be mutually 
> exclusive. I have attempted to re-write large chunks of probe.c, pci.c 
> and setup-bus.c to completely disregard firmware enumeration and clean 
> everything up. Unfortunately, I get stuck in probe.c with the double 
> recursive loop which assigns bus numbers - I cannot figure out how to 
> re-write it successfully. Plus, I feel like nobody will be ready for 
> such a drastic change - I am having trouble selling minor changes that 
> fix actual use cases, as opposed to code reworking.

My worry would be if the firmware depends on any of those PCI resources
for any of its calls. For example, laptop firmware often has specific
code for screen blanking/dimming when the special buttons are pressed.
If it implements this by communicating with a PCI device then the kernel
will break things by reassigning all the addresses.

However, having a kernel parameter to ignore the firmware choices might
be a good way for us to start testing whether this is a problem or not
on some systems.

Logan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window]
  2019-06-27  7:50   ` Nicholas Johnson
@ 2019-06-27 16:54     ` Logan Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Logan Gunthorpe @ 2019-06-27 16:54 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: benh, Bjorn Helgaas, linux-pci



On 2019-06-27 1:50 a.m., Nicholas Johnson wrote:
> On Wed, Jun 19, 2019 at 10:21:21AM -0600, Logan Gunthorpe wrote:
>> *(cc'd back Bjorn and the list)
>>
>> On 2019-06-19 8:00 a.m., Nicholas Johnson wrote:
>>> Hi Ben and Logan,
>>>
>>> It looks like my git send-email has been not working correctly since I
>>> started trying to get these patches accepted. I may have remedied this
>>> now, but I have seen that Logan tried to find these patches and failed.
>>> So as a courtesy until I post PATCH v7 (hopefully correctly, this time),
>>> I am forwarding you the patches. I hope you like them. I would love to 
>>> know of any concerns or questions you may have, and / or what happens if 
>>> you test them. Thanks and all the best!
>>>
>>> ----- Forwarded message from Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> -----
>>>
>>> Date: Thu, 23 May 2019 06:29:27 +0800
>>> From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
>>> To: linux-kernel@vger.kernel.org
>>> Cc: linux-pci@vger.kernel.org, bhelgaas@google.com, mika.westerberg@linux.intel.com, corbet@lwn.net, Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
>>> Subject: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window
>>> X-Mailer: git-send-email 2.19.1
>>>
>>> Background
>>> ==========================================================================
>>>
>>> Solve bug report:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=203243
>>
>> This is all kinds of confusing... the bug report just seems to be a copy
>> of the patch set. The description of the actual symptoms of the problem
>> appears to be missing from all of it.
>>
>>> Currently, the kernel can sometimes assign the MMIO_PREF window
>>> additional size into the MMIO window, resulting in double the MMIO
>>> additional size, even if the MMIO_PREF window was successful.
>>>
>>> This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
>>> fails. In the next pass, because MMIO_PREF is already assigned, the
>>> attempt to assign MMIO_PREF returns an error code instead of success
>>> (nothing more to do, already allocated).
>>>
>>> Example of problem (more context can be found in the bug report URL):
>>>
>>> Mainline kernel:
>>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
>>> pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
>>>
>>> Patched kernel:
>>> pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
>>> pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
>>>
>>> This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
>>> with the same configuration, with a Ubuntu mainline kernel and a kernel
>>> patched with this patch series.
>>>
>>> This patch is vital for the next patch in the series. The next patch
>>> allows the user to specify MMIO and MMIO_PREF independently. If the
>>> MMIO_PREF is set to be very large, this bug will end up more than
>>> doubling the MMIO size. The bug results in the MMIO_PREF being added to
>>> the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
>>> With a large MMIO_PREF, without this patch, the MMIO window will likely
>>> fail to be assigned altogether due to lack of 32-bit address space.
>>>
>>> Patch notes
>>> ==========================================================================
>>>
>>> Change find_free_bus_resource() to not skip assigned resources with
>>> non-null parent.
>>>
>>> Add checks in pbus_size_io() and pbus_size_mem() to return success if
>>> resource returned from find_free_bus_resource() is already allocated.
>>>
>>> This avoids pbus_size_io() and pbus_size_mem() returning error code to
>>> __pci_bus_size_bridges() when a resource has been successfully assigned
>>> in a previous pass. This fixes the existing behaviour where space for a
>>> resource could be reserved multiple times in different parent bridge
>>> windows. This also greatly reduces the number of failed BAR messages in
>>> dmesg when Linux assigns resources.
>>
>> This patch looks like the same bug that I tracked down earlier but I
>> solved in a slightly different way. See this patch[1] which is still
>> under review. Can you maybe test it and see if it solves the same problem?
>>
>> Thanks,
>>
>> Logan
>>
>> [1]
>> https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u
> [1] says Reported-by: Kit Chow, but I cannot find the bug report on 
> bugzilla.kernel.org - should I be linking the bug reports into my 
> version of this patch in case it is accepted?

Reported-by doesn't indicate that a bug report is on bugzilla. In fact I
don't think too many people rely on bugzilla. It's more of a place where
triage is done to send reporters to the appropriate mailing lists.

> Bjorn never replied to my queries about which should be accepted and 
> what I should do either way. For now I am moving my version of this 
> patch to the end of my series so that it can easily be knocked off if 
> Bjorn prefers your patch.

I'm sure he'll get to it eventually. Probably just seeing where some of
the discussion is leading.

Logan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Multitude of resource assignment functions
  2019-06-27 16:35               ` Logan Gunthorpe
@ 2019-06-27 20:26                 ` Benjamin Herrenschmidt
  2019-06-30  2:57                 ` Nicholas Johnson
  1 sibling, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-27 20:26 UTC (permalink / raw)
  To: Logan Gunthorpe, Nicholas Johnson; +Cc: Bjorn Helgaas, linux-pci

On Thu, 2019-06-27 at 10:35 -0600, Logan Gunthorpe wrote:
> My worry would be if the firmware depends on any of those PCI resources
> for any of its calls. For example, laptop firmware often has specific
> code for screen blanking/dimming when the special buttons are pressed.
> If it implements this by communicating with a PCI device then the kernel
> will break things by reassigning all the addresses.
> 
> However, having a kernel parameter to ignore the firmware choices might
> be a good way for us to start testing whether this is a problem or not
> on some systems

As I consolidate that across archs I can add such a parameter... I
haven't quite folded x86 in yet, but I'm hoping I'll be able to do so
soon. I plan to move some of those x86-specific kernel parameters into
generic code while doing so. I can add this one.

Cheers,
Ben.

* Re: Multitude of resource assignment functions
  2019-06-27  8:48               ` Benjamin Herrenschmidt
@ 2019-06-30  2:40                 ` Nicholas Johnson
  0 siblings, 0 replies; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-30  2:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Logan Gunthorpe, Bjorn Helgaas, linux-pci

Thank you for the reply. I have been mulling it over for a while.

On Thu, Jun 27, 2019 at 06:48:35PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2019-06-27 at 07:40 +0000, Nicholas Johnson wrote:
> > Unfortunately, the operating system is designed to let the firmware do 
> > things. In my mind, ACPI should not need to exist, and the operating 
> > system should start with a clean state with PCI and re-enumerate 
> > everything at boot time. The PCI allocation is so broken and 
> > inconsistent (as you have noted) because it tries to combine the two, 
> > when firmware enumeration and native enumeration should be mutually 
> > exclusive. I have attempted to re-write large chunks of probe.c, pci.c 
> > and setup-bus.c to completely disregard firmware enumeration and clean 
> > everything up. Unfortunately, I get stuck in probe.c with the double 
> > recursive loop which assigns bus numbers - I cannot figure out how to 
> > re-write it successfully. Plus, I feel like nobody will be ready for 
> > such a drastic change - I am having trouble selling minor changes that 
> > fix actual use cases, as opposed to code reworking.
> 
> Well... so a lot of platforms are happy to do a full re-assignment,
> though they use the current code today, which leads to rather
> substandard results when it comes to hotplug bridges.
> 
> All the embedded platforms today are like that, and all of ARM64, though
> the latter will somewhat change; all DT-based ARM64 will probably
> remain that way.
> 
> > My next proposal might be a kernel parameter for PCI to set various 
> > levels of disregard for firmware
> 
> Well, at least ACPI has this _DSM #5 thingy that can tell us that we
> are allowed to disregard firmware for selected bits and pieces
> (hopefully that tends to be whole hierarchies but I don't know how well
> it's used in practice).
I will need to find out more about this - can you suggest any 
particularly good resources on learning about ACPI?

> 
> > , from none to complete, which can be 
> > added to incrementally to do more and more (rather than all in one patch 
> > series).
> 
> So there are a number of reasons to honor what the firmware did.
> 
> First, today (but that's fixable), we suck at setting up reasonable
> space for hotplug by default.
What annoys me more is that the BIOS vendors:

a) Don't provide a means to configure this in the BIOS, and if they do, 
it is hidden options which require you to re-flash the BIOS or use the 
dumped IFRs and an EFI shell to modify the variables.

b) Even the few motherboards with the Thunderbolt options available 
without resorting to (a) have them limited to 4096M.

c) Motherboards are still cramming us into the 32-bit address space in 
case somebody is still using a 32-bit OS. There is the "above 4G 
decoding" option available on most motherboards, but I am not sure 
whether it completely fixes the issue. Given that Microsoft said you 
need Windows 10 to run on the latest hardware, I do not see many people 
using a 32-bit OS on the latest hardware.

d) These options are especially needed because Windows cannot override 
anything whatsoever - not even _OSC, the way pcie_ports=native can on 
Linux.

> 
> But there are more insidious ones. There are platforms where you can't
> move things (typically virtualized platforms with specific hypervisors,
> such as IBM pseries).
I cannot argue with this.

> 
> There are platforms where the *runtime* firmware (SMM or equivalent or
> even ACPI AML bits) will be poking at some system devices and those
> really must not be moved. (In fact there's a theoretical problem with
> such devices becoming temporarily inaccessible during BAR sizing today
> but we mostly get lucky).
I think SMM is a nasty back door. Unfortunately the precedent set is 
that the firmware makers can do what they want and we are expected to 
honour that in the kernel. In an ideal world, it would default to the OS 
assigning things, and the firmware vendors would get the blame when 
things break if they insist on using runtime firmware.

In my ideal world, motherboards would have the absolute bare minimum in 
BIOS to initialise DRAM and the tricky stuff, and then boot a CoreBoot 
Linux kernel off a MicroSD slot on the board. This could easily be 
updated constantly (for example, to add NVMe support to old boards) and 
it would be impossible to brick the motherboard by changing this, as the 
SD card could be removed and restored.

This would fix the following:
- No further need for PCI option ROMs and their security issues
- Open-source / free firmware
- No firmware updates needed to add NVMe boot support
- Allow the target OS booted with kexec to assign resources as required
- Set up the IOMMU for Thunderbolt (and all DMA ports) at boot time 
without special BIOS updates required
- Etc.

I am sure there are problems with what I am saying, but I do find it 
frustrating how unable the industry is to move on from legacy.

When you have an arch, you expect that the same bytecode will run on the 
next system with that same arch. I don't understand why it stops there - 
I believe two systems of the same arch should be indistinguishable, 
without all of the firmware differences, and I hope to influence this 
during my career.

> 
> There are other "interesting" cases, like EFI giving us the framebuffer
> address to use if we don't have a native driver... which happens to be
> off a PCI BAR somewhere. Now we *could* probably try to special case
> that and detect when we move that BAR but today we'll probably break if
> we move it.
Also fixed by CoreBoot, which would carry the Linux kernel and all the 
drivers - no need for legacy services like this.

> 
> x86 historically has other nasty "hidden" devices. There are historical
> cases of devices that break if they move after initial setup, etc...
> Most of these things are ancient but we have to ensure we keep today's
> policy for old platforms at least.
Sometimes I think that we need a fork of Linux. Although that would be 
the same as saying "for old systems, support ends on this kernel version 
and you are unlikely to need the new features of the latest kernels on 
the oldest hardware". They did drop support for some older x86 CPUs 
recently, I believe.

> 
> >  This can supercede pci=realloc. The realloc command is so 
> > broken because once the system has loaded drivers, it becomes next to 
> > impossible to free and reallocate a resource to fit another device in - 
> > because it will upset existing devices. The realloc command is only 
> > useful in early boot because nothing is yet assigned, so it works. 
> > However, the same effect can be achieved by releasing all the resources 
> > on the root port before anything happens. I think it was 
> > pci_assign_unassigned_resources(), and I did verify this experimentally. 
> > This switch could be part of such a new kernel parameter to ignore 
> > firmware influence on PCI.
> 
> We should see what ACPI gives us in _DSM #5 on x86 these days.. if it's
> meaningful on enough machines we could use that as an indication that a
> given tree can be reallocated.
> 
> > I hope that somehow we can transition to ignoring the firmware - because 
> > firmware and native enumeration need to be mutually exclusive, and we 
> > need native enumeration for PCI hotplug. If anybody has any ideas how, I 
> > would love to hear.
> 
> We'll probably have to live with an "in-between" forever on x86 and
> maybe arm64, but with some luck, the static devices will only be the
> on-board stuff, and we can go wild below bridges...
The rest was just speculation and thoughts. My real question here is: 
what path do we have towards modernisation? We cannot replace the PCI 
code to handle everything natively and disregard the firmware for modern 
architectures like the emerging RISC-V, because that code will screw up 
x86. So do we have to have pci-old and pci-new subsystems which can be 
selected by each arch?

> 
> BTW: I'd like us to discuss that f2f at Plumbers in a miniconf if
> enough of us can go.
Please explain this as I have no idea what f2f, Plumbers and miniconf 
are.

Cheers,
Nicholas

> 
> Cheers,
> Ben.
> 
> 


* Re: Multitude of resource assignment functions
  2019-06-27 16:35               ` Logan Gunthorpe
  2019-06-27 20:26                 ` Benjamin Herrenschmidt
@ 2019-06-30  2:57                 ` Nicholas Johnson
  2019-07-01  4:33                   ` Oliver O'Halloran
  2019-07-02 21:39                   ` Bjorn Helgaas
  1 sibling, 2 replies; 21+ messages in thread
From: Nicholas Johnson @ 2019-06-30  2:57 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: Benjamin Herrenschmidt, Bjorn Helgaas, linux-pci

On Thu, Jun 27, 2019 at 10:35:12AM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2019-06-27 1:40 a.m., Nicholas Johnson wrote:
> > On Mon, Jun 24, 2019 at 10:45:17AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >> On 2019-06-24 3:13 a.m., Benjamin Herrenschmidt wrote:
> >>> So I'm staring at these three mostly at this point:
> >>>
> >>> void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
> >>> void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> >>> void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> >>>
> >>> Now we have 3 functions that fundamentally have the same purpose,
> >>> assign what was left unassigned down a PCI hierarchy, but are going
> >>> about it in quite a different manner.
> >>>
> >>> Now to make things worse, there's little consistency in which one gets
> >>> called where. We have PCI controllers calling the first one sometimes,
> >>> the last one sometimes, or doing the manual:
> >>>
> >>> 	pci_bus_size_bridges(bus);
> >>> 	pci_bus_assign_resources(bus);
> >>>
> >>> Or variants with pci_bus_size_bridges sometimes missing etc...
> >>
> >> I suspect there isn't much rhyme or reason to it. None of this is well
> >> documented so developers writing the controller drivers probably didn't
> >> have a good idea of what the correct thing to do was, and just stuck
> >> with the first thing that worked.
> >>
> >>> Now I've consolidated a lot of that and removed all of those "manual"
> >>> cases in my work-in-progress branch, but I'd like to clarify and
> >>> possibly remove the 3 ones above.
> >>>
> >>> Let's start with the last one, pci_assign_unassigned_bus_resources, as
> >>> it's the easiest to remove from users in drivers/pci/controller/* (and
> >>> replace with pci_assign_unassigned_root_bus_resources typically).
> >>>
> >>> This leaves it used in a couple of corner cases, most of them I think
> >>> I can kill, and .... sysfs 'rescan'.
> >>>
> >>> The interesting thing about that function is that it tries to avoid
> >>> resizing the bridge of the bus passed as an argument, it will only
> >>> resize subordinate bridges. From the changelog it was created for
> >>> hotplug bridges, but almost none uses it (some powerpc stuff I can
> >>> probably kill) ... and sysfs rescan.
> >>>
> >>> I wonder what's the remaining purpose of it. sysfs rescan could
> >>> probably be cleaned up to use the two first... Also why avoid resizing
> >>> the bridge itself ?
> >>>
> >>> That leads to the difference between
> >>> pci_assign_unassigned_root_bus_resources()
> >>> and pci_assign_unassigned_bridge_resources().
> >>>
> >>> The names are misleading. The former isn't just about the root bus
> >>> resources. It's about the entire tree underneath the root bus.
> >>>
> >>> The main difference that I can tell are:
> >>>
> >>>  - pci_assign_unassigned_root_bus_resources() may or may not try to
> >>> realloc, depending on a combination of command line args, config
> >>> option, presence of IOV devices etc... while
> >>> pci_assign_unassigned_bridge_resources() always will
> >>>
> >>>  - pci_assign_unassigned_bridge_resources() will call
> >>> pci_bridge_distribute_available_resources() to distribute resource to
> >>> child hotplug bridges, while pci_assign_unassigned_root_bus_resources()
> >>> won't.
> >>>
> >>> Now, are we 100% confident we want to keep those discrepancies ?
> >>>
> >>> It feels like the former function is intended for boot time resource
> >>> allocation, and the latter for hotplug, but I can't make sense of why
> >>> the resources of a device behind a hotplug bridge should be allocated
> >>> differently depending on whether that device was plugged at boot or
> >>> plugged later.
> >>
> >> I don't really know, but I kind of assumed reallocing any time but early
> >> in boot would be dangerous. It involves un-assigning a bunch of
> >> resources without any real check to see if a driver is using them or
> >> not. If they were being used by a driver (which is typical) and they
> >> were reassigned, everything would break.
> >>
> >> I mean, in theory the code could/should be the same for both paths and
> >> it could just make a single, better decision on whether to realloc or
> >> not. But that's going to be challenging to get there.
> >>
> >>> Also why not distribute available resources at boot between top level
> >>> hotplug bridges ?
> >>>
> >>> I'm not even going into the question of why the resource
> >>> sizing/assignment code is so obscure/cryptic/incomprehensible, that's
> >>> another kettle of fish, but I'd like to at least clarify the usage
> >>> patterns a bit better.
> >> I got the impression the code was designed to generally let the firmware
> >> set things up -- it just fixed things up if the firmware messed it up
> >> somehow. My guess would be it evolved out of a bunch of hacks designed
> >> to fix broken bioses into something new platforms used to do full
> >> enumeration (because it happened to work).
> > Unfortunately, the operating system is designed to let the firmware do 
> > things. In my mind, ACPI should not need to exist, and the operating 
> > system should start with a clean state with PCI and re-enumerate 
> > everything at boot time. The PCI allocation is so broken and 
> > inconsistent (as you have noted) because it tries to combine the two, 
> > when firmware enumeration and native enumeration should be mutually 
> > exclusive. I have attempted to re-write large chunks of probe.c, pci.c 
> > and setup-bus.c to completely disregard firmware enumeration and clean 
> > everything up. Unfortunately, I get stuck in probe.c with the double 
> > recursive loop which assigns bus numbers - I cannot figure out how to 
> > re-write it successfully. Plus, I feel like nobody will be ready for 
> > such a drastic change - I am having trouble selling minor changes that 
> > fix actual use cases, as opposed to code reworking.
> 
> My worry would be if the firmware depends on any of those PCI resources
> for any of its calls. For example, laptop firmware often has specific
> code for screen blanking/dimming when the special buttons are pressed.
> If it implements this by communicating with a PCI device then the kernel
> will break things by reassigning all the addresses.
> 
> However, having a kernel parameter to ignore the firmware choices might
> be a good way for us to start testing whether this is a problem or not
> on some systems
> 
> Logan
If Bjorn also agrees then I will give it a shot when I have finished 
with Thunderbolt.

Some other related thoughts:

- Should pci=noacpi imply pci=nocrs? It does not appear to, and I feel 
like it should, as CRS is part of ACPI and relates to PCI.

- Does anybody know why with pci=noacpi, you get dmesg warnings about 
cannot find PCI int A mapping - but they do not seem to cause the 
devices any issues in functioning? Is it because they are using MSI?

- Does pci=ignorefw sound good for a future proposal?

- Modern arches could give this option by default if they want 
everything done by the OS. Although this would not be nearly as nice as 
a code overhaul or branching out pci into pci-old and pci-new.

- Thunderbolt has given me a glimmer of hope. It used to be so tightly 
integrated into the system firmware and add-in cards were not even 
detectable without it (you need to hit up the pcie2tbt mailbox in the 
BIOS to wake the controller up, for a start). It would not even show 
without ACPI running. Now I can use pci=noacpi with this patch series 
and work happily with Thunderbolt.

- I have not given the iommu or intel_iommu parameters, but I am getting 
DMAR faults (probably because I am using pci=noacpi), even though 
normally the DMAR does not come on if you do not ask it to. Is there 
perhaps something recently added to do with Thunderbolt that is 
activating it? I understand that, regardless, DMAR does not work well 
without ACPI. The main two tables I care about are DMAR and MADT 
(multiple processors); otherwise, I would disable ACPI altogether.

Cheers,
Nicholas


* Re: Multitude of resource assignment functions
  2019-06-30  2:57                 ` Nicholas Johnson
@ 2019-07-01  4:33                   ` Oliver O'Halloran
  2019-07-02 21:39                   ` Bjorn Helgaas
  1 sibling, 0 replies; 21+ messages in thread
From: Oliver O'Halloran @ 2019-07-01  4:33 UTC (permalink / raw)
  To: Nicholas Johnson
  Cc: Logan Gunthorpe, Benjamin Herrenschmidt, Bjorn Helgaas, linux-pci

On Sun, Jun 30, 2019 at 12:58 PM Nicholas Johnson
<nicholas.johnson-opensource@outlook.com.au> wrote:
>
> *snip*
>
> Some other related thoughts:

> - Should pci=noacpi imply pci=nocrs? It does not appear to, and I feel
> like it should, as CRS is part of ACPI and relates to PCI.

Are you sure about that? There has been explicit support for CRS in
the PCIe Base spec since gen1.

> - Modern arches could give this option by default if they want
> everything done by the OS. Although this would not be nearly as nice as
> a code overhaul or branching out pci into pci-old and pci-new.

This is only really nice if you can use the new code 100% of the time.
If we're keeping around the old code then it's still going to be a
maintenance burden since there's no shortage of platforms with odd
firmware requirements.

> - I have not given iommu or intel_iommu parameters but I am getting DMAR
> faults (probably because I am using pci=noacpi) but normally the DMAR
> does not come on if you do not ask it to. Is there perhaps something
> recently added to do with Thunderbolt that is activating it? I
> understand that regardless, DMAR does not work well without ACPI. The
> main two I care about are DMAR and MADT (multiple processors) tables and
> otherwise, I would disable ACPI altogether.

IIRC the IOMMU can be forced on for anything behind a Thunderbolt
controller since external devices aren't necessarily trustworthy. You
might be hitting that.

>
> Cheers,
> Nicholas


* Re: Multitude of resource assignment functions
  2019-06-30  2:57                 ` Nicholas Johnson
  2019-07-01  4:33                   ` Oliver O'Halloran
@ 2019-07-02 21:39                   ` Bjorn Helgaas
  2019-07-03 13:43                     ` Nicholas Johnson
  1 sibling, 1 reply; 21+ messages in thread
From: Bjorn Helgaas @ 2019-07-02 21:39 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: Logan Gunthorpe, Benjamin Herrenschmidt, linux-pci

On Sun, Jun 30, 2019 at 02:57:37AM +0000, Nicholas Johnson wrote:

> - Should pci=noacpi imply pci=nocrs? It does not appear to, and I feel 
> like it should, as CRS is part of ACPI and relates to PCI.

"pci=noacpi" means "Do not use ACPI for IRQ routing or for PCI
scanning."

"pci=nocrs" means "Ignore PCI host bridge windows from ACPI."  If we
ignore _CRS, we have no idea what the PCI host bridge apertures are,
so we cannot allocate resources for devices on the root bus.

The "Do not use ACPI for ... PCI scanning" part indeed does suggest
that "pci=noacpi" could imply "pci=nocrs", but I don't think there's
anything to be gained by changing that now.

We probably *should* remove "or for PCI scanning" from the
documentation, because "pci=noacpi" only affects IRQs.

The only reason these exist at all is as a debugging aid to
temporarily work around issues in firmware or Linux until we can
develop a real fix or quirk that works without the user specifying a
kernel parameter.

> - Does anybody know why with pci=noacpi, you get dmesg warnings about 
> cannot find PCI int A mapping - but they do not seem to cause the 
> devices any issues in functioning? Is it because they are using MSI?

I doubt it.  I think you're just lucky.  In general the information
from _PRT and _CRS is essential for correct operation.

> - Does pci=ignorefw sound good for a future proposal?

No, at least not without more description of what this would
accomplish.

It sounds like you would want this to turn off _PRT, _CRS, and other
information from ACPI.  You may not like ACPI, but that information is
there for good reason, and if we didn't get it from ACPI we would have
to get it from somewhere else.

There is always "acpi=off" if you just don't want ACPI at all.

Bjorn


* Re: Multitude of resource assignment functions
  2019-07-02 21:39                   ` Bjorn Helgaas
@ 2019-07-03 13:43                     ` Nicholas Johnson
  2019-07-03 14:19                       ` Bjorn Helgaas
  2019-07-03 22:54                       ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 21+ messages in thread
From: Nicholas Johnson @ 2019-07-03 13:43 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Logan Gunthorpe, Benjamin Herrenschmidt, linux-pci

On Tue, Jul 02, 2019 at 04:39:51PM -0500, Bjorn Helgaas wrote:
> On Sun, Jun 30, 2019 at 02:57:37AM +0000, Nicholas Johnson wrote:
> 
> > - Should pci=noacpi imply pci=nocrs? It does not appear to, and I feel 
> > like it should, as CRS is part of ACPI and relates to PCI.
> 
> "pci=noacpi" means "Do not use ACPI for IRQ routing or for PCI
> scanning."
> 
> "pci=nocrs" means "Ignore PCI host bridge windows from ACPI."  If we
> ignore _CRS, we have no idea what the PCI host bridge apertures are,
> so we cannot allocate resources for devices on the root bus.
But I use pci=nocrs (it is non-negotiable for assigning massive 
MMIO_PREF with kernel parameters) and it does work. If I use pci=nocrs, 
then the whole physical address range of the CPU goes to the root 
complex (for example, 39 physical address bits on quad-core Intel 
gives 512G). I am guessing that the OS makes sure that when assigning root 
port windows, we do not clobber the physical RAM so that any RAM 
addresses pass straight through the root complex. I have never had funny 
crashes that would make me think I have clobbered the RAM with nocrs. If 
I push the limits then it fails to assign root port resources as 
expected. Usually I assign 64G size to each Thunderbolt port for total 
of 256G over four ports. It is total overkill but it gives me 
satisfaction to know that the firmware is definitely not in control and 
that if it is needed, it can be requested. For a production system, I 
would likely tone it down a little.
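The 512G figure is just address-width arithmetic; a quick check (the 39-bit width and the per-port sizes are taken from the paragraph above):

```python
GiB = 1 << 30

phys_space = 1 << 39          # 39 physical address bits
assert phys_space == 512 * GiB

per_port = 64 * GiB           # MMIO_PREF assigned per Thunderbolt port
total = 4 * per_port          # four ports
assert total == 256 * GiB
assert total < phys_space     # fits, with room left for RAM and other windows
print(total // GiB)           # 256
```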

> 
> The "Do not use ACPI for ... PCI scanning" part indeed does suggest
> that "pci=noacpi" could imply "pci=nocrs", but I don't think there's
> anything to be gained by changing that now.
> 
> We probably *should* remove "or for PCI scanning" from the
> documentation, because "pci=noacpi" only affects IRQs.
> 
> The only reason these exist at all is as a debugging aid to
> temporarily work around issues in firmware or Linux until we can
> develop a real fix or quirk that works without the user specifying a
> kernel parameter.
> 
> > - Does anybody know why with pci=noacpi, you get dmesg warnings about 
> > cannot find PCI int A mapping - but they do not seem to cause the 
> > devices any issues in functioning? Is it because they are using MSI?
> 
> I doubt it.  I think you're just lucky.  In general the information
> from _PRT and _CRS is essential for correct operation.
Strange, because there are dozens of these warnings on multiple 
computers and heaps of devices on Thunderbolt. If the BARs are assigned 
then they work, every time, no questions asked. Maybe this suggests that 
Thunderbolt is somehow exempt. Perhaps the controller has kept 
configuration from the firmware setup and everything behind it does not 
care.

> 
> > - Does pci=ignorefw sound good for a future proposal?
> 
> No, at least not without more description of what this would
> accomplish.
I have not given it much time and thought but basically it will be 
something that can be added to incrementally. I would start with it 
implying nocrs and releasing all root complex resources at boot before 
the initial scan. That way we can see if the particular platform cares 
if we do everything in the kernel.

> 
> It sounds like you would want this to turn off _PRT, _CRS, and other
> information from ACPI.  You may not like ACPI, but that information is
> there for good reason, and if we didn't get it from ACPI we would have
> to get it from somewhere else.
The nocrs is vital because the BIOS places pitiful space behind the root 
complex and assignment of large BARs will fail - hence why Xeon Phi 
coprocessors, with 8G or 16G BARs to map their whole RAM, are only 
supported on certain systems. I consider all BIOS / firmware to be 
broken at this time, especially with most still catering for a 32-bit OS 
that almost nobody uses. I know not everybody feels that way, but I am 
an idealist and aim to move things in the right direction.
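The arithmetic behind the Xeon Phi example (a sketch; only the 8G/16G BAR sizes come from the text above): a BAR at least as large as the entire 32-bit address space can never be placed below 4G, so firmware that only opens 32-bit windows cannot support such a device.

```python
GiB = 1 << 30

space_32bit = 4 * GiB             # everything addressable below 4G
bar_sizes = [8 * GiB, 16 * GiB]   # Xeon Phi aperture BARs mapping card RAM

for bar in bar_sizes:
    # BARs are naturally aligned and must fit entirely inside the bridge
    # window, so a BAR >= the whole 32-bit space cannot possibly fit
    assert bar >= space_32bit
print("neither BAR fits below 4G")
```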

I would accept ACPI if it were just a collection of tables, memory 
mapped like MMCONFIG. I know there are more complicated things that 
require bytecode to run (although I do assert my belief that it should 
be avoided if possible) but if the static tables were moved out of ACPI 
then in my mind, it would be progress.

Is there a reason why PCI SIG could not add a future extension where all 
of this information can be accessed with an extended MMCONFIG address 
range?

> 
> There is always "acpi=off" if you just don't want ACPI at all.
> 
> Bjorn
I am aware, and I will happily use that when there is a way to manually 
specify DMAR and MADT information. If you use acpi=off presently, you 
lose all but one CPU core and the use of the IOMMU. There used to be 
acpi=ht to disable ACPI for everything except for HyperThreading, but 
that was removed a long time ago - I do not know why.

The reason I often test like this is because it gives me reassurance 
that my code is not working by fluke on the particular system because of 
a firmware quirk. Also, Thunderbolt was deeply entrenched in ACPI 
before, so I am kind of over-compensating to make sure that there is no 
longer any unconditional dependency.

Nicholas


* Re: Multitude of resource assignment functions
  2019-07-03 13:43                     ` Nicholas Johnson
@ 2019-07-03 14:19                       ` Bjorn Helgaas
  2019-07-03 22:54                       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 21+ messages in thread
From: Bjorn Helgaas @ 2019-07-03 14:19 UTC (permalink / raw)
  To: Nicholas Johnson; +Cc: Logan Gunthorpe, Benjamin Herrenschmidt, linux-pci

On Wed, Jul 03, 2019 at 01:43:52PM +0000, Nicholas Johnson wrote:
> On Tue, Jul 02, 2019 at 04:39:51PM -0500, Bjorn Helgaas wrote:
> > On Sun, Jun 30, 2019 at 02:57:37AM +0000, Nicholas Johnson wrote:
> > 
> > > - Should pci=noacpi imply pci=nocrs? It does not appear to, and I feel 
> > > like it should, as CRS is part of ACPI and relates to PCI.
> > 
> > "pci=noacpi" means "Do not use ACPI for IRQ routing or for PCI
> > scanning."
> > 
> > "pci=nocrs" means "Ignore PCI host bridge windows from ACPI."  If we
> > ignore _CRS, we have no idea what the PCI host bridge apertures are,
> > so we cannot allocate resources for devices on the root bus.
>
> But I use pci=nocrs (it is non-negotiable for assigning massive 
> MMIO_PREF with kernel parameters) and it does work. If I use pci=nocrs, 
> then the whole physical address range of the CPU goes to the root 
> complex (for example, 39-bit physical address lines on quad-core Intel 
> is 512G). I am guessing that the OS makes sure that when assigning root 
> port windows, we do not clobber the physical RAM so that any RAM 
> addresses pass straight through the root complex. I have never had funny 
> crashes that would make me think I have clobbered the RAM with nocrs. If 
> I push the limits then it fails to assign root port resources as 
> expected. Usually I assign 64G size to each Thunderbolt port for total 
> of 256G over four ports. It is total overkill but it gives me 
> satisfaction to know that the firmware is definitely not in control and 
> that if it is needed, it can be requested. For a production system, I 
> would likely tone it down a little.

"pci=nocrs" happens to work on many machines, but the _CRS information
is definitely required on many others.  For example, on any machine
with multiple host bridges, we need to know the actual host bridge
apertures to correctly assign resources to hot-added devices.

> > The "Do not use ACPI for ... PCI scanning" part indeed does suggest
> > that "pci=noacpi" could imply "pci=nocrs", but I don't think there's
> > anything to be gained by changing that now.
> > 
> > We probably *should* remove "or for PCI scanning" from the
> > documentation, because "pci=noacpi" only affects IRQs.
> > 
> > The only reason these exist at all is as a debugging aid to
> > temporarily work around issues in firmware or Linux until we can
> > develop a real fix or quirk that works without the user specifying a
> > kernel parameter.
> > 
> > > - Does anybody know why with pci=noacpi, you get dmesg warnings about 
> > > cannot find PCI int A mapping - but they do not seem to cause the 
> > > devices any issues in functioning? Is it because they are using MSI?
> > 
> > I doubt it.  I think you're just lucky.  In general the information
> > from _PRT and _CRS is essential for correct operation.
>
> Strange, because there are dozens of these warnings on multiple 
> computers and heaps of devices on Thunderbolt. If the BARs are assigned 
> then they work, every time, no questions asked. Maybe this suggests that 
> Thunderbolt is somehow exempt. Perhaps the controller has kept 
> configuration from the firmware setup and everything behind it does not 
> care.

Thunderbolt is not exempt.  _PRT tells us where INTx wires from PCI
are connected.  On systems with multiple host bridges, there are
multiple sets of those wires.  Your many examples of systems where
things seem to work are not arguments for it being safe to ignore _PRT
and _CRS in general.

> > > - Does pci=ignorefw sound good for a future proposal?
> > 
> > No, at least not without more description of what this would
> > accomplish.
> I have not given it much time and thought, but basically it would be 
> something that can be extended incrementally. I would start with it 
> implying nocrs and releasing all root complex resources at boot before 
> the initial scan. That way we can see if the particular platform cares 
> if we do everything in the kernel.
> 
> > It sounds like you would want this to turn off _PRT, _CRS, and other
> > information from ACPI.  You may not like ACPI, but that information is
> > there for good reason, and if we didn't get it from ACPI we would have
> > to get it from somewhere else.
>
> The nocrs is vital because the BIOS places pitiful space behind the root 
> complex and will fail to assign large BARs - which is why Xeon Phi 
> coprocessors with 8G or 16G BARs that map their whole RAM are only 
> supported on certain systems. I consider all BIOS / firmware to be 
> broken at this time, especially with most still catering for 32-bit OS 
> that almost nobody uses. I know not everybody feels that way, but I am 
> an idealist and aim to move things in the right direction.

Fine.  You can boot with "pci=nocrs" all you want, but it's not safe
in general.

The problem of the BIOS not reporting enough space for the root
complex is a firmware defect: the host bridge _CRS should report all
the space routed to the bridge, and if it doesn't, that's a BIOS
bug.  In principle
Linux could work around that by reading the hardware registers that
control the host bridge apertures, but that would require Linux to
know how to program every host bridge of interest.  We don't have or
want that sort of code in Linux because it would be a huge maintenance
burden.

> I would accept ACPI if it were just a collection of tables, memory 
> mapped like MMCONFIG. I know there are more complicated things that 
> require bytecode to run (although I do assert my belief that it should 
> be avoided if possible) but if the static tables were moved out of ACPI 
> then in my mind, it would be progress.
> 
> Is there a reason why PCI SIG could not add a future extension where all 
> of this information can be accessed with an extended MMCONFIG address 
> range?

For one thing, we don't know where MMCONFIG space lives.  We learn
that from the static MCFG table.

Bjorn


* Re: Multitude of resource assignment functions
  2019-07-03 13:43                     ` Nicholas Johnson
  2019-07-03 14:19                       ` Bjorn Helgaas
@ 2019-07-03 22:54                       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2019-07-03 22:54 UTC (permalink / raw)
  To: Nicholas Johnson, Bjorn Helgaas; +Cc: Logan Gunthorpe, linux-pci

On Wed, 2019-07-03 at 13:43 +0000, Nicholas Johnson wrote:
> The nocrs is vital because the BIOS places pitiful space behind the root 
> complex and will fail for assigning large BARs - hence why Xeon Phi 

Can you check what you get out of _DSM #5 behind these if it exists?

Cheers
Ben.



end of thread, other threads:[~2019-07-03 22:54 UTC | newest]

Thread overview: 21+ messages
     [not found] <SL2P216MB01874DFDDBDE49B935A9B1B380E50@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM>
2019-06-19 16:21 ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Logan Gunthorpe
2019-06-20  0:44   ` Nicholas Johnson
2019-06-20  0:49     ` Logan Gunthorpe
2019-06-23  5:01       ` Nicholas Johnson
2019-06-24  9:13         ` Multitude of resource assignment functions Benjamin Herrenschmidt
2019-06-24 16:45           ` Logan Gunthorpe
2019-06-27  7:40             ` Nicholas Johnson
2019-06-27  8:48               ` Benjamin Herrenschmidt
2019-06-30  2:40                 ` Nicholas Johnson
2019-06-27 16:35               ` Logan Gunthorpe
2019-06-27 20:26                 ` Benjamin Herrenschmidt
2019-06-30  2:57                 ` Nicholas Johnson
2019-07-01  4:33                   ` Oliver O'Halloran
2019-07-02 21:39                   ` Bjorn Helgaas
2019-07-03 13:43                     ` Nicholas Johnson
2019-07-03 14:19                       ` Bjorn Helgaas
2019-07-03 22:54                       ` Benjamin Herrenschmidt
2019-06-20 13:43     ` [nicholas.johnson-opensource@outlook.com.au: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window] Bjorn Helgaas
2019-06-20 23:24       ` Benjamin Herrenschmidt
2019-06-27  7:50   ` Nicholas Johnson
2019-06-27 16:54     ` Logan Gunthorpe
