archive mirror
 help / color / mirror / Atom feed
From: "Cédric Le Goater" <>
To: Oliver O'Halloran <>
Cc: Michael Ellerman <>,
	linuxppc-dev <>,
	Alexey Kardashevskiy <>,
	linux-pci <>
Subject: Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
Date: Thu, 8 Oct 2020 06:37:02 +0200	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <> wrote:
>> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
>> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
>> clear all interrupt mappings when the bus is removed. This routine
>> frees an array allocated in pcibios_scan_phb().
>> This broke PHB hotplug on PowerNV because, when a PHB is removed and
>> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
>> resources to the PHB but does not destroy and recreate the PCI
>> controller structure. Since pcibios_remove_bus() does not clear the
>> 'irq_map' array pointer, a second removal of the PHB will try to free
>> the array a second time and corrupt memory.
> "PHB hotplug" and "hot-plugging devices under a PHB" are different
> things. What you're saying here doesn't make a whole lot of sense to
> me unless you're conflating the two. The distinction is important
> since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> the pci_controller) at runtime, but there's no corresponding mechanism
> on PowerNV.

And it's even different on QEMU. 

>> Free the 'irq_map' array in pcibios_free_controller() to fix
>> corruption and clear interrupt mapping after it has been
>> disposed. This to avoid filling up the array with successive
>> remove/rescan of a bus.
> Even with this patch I think we're still broken. With this patch
> applied the init path is something like:
> per-phb init:
>     allocate phb->irq_map array
> per-bus init:
>     nothing
> per-device init:
>     pcibios_bus_add_device()
>        pci_read_irq_line()
>             pci_irq_map_register(pci_dev, virq)
>                *record the device's interrupt in phb->irq_map*
> And the teardown path:
> per-device teardown:
>     nothing
> per-bus teardown:
>     pcibios_remove_bus()
>         pci_irq_map_dispose()
>             *walk phb->irq_map and dispose of each mapped interrupt*
> per-phb teardown:
>     free(phb->irq_map)
> There's a lot of asymmetry here, which is a problem in itself, but the
> real issue is that when removing *any* pci_bus under a PHB we dispose
> of the LSI\ for *every* device under that PHB. Not good.
> Ideally we should be fixing this by having the per-device teardown
> handle disposing the mapping. Unfortunately, there's no pcibios hook
> that's called when removing a pci_dev. However, we can register a bus
> notifier which will be called when the pci_dev is removed from its bus
> and we already do that for the per-device EEH teardown and to handle
> IOMMU TCE invalidation when the device is removed.

I lack the knowledge here and I think some else should take over,
as I am not doing a good job. 

Michael, can you drop the initial patch again :/ It is better not
to merge anything.



  reply	other threads:[~2020-10-08  4:46 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <>
2020-10-08  2:23 ` [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV Oliver O'Halloran
2020-10-08  4:37   ` Cédric Le Goater [this message]
2020-10-14 14:24     ` Greg Kurz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).