From: Greg Kurz <groug@kaod.org>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: Oliver O'Halloran <oohall@gmail.com>,
Alexey Kardashevskiy <aik@ozlabs.ru>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
linux-pci <linux-pci@vger.kernel.org>
Subject: Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
Date: Wed, 14 Oct 2020 16:24:16 +0200 [thread overview]
Message-ID: <20201014162416.5e29e477@bahia.lan> (raw)
In-Reply-To: <06bf1b7b-e9b4-44b0-1aad-60b938f8e924@kaod.org>
On Thu, 8 Oct 2020 06:37:02 +0200
Cédric Le Goater <clg@kaod.org> wrote:
> On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> > On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg@kaod.org> wrote:
> >>
> >> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
> >> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
> >> clear all interrupt mappings when the bus is removed. This routine
> >> frees an array allocated in pcibios_scan_phb().
> >>
> >> This broke PHB hotplug on PowerNV because, when a PHB is removed and
> >> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
> >> resources to the PHB but does not destroy and recreate the PCI
> >> controller structure. Since pcibios_remove_bus() does not clear the
> >> 'irq_map' array pointer, a second removal of the PHB will try to free
> >> the array a second time and corrupt memory.
> >
> > "PHB hotplug" and "hot-plugging devices under a PHB" are different
> > things. What you're saying here doesn't make a whole lot of sense to
> > me unless you're conflating the two. The distinction is important
> > since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> > the pci_controller) at runtime, but there's no corresponding mechanism
> > on PowerNV.
>
> And it's even different on QEMU.
>
If the real HW doesn't have the notion of adding/removing a PHB at
runtime, then QEMU should stick to that, ie. setting dc->hotpluggable
to false for PNV PHB device types.
> >> Free the 'irq_map' array in pcibios_free_controller() to fix
> >> corruption and clear interrupt mapping after it has been
> >> disposed. This to avoid filling up the array with successive
> >> remove/rescan of a bus.
> >
> > Even with this patch I think we're still broken. With this patch
> > applied the init path is something like:
> >
> > per-phb init:
> > allocate phb->irq_map array
> > per-bus init:
> > nothing
> > per-device init:
> > pcibios_bus_add_device()
> > pci_read_irq_line()
> > pci_irq_map_register(pci_dev, virq)
> > *record the device's interrupt in phb->irq_map*
> >
> > And the teardown path:
> >
> > per-device teardown:
> > nothing
> > per-bus teardown:
> > pcibios_remove_bus()
> > pci_irq_map_dispose()
> > *walk phb->irq_map and dispose of each mapped interrupt*
> > per-phb teardown:
> > free(phb->irq_map)
> >
> > There's a lot of asymmetry here, which is a problem in itself, but the
> > real issue is that when removing *any* pci_bus under a PHB we dispose
> > of the LSI\ for *every* device under that PHB. Not good.
> >
> > Ideally we should be fixing this by having the per-device teardown
> > handle disposing the mapping. Unfortunately, there's no pcibios hook
> > that's called when removing a pci_dev. However, we can register a bus
> > notifier which will be called when the pci_dev is removed from its bus
> > and we already do that for the per-device EEH teardown and to handle
> > IOMMU TCE invalidation when the device is removed.
>
> I lack the knowledge here and I think some else should take over,
> as I am not doing a good job.
>
> Michael, can you drop the initial patch again :/ It is better not
> to merge anything.
>
> Thanks,
>
> C.
>
>
prev parent reply other threads:[~2020-10-14 14:31 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200925092258.525079-1-clg@kaod.org>
2020-10-08 2:23 ` [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV Oliver O'Halloran
2020-10-08 4:37 ` Cédric Le Goater
2020-10-14 14:24 ` Greg Kurz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201014162416.5e29e477@bahia.lan \
--to=groug@kaod.org \
--cc=aik@ozlabs.ru \
--cc=clg@kaod.org \
--cc=linux-pci@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=oohall@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).