linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kurz <groug@kaod.org>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: Oliver O'Halloran <oohall@gmail.com>,
	Alexey Kardashevskiy <aik@ozlabs.ru>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-pci <linux-pci@vger.kernel.org>
Subject: Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
Date: Wed, 14 Oct 2020 16:24:16 +0200	[thread overview]
Message-ID: <20201014162416.5e29e477@bahia.lan> (raw)
In-Reply-To: <06bf1b7b-e9b4-44b0-1aad-60b938f8e924@kaod.org>

On Thu, 8 Oct 2020 06:37:02 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> > On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg@kaod.org> wrote:
> >>
> >> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
> >> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
> >> clear all interrupt mappings when the bus is removed. This routine
> >> frees an array allocated in pcibios_scan_phb().
> >>
> >> This broke PHB hotplug on PowerNV because, when a PHB is removed and
> >> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
> >> resources to the PHB but does not destroy and recreate the PCI
> >> controller structure. Since pcibios_remove_bus() does not clear the
> >> 'irq_map' array pointer, a second removal of the PHB will try to free
> >> the array a second time and corrupt memory.
> > 
> > "PHB hotplug" and "hot-plugging devices under a PHB" are different
> > things. What you're saying here doesn't make a whole lot of sense to
> > me unless you're conflating the two. The distinction is important
> > since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> > the pci_controller) at runtime, but there's no corresponding mechanism
> > on PowerNV.
> 
> And it's even different on QEMU. 
> 

If the real HW doesn't have the notion of adding/removing a PHB at
runtime, then QEMU should stick to that, ie. setting dc->hotpluggable
to false for PNV PHB device types.

> >> Free the 'irq_map' array in pcibios_free_controller() to fix
> >> corruption and clear interrupt mapping after it has been
> >> disposed. This to avoid filling up the array with successive
> >> remove/rescan of a bus.
> > 
> > Even with this patch I think we're still broken. With this patch
> > applied the init path is something like:
> > 
> > per-phb init:
> >     allocate phb->irq_map array
> > per-bus init:
> >     nothing
> > per-device init:
> >     pcibios_bus_add_device()
> >        pci_read_irq_line()
> >             pci_irq_map_register(pci_dev, virq)
> >                *record the device's interrupt in phb->irq_map*
> > 
> > And the teardown path:
> > 
> > per-device teardown:
> >     nothing
> > per-bus teardown:
> >     pcibios_remove_bus()
> >         pci_irq_map_dispose()
> >             *walk phb->irq_map and dispose of each mapped interrupt*
> > per-phb teardown:
> >     free(phb->irq_map)
> > 
> > There's a lot of asymmetry here, which is a problem in itself, but the
> > real issue is that when removing *any* pci_bus under a PHB we dispose
> > of the LSI\ for *every* device under that PHB. Not good.
> > 
> > Ideally we should be fixing this by having the per-device teardown
> > handle disposing the mapping. Unfortunately, there's no pcibios hook
> > that's called when removing a pci_dev. However, we can register a bus
> > notifier which will be called when the pci_dev is removed from its bus
> > and we already do that for the per-device EEH teardown and to handle
> > IOMMU TCE invalidation when the device is removed.
> 
> I lack the knowledge here and I think some else should take over,
> as I am not doing a good job. 
> 
> Michael, can you drop the initial patch again :/ It is better not
> to merge anything.
> 
> Thanks,
> 
> C. 
> 
> 


      reply	other threads:[~2020-10-14 14:31 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200925092258.525079-1-clg@kaod.org>
2020-10-08  2:23 ` [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV Oliver O'Halloran
2020-10-08  4:37   ` Cédric Le Goater
2020-10-14 14:24     ` Greg Kurz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201014162416.5e29e477@bahia.lan \
    --to=groug@kaod.org \
    --cc=aik@ozlabs.ru \
    --cc=clg@kaod.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=oohall@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).