LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
@ 2020-09-25  9:22 Cédric Le Goater
  2020-10-08  2:23 ` Oliver O'Halloran
  0 siblings, 1 reply; 4+ messages in thread
From: Cédric Le Goater @ 2020-09-25  9:22 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Alexey Kardashevskiy, Oliver O'Halloran, linuxppc-dev,
	Cédric Le Goater

To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
clear all interrupt mappings when the bus is removed. This routine
frees an array allocated in pcibios_scan_phb().

This broke PHB hotplug on PowerNV because, when a PHB is removed and
re-scanned through sysfs, the PCI layer un-assigns and re-assigns
resources to the PHB but does not destroy and recreate the PCI
controller structure. Since pcibios_remove_bus() does not clear the
'irq_map' array pointer, a second removal of the PHB will try to free
the array a second time and corrupt memory.

Free the 'irq_map' array in pcibios_free_controller() to fix
corruption and clear interrupt mapping after it has been
disposed. This to avoid filling up the array with successive
remove/rescan of a bus.

Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 3a3181e16fbd ("powerpc/pci: unmap legacy INTx interrupts when a PHB is removed")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

Michael, I am not sure the Fixes tag is required. Feel free to drop
it. 

---
 arch/powerpc/kernel/pci-common.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index deb831f0ae13..6fc228e0359d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -143,6 +143,8 @@ void pcibios_free_controller(struct pci_controller *phb)
 	list_del(&phb->list_node);
 	spin_unlock(&hose_spinlock);
 
+	kfree(phb->irq_map);
+
 	if (phb->is_dynamic)
 		kfree(phb);
 }
@@ -450,10 +452,10 @@ static void pci_irq_map_dispose(struct pci_bus *bus)
 
 	pr_debug("PCI: Clearing interrupt mappings for PHB %04x:%02x...\n",
 		 pci_domain_nr(bus), bus->number);
-	for (i = 0; i < phb->irq_count; i++)
+	for (i = 0; i < phb->irq_count; i++) {
 		irq_dispose_mapping(phb->irq_map[i]);
-
-	kfree(phb->irq_map);
+		phb->irq_map[i] = 0;
+	}
 }
 
 void pcibios_remove_bus(struct pci_bus *bus)
-- 
2.25.4


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
  2020-09-25  9:22 [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV Cédric Le Goater
@ 2020-10-08  2:23 ` Oliver O'Halloran
  2020-10-08  4:37   ` Cédric Le Goater
  0 siblings, 1 reply; 4+ messages in thread
From: Oliver O'Halloran @ 2020-10-08  2:23 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: Alexey Kardashevskiy, linuxppc-dev, linux-pci

On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg@kaod.org> wrote:
>
> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
> clear all interrupt mappings when the bus is removed. This routine
> frees an array allocated in pcibios_scan_phb().
>
> This broke PHB hotplug on PowerNV because, when a PHB is removed and
> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
> resources to the PHB but does not destroy and recreate the PCI
> controller structure. Since pcibios_remove_bus() does not clear the
> 'irq_map' array pointer, a second removal of the PHB will try to free
> the array a second time and corrupt memory.

"PHB hotplug" and "hot-plugging devices under a PHB" are different
things. What you're saying here doesn't make a whole lot of sense to
me unless you're conflating the two. The distinction is important
since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
the pci_controller) at runtime, but there's no corresponding mechanism
on PowerNV.

> Free the 'irq_map' array in pcibios_free_controller() to fix
> corruption and clear interrupt mapping after it has been
> disposed. This to avoid filling up the array with successive
> remove/rescan of a bus.

Even with this patch I think we're still broken. With this patch
applied the init path is something like:

per-phb init:
    allocate phb->irq_map array
per-bus init:
    nothing
per-device init:
    pcibios_bus_add_device()
       pci_read_irq_line()
            pci_irq_map_register(pci_dev, virq)
               *record the device's interrupt in phb->irq_map*

And the teardown path:

per-device teardown:
    nothing
per-bus teardown:
    pcibios_remove_bus()
        pci_irq_map_dispose()
            *walk phb->irq_map and dispose of each mapped interrupt*
per-phb teardown:
    free(phb->irq_map)

There's a lot of asymmetry here, which is a problem in itself, but the
real issue is that when removing *any* pci_bus under a PHB we dispose
of the LSI\ for *every* device under that PHB. Not good.

Ideally we should be fixing this by having the per-device teardown
handle disposing the mapping. Unfortunately, there's no pcibios hook
that's called when removing a pci_dev. However, we can register a bus
notifier which will be called when the pci_dev is removed from its bus
and we already do that for the per-device EEH teardown and to handle
IOMMU TCE invalidation when the device is removed.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
  2020-10-08  2:23 ` Oliver O'Halloran
@ 2020-10-08  4:37   ` Cédric Le Goater
  2020-10-14 14:24     ` Greg Kurz
  0 siblings, 1 reply; 4+ messages in thread
From: Cédric Le Goater @ 2020-10-08  4:37 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: Alexey Kardashevskiy, linuxppc-dev, linux-pci

On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg@kaod.org> wrote:
>>
>> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
>> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
>> clear all interrupt mappings when the bus is removed. This routine
>> frees an array allocated in pcibios_scan_phb().
>>
>> This broke PHB hotplug on PowerNV because, when a PHB is removed and
>> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
>> resources to the PHB but does not destroy and recreate the PCI
>> controller structure. Since pcibios_remove_bus() does not clear the
>> 'irq_map' array pointer, a second removal of the PHB will try to free
>> the array a second time and corrupt memory.
> 
> "PHB hotplug" and "hot-plugging devices under a PHB" are different
> things. What you're saying here doesn't make a whole lot of sense to
> me unless you're conflating the two. The distinction is important
> since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> the pci_controller) at runtime, but there's no corresponding mechanism
> on PowerNV.

And it's even different on QEMU. 

>> Free the 'irq_map' array in pcibios_free_controller() to fix
>> corruption and clear interrupt mapping after it has been
>> disposed. This to avoid filling up the array with successive
>> remove/rescan of a bus.
> 
> Even with this patch I think we're still broken. With this patch
> applied the init path is something like:
> 
> per-phb init:
>     allocate phb->irq_map array
> per-bus init:
>     nothing
> per-device init:
>     pcibios_bus_add_device()
>        pci_read_irq_line()
>             pci_irq_map_register(pci_dev, virq)
>                *record the device's interrupt in phb->irq_map*
> 
> And the teardown path:
> 
> per-device teardown:
>     nothing
> per-bus teardown:
>     pcibios_remove_bus()
>         pci_irq_map_dispose()
>             *walk phb->irq_map and dispose of each mapped interrupt*
> per-phb teardown:
>     free(phb->irq_map)
> 
> There's a lot of asymmetry here, which is a problem in itself, but the
> real issue is that when removing *any* pci_bus under a PHB we dispose
> of the LSI\ for *every* device under that PHB. Not good.
> 
> Ideally we should be fixing this by having the per-device teardown
> handle disposing the mapping. Unfortunately, there's no pcibios hook
> that's called when removing a pci_dev. However, we can register a bus
> notifier which will be called when the pci_dev is removed from its bus
> and we already do that for the per-device EEH teardown and to handle
> IOMMU TCE invalidation when the device is removed.

I lack the knowledge here and I think some else should take over,
as I am not doing a good job. 

Michael, can you drop the initial patch again :/ It is better not
to merge anything.

Thanks,

C. 



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
  2020-10-08  4:37   ` Cédric Le Goater
@ 2020-10-14 14:24     ` Greg Kurz
  0 siblings, 0 replies; 4+ messages in thread
From: Greg Kurz @ 2020-10-14 14:24 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Alexey Kardashevskiy, linuxppc-dev, Oliver O'Halloran, linux-pci

On Thu, 8 Oct 2020 06:37:02 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> > On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg@kaod.org> wrote:
> >>
> >> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
> >> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
> >> clear all interrupt mappings when the bus is removed. This routine
> >> frees an array allocated in pcibios_scan_phb().
> >>
> >> This broke PHB hotplug on PowerNV because, when a PHB is removed and
> >> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
> >> resources to the PHB but does not destroy and recreate the PCI
> >> controller structure. Since pcibios_remove_bus() does not clear the
> >> 'irq_map' array pointer, a second removal of the PHB will try to free
> >> the array a second time and corrupt memory.
> > 
> > "PHB hotplug" and "hot-plugging devices under a PHB" are different
> > things. What you're saying here doesn't make a whole lot of sense to
> > me unless you're conflating the two. The distinction is important
> > since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> > the pci_controller) at runtime, but there's no corresponding mechanism
> > on PowerNV.
> 
> And it's even different on QEMU. 
> 

If the real HW doesn't have the notion of adding/removing a PHB at
runtime, then QEMU should stick to that, ie. setting dc->hotpluggable
to false for PNV PHB device types.

> >> Free the 'irq_map' array in pcibios_free_controller() to fix
> >> corruption and clear interrupt mapping after it has been
> >> disposed. This to avoid filling up the array with successive
> >> remove/rescan of a bus.
> > 
> > Even with this patch I think we're still broken. With this patch
> > applied the init path is something like:
> > 
> > per-phb init:
> >     allocate phb->irq_map array
> > per-bus init:
> >     nothing
> > per-device init:
> >     pcibios_bus_add_device()
> >        pci_read_irq_line()
> >             pci_irq_map_register(pci_dev, virq)
> >                *record the device's interrupt in phb->irq_map*
> > 
> > And the teardown path:
> > 
> > per-device teardown:
> >     nothing
> > per-bus teardown:
> >     pcibios_remove_bus()
> >         pci_irq_map_dispose()
> >             *walk phb->irq_map and dispose of each mapped interrupt*
> > per-phb teardown:
> >     free(phb->irq_map)
> > 
> > There's a lot of asymmetry here, which is a problem in itself, but the
> > real issue is that when removing *any* pci_bus under a PHB we dispose
> > of the LSI\ for *every* device under that PHB. Not good.
> > 
> > Ideally we should be fixing this by having the per-device teardown
> > handle disposing the mapping. Unfortunately, there's no pcibios hook
> > that's called when removing a pci_dev. However, we can register a bus
> > notifier which will be called when the pci_dev is removed from its bus
> > and we already do that for the per-device EEH teardown and to handle
> > IOMMU TCE invalidation when the device is removed.
> 
> I lack the knowledge here and I think some else should take over,
> as I am not doing a good job. 
> 
> Michael, can you drop the initial patch again :/ It is better not
> to merge anything.
> 
> Thanks,
> 
> C. 
> 
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-25  9:22 [PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV Cédric Le Goater
2020-10-08  2:23 ` Oliver O'Halloran
2020-10-08  4:37   ` Cédric Le Goater
2020-10-14 14:24     ` Greg Kurz

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git