On Sat, Aug 25, 2012 at 05:59:44PM +0800, Yijing Wang wrote:
> Date: Sat, 25 Aug 2012 17:59:44 +0800
> From: Yijing Wang
> To: Bjorn Helgaas, Rusty Russell, Mauro Carvalho Chehab
> CC: PCI, Jiang Liu, Huang Ying, Hanjun Guo,
>  linux-kernel@vger.kernel.org
> Subject: [RESEND BUGFIX PATCH 1/3] PCI/AER: fix pci_ops return NULL when
>  hotplug a pci bus which was doing aer error inject
> User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20120713
>  Thunderbird/14.0
>
> When we inject aer errors to the target pci device by aer_inject module, the pci_ops of pci
> bus which the target device is on will be assign to pci_ops_aer.So if the target pci device
> is a bridge, once we hotplug the pci bus(child bus) which the target device bridges to, child
> bus's pci_ops will be assigned to pci_ops_aer too.Now every access to the child bus's device
> will result to system panic, because it return NULL pci_ops in pci_read_aer.
> This patch fix this.
>
> CallTrace:
> bash[5908]: NaT consumption 17179869216 [1]
> Modules linked in: aer_inject cpufreq_conservative cpufreq_userspace cpufreq_pow
> ersave acpi_cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si(+) ipmi_devintf
> ipmi_msghandler dm_mod ppdev iTCO_wdt iTCO_vendor_support sg igb parport_pc i2c_
> i801 mptctl i2c_core serio_raw hid_generic lpc_ich mfd_core parport button conta
> iner usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif ext3 mbca
> che jbd fan processor ide_pci_generic ide_core ata_piix libata mptsas mptscsih m
> ptbase scsi_transport_sas scsi_mod thermal thermal_sys hwmon
> [...]
>
> Signed-off-by: Yijing Wang
> Signed-off-by: Jiang Liu
> ---
>  drivers/pci/pcie/aer/aer_inject.c |   21 +++++++++++++++++++++
>  1 files changed, 21 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
> index 5222986..fc28785 100644
> --- a/drivers/pci/pcie/aer/aer_inject.c
> +++ b/drivers/pci/pcie/aer/aer_inject.c
> @@ -109,6 +109,19 @@ static struct aer_error *__find_aer_error_by_dev(struct pci_dev *dev)
>  	return __find_aer_error((u16)domain, dev->bus->number, dev->devfn);
>  }
>
> +static bool pci_is_upstream_bus(struct pci_bus *bus, struct pci_bus *up_bus)
> +{
> +	struct pci_bus *pbus = bus->parent;
> +
> +	while (pbus) {
> +		if (pbus == up_bus)
> +			return true;
> +		pbus = pbus->parent;
> +	}
> +
> +	return false;
> +}
> +
>  /* inject_lock must be held before calling */
>  static struct pci_ops *__find_pci_bus_ops(struct pci_bus *bus)
>  {
> @@ -118,6 +131,13 @@ static struct pci_ops *__find_pci_bus_ops(struct pci_bus *bus)
>  		if (bus_ops->bus == bus)
>  			return bus_ops->ops;
>  	}
> +
> +	/* here can't find bus_ops, fall back to get bus_ops of upstream bus */
> +	list_for_each_entry(bus_ops, &pci_bus_ops_list, list) {
> +		if (pci_is_upstream_bus(bus, bus_ops->bus))
> +			return bus_ops->ops;
> +	}
> +
>  	return NULL;
>  }
>

At least, for the case where this still returns NULL, a proper check and
protection are needed in the caller.
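
To make the point concrete, here is a rough user-space sketch of the kind of
check I mean. It is not a patch against aer_inject.c: struct bus, struct
bus_ops, struct saved_ops, fake_read() and guarded_read() are all hypothetical
stand-ins for illustration, and returning -EINVAL with an all-ones value is
only an assumption about what a sensible failure path could look like.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct pci_bus / struct pci_ops. */
struct bus {
	int number;
	struct bus *parent;
};

struct bus_ops {
	int (*read)(struct bus *bus, unsigned int devfn, int where,
		    int size, unsigned int *val);
};

/* One saved (bus, original ops) pair, modelled on a pci_bus_ops_list entry. */
struct saved_ops {
	struct bus *bus;
	struct bus_ops *ops;
	struct saved_ops *next;
};

/* Hypothetical config read that always succeeds with a fixed value. */
static int fake_read(struct bus *bus, unsigned int devfn, int where,
		     int size, unsigned int *val)
{
	(void)bus; (void)devfn; (void)where; (void)size;
	*val = 0x1234u;
	return 0;
}

/* Same walk as pci_is_upstream_bus() in the patch: follow ->parent upward. */
static bool is_upstream_bus(const struct bus *bus, const struct bus *up_bus)
{
	const struct bus *pbus;

	for (pbus = bus->parent; pbus; pbus = pbus->parent)
		if (pbus == up_bus)
			return true;
	return false;
}

/* Exact match first, then fall back to an upstream bus; may return NULL. */
static struct bus_ops *find_bus_ops(struct saved_ops *head, struct bus *bus)
{
	struct saved_ops *e;

	assert(bus != NULL);
	for (e = head; e; e = e->next)
		if (e->bus == bus)
			return e->ops;
	for (e = head; e; e = e->next)
		if (is_upstream_bus(bus, e->bus))
			return e->ops;
	return NULL;
}

/* The caller checks the lookup result instead of dereferencing NULL. */
static int guarded_read(struct saved_ops *head, struct bus *bus,
			unsigned int devfn, int where, int size,
			unsigned int *val)
{
	struct bus_ops *ops = find_bus_ops(head, bus);

	if (!ops || !ops->read) {
		*val = ~0u;	/* assumed failure value: all ones */
		return -EINVAL;	/* assumed error code */
	}
	return ops->read(bus, devfn, where, size, val);
}
```

With one saved entry for a root bus, a read on a child of that bus goes through
the upstream fallback, while a read on an unrelated bus fails with an error
instead of crashing.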