From: Nam Cao <namcao@linutronix.de>
To: Lukas Wunner <lukas@wunner.de>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
"Yinghai Lu" <yinghai@kernel.org>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"Mika Westerberg" <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH v2 2/2] PCI: pciehp: Abort hot-plug if pci_hp_add_bridge() fails
Date: Tue, 7 May 2024 16:27:38 +0200 [thread overview]
Message-ID: <20240507142738.wyj19VVh@linutronix.de> (raw)
In-Reply-To: <ZjkxTGaAc48jPzqC@wunner.de>
On Mon, May 06, 2024 at 09:36:44PM +0200, Lukas Wunner wrote:
> On Mon, May 06, 2024 at 10:37:01AM +0200, Nam Cao wrote:
> > I just discovered a new crash scenario with shpchp:
> >
> > First, hot-add a bridge:
> > (qemu) device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
> >
> > But with the current patch, the hot-plug is still successful, just that the
> > bridge is not properly configured:
> > $ lspci -t
> > -[0000:00]-+-00.0
> > +-01.0
> > +-02.0
> > +-03.0-[01-02]----00.0-[02]----01.0-- <-- new hot-added bridge
> > +-1f.0
> > +-1f.2
> > \-1f.3
> >
> > But! Now I leave the hot-added bridge as is, and hot-add an ethernet card:
> > (qemu) device_add e1000,bus=br1,id=eth0,addr=2
> >
> > this command crashes the kernel (full crash log below).
> >
> > The problem is because during hot-plugging of this ethernet card,
> > pci_bus_add_devices() is invoked and the previously hot-plugged bridge hasn't
> > been marked as "added" yet, so the driver attempts to add this bridge
> > again, leading to the crash.
> >
> > Now for pciehp, this scenario cannot happen, because there is only a single
> > slot, so we cannot hot-plug another device. But still, this makes me think
> > perhaps we shouldn't leave the hot-plugged bridge in a improperly-configured
> > state as this patch is doing. I cannot think of a scenario that would crash pciehp
> > similarly to shpchp. But that's just me, maybe there is a rare scenario
> > that can crash pciehp, but I just haven't seen yet.
> >
> > My point is: better safe than sorry. I propose going back to the original
> > solution: calling pci_stop_and_remove_bus_device() and return a negative
> > error, and get rid of this improperly configured bridge. It is useless
> > anyways without a subordinate bus number.
> >
> > What do you think?
>
> When the kernel runs out of BAR address space for devices downstream of
> a bridge, it doesn't de-enumerate the bridge. The devices are just
> unusable.
>
> So my first intuition is that running out of bus numbers shouldn't result
> in de-enumeration either.
The BAR case is a bit different: it is a legal configuration for a bridge
to have no address space downstream: we can configure the bridge with
limit<base to indicate that there is no downstream addresses. There is
nothing similar for secondary bus number: there is no legal bus number
configuration for the bridge in this case.
> Remind me, how exactly does the NULL pointer deref occur? I think it's
> because no struct pci_bus was allocated for the subordinate bus of the
> hot-plugged bridge, right? Because AFAICS that would happen in
>
> pci_hp_add_bridge()
> pci_can_bridge_extend()
> pci_add_new_bus()
> pci_alloc_child_bus()
>
> but we never get that far because pci_hp_add_bridge() bails out with -1.
> So the subordinate pointer remains a NULL pointer.
This is correct. NULL deference happens due to subordinate pointer being
NULL.
> Perhaps we should allocate an empty struct pci_bus instead.
This feels a bit hacky. What bus number could we set to this struct? And I
doubt that the PCI subsystem is written with the expectation that struct pci_bus
can be a dummy one, so I would guess that the probability of breaking thing
is high with this approach.
> Or check for a NULL subordinate pointer instead of crashing.
I think this is a possible solution, but it is a bit complicated: all usage
of subordinate pointers will need to be looked at. Further more, secondary bus
number being zero is currently taken as unconfigured/invalid (for example
in pci_scan_bridge_extend()), that needs to be changed as well. Doing this
has a good chance of breaking things.
I don't really see the upside of the above, compared to just deleting this
bridge. It is not really counter-intuitive that the OS rejects a
hot-plugged device that cannot be configured, right? Also this solution is
wayyy simpler. Unless there is problem with this that I am not seeing?
Best regards,
Nam
next prev parent reply other threads:[~2024-05-07 14:27 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-04 16:15 [PATCH v2 0/2] abort hot-plug if pci_hp_add_bridge() fails Nam Cao
2024-05-04 16:15 ` [PATCH v2 1/2] PCI: shpchp: Abort " Nam Cao
2024-05-04 16:15 ` [PATCH v2 2/2] PCI: pciehp: " Nam Cao
2024-05-05 5:45 ` Lukas Wunner
2024-05-05 7:14 ` Nam Cao
2024-05-06 8:37 ` Nam Cao
2024-05-06 19:36 ` Lukas Wunner
2024-05-07 14:27 ` Nam Cao [this message]
2024-05-27 9:15 ` Lukas Wunner
2024-05-27 9:23 ` Nam Cao
2024-05-27 12:33 ` Lukas Wunner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240507142738.wyj19VVh@linutronix.de \
--to=namcao@linutronix.de \
--cc=bhelgaas@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=mika.westerberg@linux.intel.com \
--cc=yinghai@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).