From: Bjorn Helgaas <helgaas@kernel.org>
To: Oliver O'Halloran <oohall@gmail.com>
Cc: "Pali Rohár" <pali@kernel.org>,
"Aaron Ma" <aaron.ma@canonical.com>,
jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com,
"David S. Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
"Krzysztof Wilczyński" <kw@linux.com>,
linux-pci <linux-pci@vger.kernel.org>
Subject: Re: [PATCH 1/2] igc: don't rd/wr iomem when PCI is removed
Date: Mon, 19 Jul 2021 19:17:13 -0500 [thread overview]
Message-ID: <20210720001713.GA38755@bjorn-Precision-5520> (raw)
In-Reply-To: <CAOSf1CHOrUBfibO0t6Zr2=SZ7GjLTiAzfoKBeZL8RXdcC+Ou3A@mail.gmail.com>
On Mon, Jul 19, 2021 at 12:49:18PM +1000, Oliver O'Halloran wrote:
> On Mon, Jul 19, 2021 at 8:51 AM Pali Rohár <pali@kernel.org> wrote:
> >
> > And do we have some solution for this kind of issue? There are more PCIe
> > controllers / platforms which do not like MMIO read/write operation when
> > card / link is not connected.
>
> Do you have some actual examples? The few times I've seen those
> crashes were due to broken firmware-first error handling. The AER
> notifications would be escalated into some kind of ACPI error which
> the kernel didn't have a good way of dealing with so it panicked
> instead.
>
> Assuming it is a real problem then as Bjorn pointed out this sort of
> hack doesn't really fix the issue because hotplug and AER
> notifications are fundamentally asynchronous. If the driver is
> actively using the device when the error / removal happens then the
> pci_dev_is_disconnected() check will pass and the MMIO will go
> through. If the MMIO is poisonous because of dumb hardware then this
> sort of hack will only paper over the issue.
>
> > If we do not provide a way how to solve these problems then we can
> > expect that people would just hack ethernet / wifi / ... device drivers
> > which are currently crashing by patches like in this thread.
> >
> > Maybe PCI subsystem could provide wrapper function which implements
> > above pattern and which can be used by device drivers?
>
> We could do that and I think there was a proposal to add some
> pci_readl(pdev, <addr>) style wrappers at one point.
Obviously this wouldn't help user-space mmaps, but in the kernel,
Documentation/driver-api/device-io.rst [1] does say that drivers are
supposed to use readl() et al even though on most arches it "works"
to just dereference the result of ioremap(), so maybe we could make
a useful wrapper.
Seems like we should do *something*, even if it's just a generic
#define and some examples. I took a stab at this [2] a couple years
ago, but it was only for the PCI core, and it didn't go anywhere.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/device-io.rst?id=v5.13#n160
[2] https://lore.kernel.org/linux-pci/20190822200551.129039-1-helgaas@kernel.org/
> On powerpc
> there's hooks in the arch provided MMIO functions to detect error
> responses and kick off the error handling machinery when a problem is
> detected. Those hooks are mainly there to help the platform detect
> errors though and they don't make life much easier for drivers. Due to
> locking concerns the driver's .error_detected() callback cannot be
> called in the MMIO hook so even when the platform detects errors
> synchronously the driver notifications must happen asynchronously. In
> the meanwhile the driver still needs to handle the 0xFFs response
> safely and there's not much we can do from the platform side to help
> there.
>
> Oliver
prev parent reply other threads:[~2021-07-20 0:58 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20210702045120.22855-1-aaron.ma@canonical.com>
2021-07-04 14:28 ` [PATCH 1/2] igc: don't rd/wr iomem when PCI is removed Pali Rohár
2021-07-05 7:23 ` Aaron Ma
2021-07-05 23:02 ` Krzysztof Wilczyński
2021-07-06 14:23 ` Pali Rohár
2021-07-06 20:12 ` Bjorn Helgaas
2021-07-07 21:53 ` Pali Rohár
2021-07-07 22:10 ` Bjorn Helgaas
2021-07-08 2:04 ` Oliver O'Halloran
2021-07-08 15:45 ` Bjorn Helgaas
2021-07-18 16:31 ` Oliver O'Halloran
2021-07-18 22:50 ` Pali Rohár
2021-07-19 2:49 ` Oliver O'Halloran
2021-07-19 8:13 ` Pali Rohár
2021-07-20 0:17 ` Bjorn Helgaas [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210720001713.GA38755@bjorn-Precision-5520 \
--to=helgaas@kernel.org \
--cc=aaron.ma@canonical.com \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jesse.brandeburg@intel.com \
--cc=kuba@kernel.org \
--cc=kw@linux.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=oohall@gmail.com \
--cc=pali@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).