From: <Alex_Gagniuc@Dellteam.com>
To: <helgaas@kernel.org>, <mr.nuke.me@gmail.com>
Cc: <linux-pci@vger.kernel.org>, <keith.busch@intel.com>,
<Austin.Bolen@dell.com>, <Shyam.Iyer@dell.com>,
<linux-kernel@vger.kernel.org>, <jonathan.derrick@intel.com>,
<gregkh@linuxfoundation.org>, <lukas@wunner.de>,
<ruscur@russell.cc>, <sbobroff@linux.ibm.com>, <oohall@gmail.com>,
<linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected
Date: Thu, 8 Nov 2018 22:20:46 +0000
Message-ID: <555e85227c6541ea96afa330e632dead@ausx13mps321.AMER.DELL.COM>
In-Reply-To: 20181108200855.GE41183@google.com
On 11/08/2018 02:09 PM, Bjorn Helgaas wrote:
>
> [+cc Jonathan, Greg, Lukas, Russell, Sam, Oliver for discussion about
> PCI error recovery in general]
Has anyone seen the ECRs in the PCIe base spec and ACPI that have
been floating around the past few months? -- HPX, SFI, CER. Without
divulging too much before publication, I'm curious about opinions on how
well (or not well) those flows would work in general, and in Linux.
> On Wed, Nov 07, 2018 at 05:42:57PM -0600, Bjorn Helgaas wrote:
> I'm having second thoughts about this. One thing I'm uncomfortable
> with is that sprinkling pci_dev_is_disconnected() around feels ad hoc
> instead of systematic, in the sense that I don't know how we convince
> ourselves that this (and only this) is the correct place to put it.
>
> Another is that the only place we call pci_dev_set_disconnected() is
> in pciehp and acpiphp, so the only "disconnected" case we catch is if
> hotplug happens to be involved. Every MMIO read from the device is an
> opportunity to learn whether it is reachable (a read from an
> unreachable device typically returns ~0 data), but we don't do
> anything at all with those.
>
> The config accessors already check pci_dev_is_disconnected(), so this
> patch is really aimed at MMIO accesses. I think it would be more
> robust if we added wrappers for readl() and writel() so we could
> notice read errors and avoid future reads and writes.
I wouldn't expect anything less than complete scrutiny and quality
control of unquestionable moral integrity :). In theory ~0 can be a
great indicator that something may be wrong. Though I think it's about
as ad-hoc as pci_dev_is_disconnected().
I slightly like the idea of wrapping the MMIO accessors. There's still
memcpy and DMA that cause the same MemRead/Wr PCIe transactions, and the
same sort of errors in PCIe land, and it would be good to have more
testing on this. Since this patch is tested and confirmed to fix a known
failure case, I would keep it, and then look at fixing the problem in a
more generic way.
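To make the wrapper idea concrete, here is a minimal userspace sketch of
what "notice read errors and avoid future reads and writes" could look
like. Everything in it is hypothetical: struct fake_dev, raw_readl(),
raw_writel(), checked_readl() and checked_writel() are illustration-only
names, not kernel APIs, and the "device" is just an array with a
presence flag. It also shows why ~0 is ad hoc: a register that
legitimately reads all ones would be misdetected, so a real version
would need a second check before declaring the device gone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct fake_dev {
	volatile uint32_t *regs;	/* simulated MMIO window */
	bool present;			/* simulated reachability (test knob) */
	bool disconnected;		/* sticky "device is gone" flag */
};

/* Simulated bus read: an unreachable device returns all ones. */
static uint32_t raw_readl(struct fake_dev *dev, unsigned int idx)
{
	return dev->present ? dev->regs[idx] : UINT32_MAX;
}

/* Simulated bus write: silently dropped when the device is unreachable. */
static void raw_writel(struct fake_dev *dev, unsigned int idx, uint32_t val)
{
	if (dev->present)
		dev->regs[idx] = val;
}

/*
 * Read wrapper: skip the access once the device is marked disconnected,
 * and mark it disconnected on the first ~0 read.
 * Assumption: ~0 is never a valid value for the registers read this way.
 */
static uint32_t checked_readl(struct fake_dev *dev, unsigned int idx)
{
	uint32_t val;

	if (dev->disconnected)
		return UINT32_MAX;

	val = raw_readl(dev, idx);
	if (val == UINT32_MAX)
		dev->disconnected = true;

	return val;
}

/* Write wrapper: returns false without touching the bus once disconnected. */
static bool checked_writel(struct fake_dev *dev, unsigned int idx, uint32_t val)
{
	if (dev->disconnected)
		return false;

	raw_writel(dev, idx, val);
	return true;
}
```

Note that the flag is sticky in this sketch: once a read returns ~0, no
later access reaches the device, even if it becomes reachable again.
Whether that is the right policy is part of what would need discussing.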
BTW, a lot of the problems we're fixing here come courtesy of
firmware-first error handling. Do we reach a point where we draw a line
in handling new problems introduced by FFS? So, if something is a
problem with FFS, but not native handling, do we commit to supporting it?
Alex