From: poza@codeaurora.org
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Tai <thomas.tai@oracle.com>,
bhelgaas@google.com, keith.busch@intel.com,
linux-pci@vger.kernel.org, linux-pci-owner@vger.kernel.org
Subject: Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed
Date: Thu, 16 Aug 2018 14:33:18 +0530 [thread overview]
Message-ID: <2894d8df6e44860456377dade9ea5737@codeaurora.org> (raw)
In-Reply-To: <05bc3bccb2c6a0cb1696faf20073e567d7a5b8ee.camel@kernel.crashing.org>
On 2018-08-16 13:42, Benjamin Herrenschmidt wrote:
> On Thu, 2018-08-16 at 13:37 +0530, poza@codeaurora.org wrote:
>> >
>> > In fact looking at pcie_do_nonfatal_recovery() it's indeed completely
>> > broken. It tells the driver that the slot was reset without actually
>> > resetting anything... Ugh.
>> >
>> > Ben.
>>
>> pcie_do_nonfatal_recovery() exhibit the same behavior with or without
>> the patch-series.
>> in short, there was no functional change brought in to
>> pcie_do_nonfatal_recovery()
>
> Yes, I know, I'm just saying what it does is broken :-)
when I meant spec, i meant PCIe Spec.
At least spec distinguish fatal and non-fatal
"
Non-fatal errors are uncorrectable errors which cause a particular
transaction to be unreliable but the Link is otherwise fully functional.
Isolating Non-fatal from Fatal errors provides Requester/Receiver logic
in a device or system management software the opportunity to recover
from the error without resetting the components on the Link and
disturbing other transactions in progress.
"
Here the basic assumption is link is fully functional, hence we do not
initiate link reset. (while in case FATAL we do initiate Secondary Bus
Reset)
okay, so here is what current pcie_do_nonfatal_recovery() doe.
pcie_do_nonfatal_recovery
report_error_detected() >> calls driver callbacks
report_mmio_enabled()
report_slot_reset() >> if PCI_ERS_RESULT_NEED_RESET
report_resume()
If you suggest how it is broken, it will help me to understand.
probably you might want to point out what are the calls need to be added
or removed or differently handled, specially storage point of view.
Regards,
Oza.
>
> Keep in mind that those callbacks were designed originally for EEH
> (which predates AER), and so was the spec written.
>
> We don't actually use the AER code on POWER today, so we didn't notice
> how broken the implementation was :-)
>
> We should fix that.
>
> Either we can sort all that out by email, or we should plan some kind
> of catch-up, either at Plumbers (provided I can go there) or maybe a
> webex call.
>
> Cheers,
> Ben.
next prev parent reply other threads:[~2018-08-16 12:00 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-13 16:51 [PATCH 0/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed Thomas Tai
2018-08-13 16:51 ` [PATCH 1/1] " Thomas Tai
2018-08-14 9:16 ` poza
2018-08-14 9:22 ` poza
2018-08-14 13:51 ` Thomas Tai
2018-08-15 14:57 ` poza
2018-08-15 15:02 ` Thomas Tai
2018-08-15 15:26 ` poza
2018-08-15 15:43 ` Thomas Tai
2018-08-15 15:59 ` poza
2018-08-15 16:04 ` Thomas Tai
2018-08-15 21:55 ` Benjamin Herrenschmidt
2018-08-15 21:56 ` Benjamin Herrenschmidt
2018-08-16 6:36 ` poza
2018-08-16 6:51 ` Benjamin Herrenschmidt
2018-08-16 6:59 ` Benjamin Herrenschmidt
2018-08-16 8:07 ` poza
2018-08-16 8:12 ` Benjamin Herrenschmidt
2018-08-16 9:03 ` poza [this message]
2018-08-16 10:07 ` Benjamin Herrenschmidt
2018-08-16 14:11 ` poza
2018-08-16 23:30 ` Benjamin Herrenschmidt
2018-08-17 10:29 ` poza
2018-08-17 10:44 ` poza
2018-08-18 7:38 ` Benjamin Herrenschmidt
2018-08-16 7:05 ` Benjamin Herrenschmidt
2018-08-16 7:15 ` Benjamin Herrenschmidt
2018-08-16 7:56 ` poza
2018-08-16 8:10 ` Benjamin Herrenschmidt
2018-08-16 8:05 ` poza
2018-08-16 8:15 ` Benjamin Herrenschmidt
2018-08-16 8:22 ` poza
2018-08-16 8:28 ` Benjamin Herrenschmidt
2018-08-16 13:30 ` Thomas Tai
2018-08-16 13:46 ` Sinan Kaya
2018-08-16 23:27 ` Benjamin Herrenschmidt
2018-08-17 6:35 ` poza
2018-08-19 2:24 ` Bjorn Helgaas
2018-08-20 5:09 ` poza
2018-08-20 5:15 ` Benjamin Herrenschmidt
2018-08-20 13:02 ` Thomas Tai
2018-08-20 13:27 ` Benjamin Herrenschmidt
2018-08-19 2:19 ` Bjorn Helgaas
2018-08-19 21:41 ` Sinan Kaya
2018-08-20 2:03 ` Benjamin Herrenschmidt
2018-08-20 5:19 ` poza
2018-08-20 5:33 ` Benjamin Herrenschmidt
2018-08-20 7:56 ` poza
2018-08-20 11:22 ` Benjamin Herrenschmidt
2018-08-20 13:26 ` poza
2018-08-20 21:02 ` Benjamin Herrenschmidt
2018-08-21 5:14 ` poza
2018-08-21 6:06 ` Benjamin Herrenschmidt
2018-08-21 14:37 ` Keith Busch
2018-08-21 15:07 ` Sinan Kaya
2018-08-21 15:29 ` Keith Busch
2018-08-21 15:50 ` Sinan Kaya
2018-08-21 15:55 ` Sinan Kaya
2018-08-21 16:44 ` Keith Busch
2018-08-21 15:30 ` poza
2018-08-21 21:14 ` Benjamin Herrenschmidt
2018-08-21 22:04 ` Keith Busch
2018-08-21 22:33 ` Keith Busch
2018-08-21 23:06 ` Benjamin Herrenschmidt
2018-08-21 23:13 ` Keith Busch
2018-08-22 0:36 ` Benjamin Herrenschmidt
2018-08-30 0:01 ` Keith Busch
2018-08-30 0:10 ` Sinan Kaya
2018-08-30 0:46 ` Keith Busch
2018-08-30 4:26 ` Benjamin Herrenschmidt
2018-08-20 15:53 ` Keith Busch
2018-08-20 16:13 ` poza
2018-08-20 16:32 ` Keith Busch
2018-08-20 21:05 ` Benjamin Herrenschmidt
2018-08-20 21:21 ` Sinan Kaya
2018-08-20 21:35 ` Keith Busch
2018-08-20 21:53 ` Benjamin Herrenschmidt
2018-08-20 22:02 ` Sinan Kaya
2018-08-20 22:04 ` Benjamin Herrenschmidt
2018-08-20 22:13 ` Sinan Kaya
2018-08-20 22:19 ` Benjamin Herrenschmidt
2018-08-22 9:13 ` Lukas Wunner
2018-08-22 14:38 ` Keith Busch
2018-08-22 14:51 ` Sinan Kaya
2018-08-20 22:13 ` Keith Busch
2018-08-20 22:19 ` Benjamin Herrenschmidt
2018-08-21 1:30 ` Keith Busch
2018-08-20 4:37 ` Benjamin Herrenschmidt
2018-08-20 4:39 ` PATCH] Partial revert of "PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices" Benjamin Herrenschmidt
2018-08-21 19:50 ` Bjorn Helgaas
2018-08-22 4:35 ` poza
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2894d8df6e44860456377dade9ea5737@codeaurora.org \
--to=poza@codeaurora.org \
--cc=benh@kernel.crashing.org \
--cc=bhelgaas@google.com \
--cc=keith.busch@intel.com \
--cc=linux-pci-owner@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=thomas.tai@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).