linux-pci.vger.kernel.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: "Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@intel.com>
Cc: Keith Busch <kbusch@kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"helgaas@kernel.org" <helgaas@kernel.org>,
	"hinko.kocevar@ess.eu" <hinko.kocevar@ess.eu>,
	"Kelley, Sean V" <sean.v.kelley@intel.com>
Subject: Re: [PATCHv2 3/5] PCI/ERR: Retain status from error notification
Date: Thu, 4 Mar 2021 16:23:01 -0800
Message-ID: <CAPcyv4iHPEtpftGMMqkvKW5_SaLJN5R=kVV8urnqibJ5-Lo=_A@mail.gmail.com>
In-Reply-To: <4c2a799f-c4e9-b203-3487-f9c117fba5e7@intel.com>

On Thu, Mar 4, 2021 at 3:19 PM Kuppuswamy, Sathyanarayanan
<sathyanarayanan.kuppuswamy@intel.com> wrote:
>
>
> On 3/4/21 2:59 PM, Dan Williams wrote:
> > On Thu, Mar 4, 2021 at 2:38 PM Kuppuswamy, Sathyanarayanan
> > <sathyanarayanan.kuppuswamy@intel.com> wrote:
> >>
> >> On 3/4/21 2:11 PM, Dan Williams wrote:
> >>
> >> On Thu, Mar 4, 2021 at 12:03 PM Keith Busch <kbusch@kernel.org> wrote:
> >>
> >> On Tue, Mar 02, 2021 at 09:46:40PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> >>
> >> On 3/2/21 9:34 PM, Williams, Dan J wrote:
> >>
> >> [ Add Sathya ]
> >>
> >> On Mon, 2021-01-04 at 15:02 -0800, Keith Busch wrote:
> >>
> >> Overwriting the frozen detected status with the result of the link reset
> >> loses the NEED_RESET result that drivers are depending on for error
> >> handling to report the .slot_reset() callback. Retain this status so
> >> that subsequent error handling has the correct flow.
> >>
> >> Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
> >> Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
> >> Signed-off-by: Keith Busch <kbusch@kernel.org>
> >>
> >> Just want to report that this fix might be a candidate for -stable.
> >>
> >> Agree.
> >>
> >> I think it can be merged in both stable and mainline kernels.
> >>
> >> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>
> >> Just FYI, this patch is practically a revert of this one:
> >>
> >>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=6d2c89441571ea534d6240f7724f518936c44f8d
> >>
> >> so please let me know if that is still a problem for you.
> >>
> >> For what it's worth I think "6d2c89441571 PCI/ERR: Update error status
> >> after reset_link()" is not justified. The link shouldn't recover if
> >> the attached device is not prepared to handle DPC events.
> >>
> >> I added that fix to address the recovery issue seen on a Dell server
> >> platform (for the EDR test case). If I understand the history correctly,
> >> in the EDR case, AER and DPC are owned by firmware, hence we get
> >> PCI_ERS_RESULT_NO_AER_DRIVER when executing error_detected() callbacks.
> >> So if we continue pcie_do_recovery() with PCI_ERS_RESULT_NO_AER_DRIVER
> >> as the error status, then even if we successfully reset the link we will
> >> report the recovery status as failure.
> > But that's the right response if there is no handler.
> If the handler is not available due to AER being owned by firmware,
> then it needs to be fixed. In EDR mode, even if DPC/AER is owned
> by firmware, the OS needs to own the recovery part. So I think it
> needs further investigation to understand why it reports
> PCI_ERS_RESULT_NO_AER_DRIVER.

As far as I can see, the only way to get PCI_ERS_RESULT_NO_AER_DRIVER
is when there actually is no handler, or when the device I/O state has
been set to failed. I notice the hotplug handler sets the device I/O
state to failed while processing link down. If the device is actually
missing a handler definition, then disconnect seems the right response.
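
To make the two cases above concrete, here is a small userspace model
of the vote as I read it. This is only a sketch: the model_* names,
the enum, and the struct are hypothetical stand-ins invented for
illustration, not the kernel's definitions, and the real code has more
cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; not the kernel's types. */
enum model_result {
	MODEL_CAN_RECOVER,
	MODEL_NEED_RESET,
	MODEL_DISCONNECT,
	MODEL_NO_AER_DRIVER,
};

struct model_dev {
	/* e.g. hotplug marked the device failed while handling link down */
	bool io_failed;
	/* NULL models a driver with no error_detected() callback */
	enum model_result (*error_detected)(void);
};

/* Example handler that asks for a slot reset. */
static enum model_result model_handler_need_reset(void)
{
	return MODEL_NEED_RESET;
}

/*
 * Assumed vote logic: NO_AER_DRIVER comes out only when the I/O state
 * is already failed or when no handler exists. Otherwise the driver's
 * own answer is used unchanged.
 */
static enum model_result model_vote(const struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->io_failed || dev->error_detected == NULL)
		return MODEL_NO_AER_DRIVER;

	return dev->error_detected();
}
```

With this model, a device that has a handler but was already marked
failed votes the same way as a device with no handler at all, which
would be consistent with the hotplug path producing the result seen in
the EDR test case.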
