From: James Puthukattukaran <james.puthukattukaran@oracle.com>
To: "Kelley, Sean V" <sean.v.kelley@intel.com>,
	"Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@intel.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>
Subject: RE: pci_do_recovery not handling fata errors
Date: Fri, 12 Mar 2021 22:57:18 +0000
Message-ID: <MN2PR10MB40933D5232D0F58ECAF4387D996F9@MN2PR10MB4093.namprd10.prod.outlook.com>
In-Reply-To: <B2696632-CB84-420E-B072-044603A6D3B7@intel.com>

But dpc_process_error() clears the fatal status only when the DPC trigger reason is "unmasked uncorrectable error" (reason == 0).
If the trigger reason is ERR_FATAL, the else-if clause is not taken, and pci_do_recovery() does not clear the fatal status either.

From dpc_process_error(), with more context:

        /* Only taken for "unmasked uncorrectable error". What about ERR_FATAL? */
        else if (reason == 0 &&
                 dpc_get_aer_uncorrect_severity(pdev, &info) &&
                 aer_get_device_error_info(pdev, &info)) {
                aer_print_error(pdev, &info);
                pci_aer_clear_nonfatal_status(pdev);
                pci_aer_clear_fatal_status(pdev);
        }
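
To make the gap concrete, here is a minimal userspace sketch of that branch, not kernel code. It assumes the Trigger Reason field sits in bits 2:1 of the DPC Status register (mask 0x0006, which is how I read pci_regs.h) with encodings 0 = unmasked uncorrectable error, 1 = ERR_NONFATAL, 2 = ERR_FATAL, 3 = extended reason. The helper names dpc_trigger_reason() and dpc_branch_clears_fatal(), and the aer_info_ok flag standing in for the two AER lookups succeeding, are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Trigger Reason field of the DPC Status register (assumed mask, bits 2:1). */
#define PCI_EXP_DPC_STATUS_TRIGGER_RSN 0x0006

/* Trigger Reason encodings as used by the branch quoted above. */
enum dpc_reason {
	DPC_REASON_UNMASKED_UNCOR = 0,	/* unmasked uncorrectable error */
	DPC_REASON_ERR_NONFATAL   = 1,	/* ERR_NONFATAL message received */
	DPC_REASON_ERR_FATAL      = 2,	/* ERR_FATAL message received */
	DPC_REASON_EXTENDED       = 3,	/* see Trigger Reason Extension */
};

/* Extract the trigger reason from a raw DPC Status value. */
static inline unsigned int dpc_trigger_reason(uint16_t status)
{
	return (status & PCI_EXP_DPC_STATUS_TRIGGER_RSN) >> 1;
}

/*
 * Hypothetical model of the else-if branch: the fatal status is cleared
 * only when reason == 0 and the AER lookups (modeled by aer_info_ok)
 * both succeed.
 */
static inline bool dpc_branch_clears_fatal(uint16_t status, bool aer_info_ok)
{
	return dpc_trigger_reason(status) == DPC_REASON_UNMASKED_UNCOR &&
	       aer_info_ok;
}
```

With status 0x0005 (trigger bit set, reason ERR_FATAL), dpc_branch_clears_fatal() returns false under this model, which is the case I am asking about.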
 

> -----Original Message-----
> From: Kelley, Sean V <sean.v.kelley@intel.com>
> Sent: Friday, March 12, 2021 5:25 PM
> To: James Puthukattukaran <james.puthukattukaran@oracle.com>;
> Kuppuswamy, Sathyanarayanan
> <sathyanarayanan.kuppuswamy@intel.com>
> Cc: Linux PCI <linux-pci@vger.kernel.org>; bhelgaas@google.com
> Subject: [External] : Re: pci_do_recovery not handling fata errors
> 
> 
> 
> > On Mar 12, 2021, at 12:56 PM, James Puthukattukaran
> > <james.puthukattukaran@oracle.com> wrote:
> >
> > Hi -
> > I’m trying to understand why pci_do_recovery() only clears non-fatal but
> > not fatal errors. My immediate concern is the call from dpc_handler. If a
> > device sends an ERR_FATAL to the root port, I would think that as part of
> > recovery the fatal status in the AER registers of the endpoint device
> > would be cleared?
> >
> 
> 
> Adding Sathya who mentioned to me that:
> 
> Fatal errors are cleared in
> 
> void dpc_process_error(struct pci_dev *pdev)
> 
>                  dpc_get_aer_uncorrect_severity(pdev, &info) &&
>                  aer_get_device_error_info(pdev, &info)) {
>                 aer_print_error(pdev, &info);
>                 pci_aer_clear_nonfatal_status(pdev);
>                 pci_aer_clear_fatal_status(pdev);
> 
> Thanks,
> 
> Sean
> 
> > Snippet of concern in pci_do_recovery –
> >
> >         /*
> >          * If we have native control of AER, clear error status in the Root
> >          * Port or Downstream Port that signaled the error.  If the
> >          * platform retained control of AER, it is responsible for clearing
> >          * this status.  In that case, the signaling device may not even be
> >          * visible to the OS.
> >          */
> >         if (host->native_aer || pcie_ports_native) {
> >                 pcie_clear_device_status(bridge);
> >                 pci_aer_clear_nonfatal_status(bridge);   /* Just clearing nonfatal. What about fatal? */
> >         }
> >
> > Thanks
> > James

