From: Borislav Petkov <bp@suse.de>
To: Sinan Kaya <okaya@codeaurora.org>
Cc: "Baicar, Tyler" <tbaicar@codeaurora.org>,
	Tony Luck <tony.luck@intel.com>,
	rjw@rjwysocki.net, lenb@kernel.org, will.deacon@arm.com,
	james.morse@arm.com, prarit@redhat.com, punit.agrawal@arm.com,
	shiju.jose@huawei.com, andriy.shevchenko@linux.intel.com,
	linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linux PCI <linux-pci@vger.kernel.org>,
	Huang Ying <ying.huang@intel.com>
Subject: Re: [PATCH] acpi: apei: call into AER handling regardless of severity
Date: Wed, 30 Aug 2017 17:16:01 +0200
Message-ID: <20170830151601.ro5qt5272e2msevp@pd.tnic>
In-Reply-To: <b630fa5e-c24b-06f4-47da-bed53161e0b7@codeaurora.org>

On Wed, Aug 30, 2017 at 10:05:44AM -0400, Sinan Kaya wrote:
> Link reset is not the only recovery mechanism. In the case of nonfatal
> errors, it is assumed that the endpoint CSR is still reachable.
> The error is propagated to the PCIe endpoint driver. The endpoint driver
> does a re-initialization and we are back in business.

I'm assuming that's broadcast_error_message()'s job.

> That's not true. The GHES code is changing the severity here before posting
> to the AER driver in ghes_do_proc().
> 
> 	if (gdata->flags & CPER_SEC_RESET)
> 		aer_severity = AER_FATAL;

You're missing the point that we would walk into that if branch *only* for

                        if (sev == GHES_SEV_RECOVERABLE &&
                            sec_sev == GHES_SEV_RECOVERABLE

severities. So if you have an AER_FATAL error but the GHES severities are
not GHES_SEV_RECOVERABLE, nothing happens.
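
To make the gating concrete, here is a minimal userspace sketch of just the
two severity terms quoted above. The enum is a local stand-in (its values
are illustrative, not copied from kernel headers), the helper name
ghes_queues_aer_today() is hypothetical, and whatever else the real
condition checks is left out.

```c
#include <stdbool.h>

/* Local stand-ins for the GHES severities named in this thread. */
enum ghes_sev {
	GHES_SEV_NO,
	GHES_SEV_CORRECTED,
	GHES_SEV_RECOVERABLE,
	GHES_SEV_PANIC,
};

/*
 * Hypothetical helper: models only the two severity terms of the quoted
 * condition. True means the PCIe section would be handed to AER handling.
 */
static bool ghes_queues_aer_today(enum ghes_sev sev, enum ghes_sev sec_sev)
{
	return sev == GHES_SEV_RECOVERABLE &&
	       sec_sev == GHES_SEV_RECOVERABLE;
}
```

With this model, any pair other than (GHES_SEV_RECOVERABLE,
GHES_SEV_RECOVERABLE) yields false, which is the "nothing happens" case.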

> No, AER ISR is not set up if firmware first is enabled.

So then this is a major suckage. We do AER recovery on FF systems only
for GHES_SEV_RECOVERABLE severity.

> The behavior should match non firmware-first case ideally.
> 
> 1. Print all correctable errors.
> 2. Go to do_recovery for all uncorrectable errors including fatal and
> non-fatal. 
> 
> This is also what AER driver does in the absence of firmware first via
> handle_error_source().

Yes, that makes sense.

Which would mean that we'd call aer_recover_queue() regardless of GHES
severity, but we'd do recovery only if the severity is GHES_SEV_RECOVERABLE
or CPER_SEC_RESET is set. I.e., we can communicate all that by setting the
correct AER severity before calling aer_recover_queue(), and then call
do_recovery() based on that AER severity.

Hmmm?
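
Roughly, as a userspace sketch and not a patch: pick the AER severity on the
GHES side, then let the AER side decide on recovery from that severity
alone. The helper names ghes_to_aer_severity() and aer_wants_recovery() are
hypothetical, the enums and the flag value are local stand-ins, and mapping
every other GHES severity to "report only" is an assumption of this sketch,
not something settled in this thread.

```c
#include <stdbool.h>

/* Local stand-ins; values are illustrative, not copied from kernel headers. */
enum ghes_sev {
	GHES_SEV_NO,
	GHES_SEV_CORRECTED,
	GHES_SEV_RECOVERABLE,
	GHES_SEV_PANIC,
};

enum aer_sev {
	AER_NONFATAL,
	AER_FATAL,
	AER_CORRECTABLE,
};

#define CPER_SEC_RESET 0x0008u	/* stand-in flag bit for this sketch */

/* Hypothetical: choose the severity to pass along to aer_recover_queue(). */
static enum aer_sev ghes_to_aer_severity(enum ghes_sev sev, unsigned int flags)
{
	if (flags & CPER_SEC_RESET)
		return AER_FATAL;	/* reset requested: treat as fatal */
	if (sev == GHES_SEV_RECOVERABLE)
		return AER_NONFATAL;	/* recoverable: non-fatal recovery */
	return AER_CORRECTABLE;		/* assumption: everything else is report-only */
}

/* Hypothetical: the AER side decides on recovery from AER severity alone. */
static bool aer_wants_recovery(enum aer_sev sev)
{
	return sev == AER_NONFATAL || sev == AER_FATAL;
}
```

In this model every error gets queued, and only AER_NONFATAL and AER_FATAL
lead to a do_recovery() call.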

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)