From: Tyler Baicar <baicar.tyler@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Rafael Wysocki <rjw@rjwysocki.net>,
Tony Luck <tony.luck@intel.com>, Fan Wu <wufan@codeaurora.org>,
linux-mm@kvack.org, Marc Zyngier <marc.zyngier@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Xie XiuQi <xiexiuqi@huawei.com>,
Will Deacon <will.deacon@arm.com>,
Christoffer Dall <christoffer.dall@arm.com>,
Dongjiu Geng <gengdongjiu@huawei.com>,
Linux ACPI <linux-acpi@vger.kernel.org>,
James Morse <james.morse@arm.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
kvmarm@lists.cs.columbia.edu,
arm-mail-list <linux-arm-kernel@lists.infradead.org>,
Len Brown <lenb@kernel.org>
Subject: Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
Date: Fri, 11 Jan 2019 10:32:23 -0500 [thread overview]
Message-ID: <CABo9ajAk5XNBmNHRRfUb-dQzW7-UOs5826jPkrVz-8zrtMUYkg@mail.gmail.com> (raw)
In-Reply-To: <20190111120322.GD4729@zn.tnic>
On Fri, Jan 11, 2019 at 7:03 AM Borislav Petkov <bp@alien8.de> wrote:
> On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote:
> > On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
> > > >>
> > > >> + if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> > > >
> > > > Since ghes_ack_error() is always prepended with this check, you could
> > > > push it down into the function:
> > > >
> > > > ghes_ack_error(ghes)
> > > > ...
> > > >
> > > > if (!is_hest_type_generic_v2(ghes))
> > > > return 0;
> > > >
> > > > and simplify the two callsites :)
> > >
> > > Great idea! ...
> > >
> > > .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
> > > ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
> > >
> > > Most of the error sources discard the result, the worst thing I can find is
> > > ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
> > > really handle the IRQ. They're registered as SHARED, but I don't have an example
> > > of what goes wrong next.
> > >
> > > I think this will also stop the spurious handling code kicking in to shut it up
> > > if its broken and screaming. Unlikely, but not impossible.
> > >
> > > Fixed in a prior patch, with Boris' suggestion, ghes_proc()s tail ends up look
> > > like this:
> > > ----------------------%<----------------------
> > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > > index 0321d9420b1e..8d1f9930b159 100644
> > > --- a/drivers/acpi/apei/ghes.c
> > > +++ b/drivers/acpi/apei/ghes.c
> > > @@ -700,18 +708,11 @@ static int ghes_proc(struct ghes *ghes)
> > >
> > > out:
> > > ghes_clear_estatus(ghes, buf_paddr);
> > > + if (rc != -ENOENT)
> > > + rc_ack = ghes_ack_error(ghes);
> > >
> > > - if (rc == -ENOENT)
> > > - return rc;
> > > -
> > > - /*
> > > - * GHESv2 type HEST entries introduce support for error acknowledgment,
> > > - * so only acknowledge the error if this support is present.
> > > - */
> > > - if (is_hest_type_generic_v2(ghes))
> > > - return ghes_ack_error(ghes->generic_v2);
> > > -
> > > - return rc;
> > > + /* If rc and rc_ack failed, return the first one */
> > > + return rc ? rc : rc_ack;
> > > }
> > > ----------------------%<----------------------
> > >
> >
> > Looks good to me, I guess there's no harm in acking invalid error status blocks.
>
> Err, why?
If ghes_read_estatus() fails, then either there was no error populated or the
error status block was invalid. If the error status block is invalid, then the
kernel doesn't know what happened in hardware.
I originally thought this was changing what's acked, but it's just changing the
return value of ghes_proc() when ghes_read_estatus() returns -EIO.
> I don't know what the firmware glue does on ARM but if I'd have to
> remain logical - which is hard to do with firmware - the proper thing to
> do would be this:
>
> rc = ghes_read_estatus(ghes, &buf_paddr);
> if (rc) {
> ghes_reset_hardware();
The kernel would have no way of knowing what to do here.
> }
>
> /* clear estatus and bla bla */
>
> /* Now, I'm in the success case: */
> ghes_ack_error();
>
>
> This way, you have the error path clear of something unexpected happened
> when reading the hardware, obvious and separated. ghes_reset_hardware()
> clears the registers and does the necessary steps to put the hardware in
> good state again so that it can report the next error.
>
> And the success path simply acks the error and does possibly the same
> thing. The naming of the functions is important though, to denote what
> gets called when.
>
> This way you handle all the cases just fine. No looking at the error
> type and blabla.
>
> Right?
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2019-01-11 15:32 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
2018-12-03 18:05 ` [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing James Morse
2018-12-03 18:05 ` [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
2018-12-04 11:36 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory James Morse
2018-12-04 13:01 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool James Morse
2018-12-11 16:48 ` Borislav Petkov
2018-12-14 13:56 ` James Morse
2018-12-19 14:42 ` Borislav Petkov
2019-01-10 18:20 ` James Morse
2018-12-03 18:05 ` [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size James Morse
2018-12-11 16:54 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
2018-12-11 17:04 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check James Morse
2018-12-11 17:18 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 08/25] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
2018-12-03 18:05 ` [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code James Morse
2018-12-11 17:44 ` Borislav Petkov
2019-01-10 18:21 ` James Morse
2019-01-11 11:46 ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records James Morse
2018-12-11 18:36 ` Borislav Petkov
2019-01-10 18:22 ` James Morse
2019-01-10 21:01 ` Tyler Baicar
2019-01-11 12:03 ` Borislav Petkov
2019-01-11 15:32 ` Tyler Baicar [this message]
2019-01-11 17:45 ` Borislav Petkov
2019-01-11 18:25 ` James Morse
2019-01-11 19:58 ` Borislav Petkov
2019-01-23 18:36 ` James Morse
2019-01-29 11:49 ` Borislav Petkov
2019-01-29 18:48 ` James Morse
2019-01-31 13:29 ` Borislav Petkov
2019-01-11 18:09 ` James Morse
2019-01-11 20:01 ` Borislav Petkov
2019-01-11 20:53 ` Tyler Baicar
2019-01-29 18:48 ` James Morse
2018-12-03 18:05 ` [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI James Morse
2019-01-21 13:01 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 12/25] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
2018-12-03 18:06 ` [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
2018-12-06 16:17 ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
2018-12-06 16:17 ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 15/25] ACPI / APEI: Move locking to the notification helper James Morse
2018-12-03 18:06 ` [PATCH v7 16/25] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
2018-12-03 18:06 ` [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy James Morse
2019-01-21 13:35 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length James Morse
2019-01-21 13:53 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
2019-01-21 17:19 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications James Morse
2019-01-21 17:27 ` Borislav Petkov
2019-01-23 18:33 ` James Morse
2019-01-31 13:38 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 21/25] mm/memory-failure: Add memory_failure_queue_kick() James Morse
2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
2018-12-05 2:02 ` Xie XiuQi
2018-12-10 19:15 ` James Morse
2019-01-22 10:51 ` Borislav Petkov
2019-01-23 18:37 ` James Morse
2019-01-21 17:58 ` Borislav Petkov
2019-01-23 18:40 ` James Morse
2019-01-31 14:04 ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
2018-12-06 16:18 ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
2018-12-06 16:18 ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 25/25] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CABo9ajAk5XNBmNHRRfUb-dQzW7-UOs5826jPkrVz-8zrtMUYkg@mail.gmail.com \
--to=baicar.tyler@gmail.com \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=christoffer.dall@arm.com \
--cc=gengdongjiu@huawei.com \
--cc=james.morse@arm.com \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=marc.zyngier@arm.com \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=rjw@rjwysocki.net \
--cc=tony.luck@intel.com \
--cc=will.deacon@arm.com \
--cc=wufan@codeaurora.org \
--cc=xiexiuqi@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).