From: "Luck, Tony" <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-edac <linux-edac@vger.kernel.org>,
Yazen Ghannam <Yazen.Ghannam@amd.com>, X86 ML <x86@kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 1/4] RAS: Add a Corrected Errors Collector
Date: Tue, 7 Jun 2016 11:11:09 -0700 [thread overview]
Message-ID: <20160607181109.GA23770@intel.com> (raw)
In-Reply-To: <1465318345-22043-2-git-send-email-bp@alien8.de>
On Tue, Jun 07, 2016 at 06:52:22PM +0200, Borislav Petkov wrote:
> +void mce_log(struct mce *m)
> {
> unsigned next, entry;
>
> + if (!in_atomic() && memory_error(m) && mce_usable_address(m))
> + if (!ce_add_elem(m->addr >> PAGE_SHIFT))
> + return;
> +
> /* Emit the trace record: */
> - trace_mce_record(mce);
> + trace_mce_record(m);
>
> - if (!mce_gen_pool_add(mce))
> + if (!mce_gen_pool_add(m))
> irq_work_queue(&mce_irq_work);
Is there a reason that we need to call the ce_add_elem() inline
here instead of having it just register on the mce_notifier chain?
This series just cleaned out all the /dev/mcelog special code from
here, and you are adding something back before the ink is dry on
that change.
I'm also strongly divided about whether this corrected error
handler should be allowed to preempt anything else even seeing
the error.
Argument for:
Lonely corrected errors are "No Big Deal"(TM). Just counting them
and moving on is a good thing.
Arguments against:
1) We may miss out on a one-time opportunity to get extra information
(from acpi_extlog.c).
2) I think this subverts our CMCI storm detection and mitigation code?
We could make the chain more caller friendly by adding a filter
argument so users could say "just tell me about memory errors"
(currently each of the EDAC drivers has inline code to do the same
as "memory_error(m) && mce_usable_address(m)")
-Tony
next prev parent reply other threads:[~2016-06-07 18:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-07 16:52 [R-F-C PATCH 0/4] RFC: x86/mce: Deprecate mcelog and other funsies Borislav Petkov
2016-06-07 16:52 ` [RFC PATCH 1/4] RAS: Add a Corrected Errors Collector Borislav Petkov
2016-06-07 18:11 ` Luck, Tony [this message]
2016-06-07 21:04 ` Borislav Petkov
2016-06-07 16:52 ` [RFC PATCH 2/4] x86/mce: Deprecate /dev/mcelog Borislav Petkov
2016-06-13 9:00 ` Thomas Gleixner
2016-06-07 16:52 ` [RFC PATCH 3/4] x86/mce: Merge mce_amd_inj into mce-inject Borislav Petkov
2016-06-07 16:52 ` [RFC PATCH 4/4] x86/mce-inject: Use debugfs_remove_recursive() Borislav Petkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160607181109.GA23770@intel.com \
--to=tony.luck@intel.com \
--cc=Yazen.Ghannam@amd.com \
--cc=bp@alien8.de \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).