linux-kernel.vger.kernel.org archive mirror
From: Borislav Petkov <bp@alien8.de>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] RAS/CEC: Add debugfs switch to disable at run time
Date: Sat, 20 Apr 2019 20:47:52 +0200
Message-ID: <20190420184751.GE29704@zn.tnic>
In-Reply-To: <CAM_iQpUaBOS92u7K+hzzRX5hw0Kbrp-0o0DSfLwSGBhenntVjA@mail.gmail.com>

On Sat, Apr 20, 2019 at 11:18:46AM -0700, Cong Wang wrote:
> You didn't answer my question here, because I asked you whether
> the following change (PoC only) makes sense:

I answered it - the answer is to disable CONFIG_RAS_CEC. But let me give
a more detailed answer, maybe that'll help.

The PoC doesn't make sense.

Why?

Because if you don't return early from the notifier when the CEC has
consumed the error, you don't need the CEC at all. Ergo, you can just as
well disable it.

Because, let me paste from a couple of mails ago what the CEC is:

"CEC is something *completely* different and its purpose is to run in
the kernel and prevent users and admins from upsetting unnecessarily
with every sporadic correctable error and just because an alpha particle
flew through their DIMMs, they all start running in headless chicken
mode, trying to RMA perfectly good hardware."

IOW, when you have the CEC enabled, you don't need to log memory errors
with a userspace agent. The CEC collects them and discards them if they
don't repeat.

If they do repeat, then it offlines the page.

Without user intervention and interference.

Now, if you still want to know how many errors and where they happened
and when they happened and yadda yadda, you *disable* the CEC.
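
The collect/discard/offline behaviour described above can be sketched as a
small userspace toy model. This is only an illustration and not the kernel's
implementation: the class name, the threshold value and the one-step decay
are all assumptions made up for the example.

```python
from collections import defaultdict


class ToyCollector:
    """Toy model of a correctable-error collector (hypothetical, not kernel code)."""

    def __init__(self, threshold=3):
        # threshold is an invented value chosen only for illustration
        self.threshold = threshold
        self.counts = defaultdict(int)  # page frame number -> error count
        self.offlined = set()           # pages taken out of use

    def report(self, pfn):
        """Record one correctable error; return True if the page got offlined."""
        if pfn in self.offlined:
            return False
        self.counts[pfn] += 1
        if self.counts[pfn] >= self.threshold:
            # The error keeps repeating on this page: stop using the page.
            self.offlined.add(pfn)
            del self.counts[pfn]
            return True
        return False

    def decay(self):
        """Periodic ageing: forget one error per page, drop pages at zero."""
        for pfn in list(self.counts):
            self.counts[pfn] -= 1
            if self.counts[pfn] <= 0:
                del self.counts[pfn]


cec = ToyCollector(threshold=3)
cec.report(0x1000)      # a sporadic error on one page
cec.decay()             # it did not repeat, so it is forgotten
for _ in range(3):
    cec.report(0x2000)  # a repeating error ends with the page offlined
```

Nothing in this model reports individual errors to anyone, which is the
point: a sporadic error simply ages out, and only a repeating one leads to
an action.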

I hope this makes more sense now.

> I knew disabling it could cure the problem from the beginning, please
> save your own time by not repeating things we both already knew. :)
> 
> Once again, I still don't think it is the right answer, which is also why I
> keep finding different solutions.

This is where you come in and say "it is not the right answer
because..." and give your arguments why. I gave mine a couple of times
already. I never said this functionality is cast in stone the way it is
but there has to be a *good* *reason* why it needs to be changed. I.e.,
basic kernel development. People come with ideas and they *justify* those
ideas with arguments why they're better.

> I know you disagree, but you never explain why you disagree,

You're kidding, right?

https://lkml.kernel.org/r/20190419002645.GA559@zn.tnic

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Thread overview: 25+ messages
2019-04-18 22:02 [PATCH] RAS/CEC: Add debugfs switch to disable at run time Tony Luck
2019-04-18 22:51 ` Cong Wang
2019-04-18 23:29   ` Borislav Petkov
2019-04-18 23:58     ` Cong Wang
2019-04-19  0:26       ` Borislav Petkov
2019-04-20  5:43         ` Cong Wang
2019-04-20  9:13           ` Borislav Petkov
2019-04-20 18:18             ` Cong Wang
2019-04-20 18:47               ` Borislav Petkov [this message]
2019-04-20 19:08                 ` Cong Wang
2019-04-22 16:29                 ` Luck, Tony
2019-04-22 16:31                   ` Borislav Petkov
2019-04-22 16:43                     ` Luck, Tony
2019-04-22 17:05                       ` Borislav Petkov
2019-04-22 17:23                         ` Luck, Tony
2019-04-19  0:07     ` Luck, Tony
2019-04-19  0:29       ` Borislav Petkov
2019-04-19 15:04         ` Luck, Tony
2019-04-20  9:41           ` Borislav Petkov
2019-04-22 15:59             ` Luck, Tony
2019-04-22 17:15               ` Borislav Petkov
2019-04-22 17:44                 ` Luck, Tony
2019-04-22 18:08                   ` Borislav Petkov
2019-04-20  5:50       ` Cong Wang
2019-04-20 19:50 ` Borislav Petkov
