From: Borislav Petkov <bp@alien8.de>
To: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: "Kani, Toshimitsu" <toshi.kani@hpe.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"srinivas.pandruvada@linux.intel.com" <srinivas.pandruvada@linux.intel.com>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Subject: Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac
Date: Fri, 21 Jul 2017 19:23:45 +0200
Message-ID: <20170721172344.GA11316@nazgul.tnic>
In-Reply-To: <20170721140131.40079805@vento.lan>

On Fri, Jul 21, 2017 at 02:01:31PM -0300, Mauro Carvalho Chehab wrote:
> I see the value of having a threshold in BIOS, provided that it is
> well documented, and whose value can be adjusted, if needed.
>
> One of the things I wanted to implement in ras-daemon was an
> algorithm that would be doing such threshold in software.

We have that now in the kernel:

drivers/ras/cec.c

We did it exactly for that purpose - not upsetting users unnecessarily.

> The thing with a BIOS threshold is that the user has no way to
> audit the algorithm. So, when BIOS starts reporting such errors,
> it may be already too late: the systems may be on the verge of
> losing data (or some data was already lost).

Not only that: thresholds depend on the DIMM types, which means BIOS
must know what DIMM types are in there, which I doubt. So exposing that
to configuration instead of "deciding" for people would be better.

> That's critical on cluster systems with thousands of machines:
> while the impact of disabling a cluster node to do some maintenance
> is marginal, the impact of an uncorrected error on a single
> machine may compromise weeks of expensive processing.
>
> That's why some users prefer to monitor every single corrected
> error, and compare with the probability distribution they
> know that the risk of uncorrected errors is acceptable.

Yap, you need to have stuff like that configurable - BIOS can't predict
all possible use cases.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
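To illustrate the kind of user-configurable corrected-error threshold
discussed above, here is a minimal userspace sketch. It is NOT how
drivers/ras/cec.c or ras-daemon work; the class name, the per-DIMM
labels, and the sliding-window policy are all assumptions made up for
illustration. It only shows the point that the threshold and the window
are chosen by the user rather than fixed by firmware.

```python
from collections import defaultdict


class CorrectedErrorMonitor:
    """Hypothetical helper: counts corrected errors per DIMM and flags a
    DIMM once a user-chosen threshold is reached within a time window."""

    def __init__(self, threshold, window_seconds):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold            # user-configurable, not fixed by BIOS
        self.window_seconds = window_seconds  # how long an error keeps counting
        self._events = defaultdict(list)      # DIMM label -> recent timestamps

    def record(self, dimm, timestamp):
        """Record one corrected error seen at `timestamp` (seconds).

        Timestamps are assumed to be non-decreasing per DIMM.
        Returns True if the DIMM is at or above the threshold.
        """
        cutoff = timestamp - self.window_seconds
        # Drop errors that have aged out of the window, then add the new one.
        recent = [t for t in self._events[dimm] if t > cutoff]
        recent.append(timestamp)
        self._events[dimm] = recent
        return len(recent) >= self.threshold


if __name__ == "__main__":
    monitor = CorrectedErrorMonitor(threshold=3, window_seconds=3600)
    for t in (0, 10, 20):
        flagged = monitor.record("DIMM_A1", t)
    print("DIMM_A1 flagged:", flagged)
```

A site that wants to see every single corrected error would simply set
threshold=1; a site with DIMMs known to be noisy could raise it.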