From: "Alex G." <mr.nuke.me@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org,
    rjw@rjwysocki.net, lenb@kernel.org, tony.luck@intel.com,
    tbaicar@codeaurora.org, will.deacon@arm.com, james.morse@arm.com,
    shiju.jose@huawei.com, zjzhang@codeaurora.org,
    gengdongjiu@huawei.com, linux-kernel@vger.kernel.org,
    alex_gagniuc@dellteam.com, austin_bolen@dell.com,
    shyam_iyer@dell.com, devel@acpica.org, mchehab@kernel.org,
    robert.moore@intel.com, erik.schmauss@intel.com
Subject: Re: [RFC PATCH v2 4/4] acpi: apei: Warn when GHES marks correctable errors as "fatal"
Date: Thu, 19 Apr 2018 10:11:03 -0500
Message-ID: <807002b1-ccb9-22c8-6563-ade7e44912ff@gmail.com>
In-Reply-To: <20180418175452.GK4795@pd.tnic>

On 04/18/2018 12:54 PM, Borislav Petkov wrote:
> On Mon, Apr 16, 2018 at 04:59:03PM -0500, Alexandru Gagniuc wrote:

(snip)

>> +
>> + corrected_sev = max(corrected_sev, sec_sev);
>> + }
>> +
>> + if ((sev >= GHES_SEV_PANIC) && (corrected_sev < sev)) {
>> + pr_warn("FIRMWARE BUG: Firmware sent fatal error that we were able to correct");
>> + pr_warn("BROKEN FIRMWARE: Complain to your hardware vendor");
>
> No, I don't want any of that crap issuing stuff in dmesg and then people
> opening bugs and running around and trying to replace hardware.
>
> We either can handle the error and log a normal record somewhere or we
> cannot and explode.

There is value in this. From my observations, fw claims it will do
everything through FFS, yet fails to fully handle the situation. It's
rooted in FW's assumptions about OS behavior. Because the (old) versions
of windows, esxi, and rhel used during development crash, fw assumes
that _all_ OSes crash. The result in a surprising majority of cases is
that FFS doesn't properly handle recurring errors, and fw is, in fact,
broken.

> The complaining about the FW doesn't bring shit.

You are correct. It doesn't bring defecation. It brings a red flag that
helps people get closer to the root cause of problems.

That being said, I can just drop this patch.

Alex