From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.0 required=3.0
	tests=HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,
	MENTIONS_GIT_HOSTING,SIGNED_OFF_BY,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B919CC43381
	for ; Wed, 27 Mar 2019 09:58:42 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 8AA842070B
	for ; Wed, 27 Mar 2019 09:58:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1732972AbfC0J6l (ORCPT ); Wed, 27 Mar 2019 05:58:41 -0400
Received: from terminus.zytor.com ([198.137.202.136]:38083 "EHLO
	terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1731668AbfC0J6k (ORCPT ); Wed, 27 Mar 2019 05:58:40 -0400
Received: from terminus.zytor.com (localhost [127.0.0.1])
	by terminus.zytor.com (8.15.2/8.15.2) with ESMTPS id x2R9wIJi2804316
	(version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
	Wed, 27 Mar 2019 02:58:18 -0700
Received: (from tipbot@localhost)
	by terminus.zytor.com (8.15.2/8.15.2/Submit) id x2R9wHCY2804313;
	Wed, 27 Mar 2019 02:58:17 -0700
Date: Wed, 27 Mar 2019 02:58:17 -0700
X-Authentication-Warning: terminus.zytor.com: tipbot set sender to tipbot@zytor.com using -f
From: tip-bot for Tony Luck
Message-ID:
Cc: ashok.raj@intel.com, tony.luck@intel.com, linux-kernel@vger.kernel.org,
	tglx@linutronix.de, mingo@redhat.com, Yazen.Ghannam@amd.com,
	mingo@kernel.org, hpa@zytor.com, x86@kernel.org, bp@suse.de,
	linux-edac@vger.kernel.org
Reply-To: tglx@linutronix.de, mingo@redhat.com, Yazen.Ghannam@amd.com,
	linux-edac@vger.kernel.org, mingo@kernel.org, ashok.raj@intel.com,
	hpa@zytor.com, linux-kernel@vger.kernel.org, tony.luck@intel.com,
	bp@suse.de, x86@kernel.org
In-Reply-To: <20190312170938.GA23035@agluck-desk>
References: <20190312170938.GA23035@agluck-desk>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:ras/core] x86/mce: Fix machine_check_poll() tests for error types
Git-Commit-ID: f19501aa07f18268ab14f458b51c1c6b7f72a134
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  f19501aa07f18268ab14f458b51c1c6b7f72a134
Gitweb:     https://git.kernel.org/tip/f19501aa07f18268ab14f458b51c1c6b7f72a134
Author:     Tony Luck <tony.luck@intel.com>
AuthorDate: Tue, 12 Mar 2019 10:09:38 -0700
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 27 Mar 2019 10:53:49 +0100

x86/mce: Fix machine_check_poll() tests for error types

There has been a lurking "TBD" in the machine check poll routine ever
since it was first split out from the machine check handler. The
potential issue is that the poll routine may have just begun a read from
the STATUS register in a machine check bank when the hardware logs an
error in that bank and signals a machine check.

That race used to be pretty small back when machine checks were
broadcast, but the addition of local machine check means that the poll
code could continue running and clear the error from the bank before the
local machine check handler on another CPU gets around to reading it.

Fix the code to be sure to only process errors that need to be processed
in the poll code, leaving other logged errors alone for the machine
check handler to find and process.

 [ bp: Massage a bit and flip the "== 0" check to the usual !(..) test. ]

Fixes: b79109c3bbcf ("x86, mce: separate correct machine check poller and fatal exception handler")
Fixes: ed7290d0ee8f ("x86, mce: implement new status bits")
Reported-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20190312170938.GA23035@agluck-desk
---
 arch/x86/kernel/cpu/mce/core.c | 44 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 37 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index b7fb541a4873..e558ca77cfe8 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -712,19 +712,49 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 
 		barrier();
 		m.status = mce_rdmsrl(msr_ops.status(i));
+
+		/* If this entry is not valid, ignore it */
 		if (!(m.status & MCI_STATUS_VAL))
 			continue;
 
 		/*
-		 * Uncorrected or signalled events are handled by the exception
-		 * handler when it is enabled, so don't process those here.
-		 *
-		 * TBD do the same check for MCI_STATUS_EN here?
+		 * If we are logging everything (at CPU online) or this
+		 * is a corrected error, then we must log it.
 		 */
-		if (!(flags & MCP_UC) &&
-		    (m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC)))
-			continue;
+		if ((flags & MCP_UC) || !(m.status & MCI_STATUS_UC))
+			goto log_it;
+
+		/*
+		 * Newer Intel systems that support software error
+		 * recovery need to make additional checks. Other
+		 * CPUs should skip over uncorrected errors, but log
+		 * everything else.
+		 */
+		if (!mca_cfg.ser) {
+			if (m.status & MCI_STATUS_UC)
+				continue;
+			goto log_it;
+		}
+
+		/* Log "not enabled" (speculative) errors */
+		if (!(m.status & MCI_STATUS_EN))
+			goto log_it;
+
+		/*
+		 * Log UCNA (SDM: 15.6.3 "UCR Error Classification")
+		 * UC == 1 && PCC == 0 && S == 0
+		 */
+		if (!(m.status & MCI_STATUS_PCC) && !(m.status & MCI_STATUS_S))
+			goto log_it;
+
+		/*
+		 * Skip anything else. Presumption is that our read of this
+		 * bank is racing with a machine check. Leave the log alone
+		 * for do_machine_check() to deal with it.
+		 */
+		continue;
 
+log_it:
 		error_seen = true;
 
 		mce_read_aux(&m, i);
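
For readers who want to try the new decision order outside the kernel, the
checks added to machine_check_poll() can be sketched as a standalone
user-space predicate. This is only an illustration and not part of the
commit: the function name poll_should_log(), the "ser" parameter standing
in for mca_cfg.ser, and the local definition of MCP_UC are assumptions
made for the example. The MCI_STATUS_* bit positions follow the
IA32_MCi_STATUS layout as I understand it and should be checked against
the kernel headers before being relied on.

```c
/*
 * Illustrative user-space sketch only; not kernel code.
 * poll_should_log() is a hypothetical helper that mirrors the order of
 * the checks added by this patch. All constants are redefined locally
 * and are assumptions for the example.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed IA32_MCi_STATUS bit positions */
#define MCI_STATUS_VAL (1ULL << 63) /* register contents valid */
#define MCI_STATUS_UC  (1ULL << 61) /* uncorrected error */
#define MCI_STATUS_EN  (1ULL << 60) /* error reporting enabled */
#define MCI_STATUS_PCC (1ULL << 57) /* processor context corrupt */
#define MCI_STATUS_S   (1ULL << 56) /* signalled via machine check */

/* Assumed value of the "log uncorrected errors too" poll flag */
#define MCP_UC (1U << 1)

/* Return true if the poller should log this bank, false if it should skip it. */
static bool poll_should_log(uint64_t status, unsigned int flags, bool ser)
{
	/* An invalid entry is ignored. */
	if (!(status & MCI_STATUS_VAL))
		return false;

	/* Logging everything, or a corrected error: log it. */
	if ((flags & MCP_UC) || !(status & MCI_STATUS_UC))
		return true;

	/* Without software error recovery, skip uncorrected errors. */
	if (!ser) {
		if (status & MCI_STATUS_UC)
			return false;
		return true;
	}

	/* "Not enabled" (speculative) errors are logged. */
	if (!(status & MCI_STATUS_EN))
		return true;

	/* UCNA: UC == 1 && PCC == 0 && S == 0 is logged. */
	if (!(status & MCI_STATUS_PCC) && !(status & MCI_STATUS_S))
		return true;

	/* Anything else is left for the machine check handler. */
	return false;
}

/* A few usage checks for the sketch above. */
void poll_should_log_examples(void)
{
	const uint64_t uc = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN;

	assert(poll_should_log(MCI_STATUS_VAL, 0, true));    /* corrected */
	assert(poll_should_log(uc, 0, true));                /* UCNA */
	assert(!poll_should_log(uc | MCI_STATUS_S, 0, true)); /* signalled */
}
```

In this sketch a signalled uncorrected error is skipped on a system with
software error recovery unless the caller passes MCP_UC, which matches the
intent stated in the commit message: leave such errors in the bank for the
machine check handler to find.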