From: WANG Chao <chao.wang@ucloud.cn>
To: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org
Subject: [PATCH 3/3] RAS/CEC: immediate soft-offline page when count_threshold == 1
Date: Thu, 18 Apr 2019 11:41:15 +0800
Message-ID: <20190418034115.75954-3-chao.wang@ucloud.cn>
In-Reply-To: <20190418034115.75954-1-chao.wang@ucloud.cn>

count_threshold == 1 isn't working as expected. CEC only does soft
offline the second time the same pfn is hit by a correctable error.

Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
---
 drivers/ras/cec.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index 702e4c02c713..ac879c45377c 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -272,7 +272,22 @@ static u64 __maybe_unused del_lru_elem(void)
 	return pfn;
 }
 
+static void cec_valid_soft_offline(u64 pfn)
+{
+	if (!pfn_valid(pfn)) {
+		pr_warn("CEC: Invalid pfn: 0x%llx\n", pfn);
+	} else {
+		/* We have reached max count for this page, soft-offline it. */
+		pr_err("Soft-offlining pfn: 0x%llx\n", pfn);
+		memory_failure_queue(pfn, MF_SOFT_OFFLINE, &cec_chain);
+		ce_arr.pfns_poisoned++;
+	}
+}
 
+/*
+ * Return a >0 value to denote that we've reached the offlining
+ * threshold.
+ */
 int cec_add_elem(u64 pfn)
 {
 	struct ce_array *ca = &ce_arr;
@@ -295,6 +310,11 @@ int cec_add_elem(u64 pfn)
 
 	ret = find_elem(ca, pfn, &to);
 	if (ret < 0) {
+		if (count_threshold == 1) {
+			cec_valid_soft_offline(pfn);
+			ret = 1;
+			goto unlock;
+		}
 		/*
 		 * Shift range [to-end] to make room for one more element.
 		 */
@@ -320,23 +340,9 @@ int cec_add_elem(u64 pfn)
 
 		ret = 0;
 	} else {
-		u64 pfn = ca->array[to] >> PAGE_SHIFT;
-
-		if (!pfn_valid(pfn)) {
-			pr_warn("CEC: Invalid pfn: 0x%llx\n", pfn);
-		} else {
-			/* We have reached max count for this page, soft-offline it. */
-			pr_err("Soft-offlining pfn: 0x%llx\n", pfn);
-			memory_failure_queue(pfn, MF_SOFT_OFFLINE);
-			ca->pfns_poisoned++;
-		}
-
+		cec_valid_soft_offline(pfn);
 		del_elem(ca, to);
 
-		/*
-		 * Return a >0 value to denote that we've reached the offlining
-		 * threshold.
-		 */
 		ret = 1;
 
 		goto unlock;
-- 
2.21.0
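For readers who want to see the behaviour described in the commit message, the following is a rough userspace model, not kernel code. It assumes the simplified logic visible in the hunks above: a pfn that is not yet tracked is inserted with a count of 1, and a tracked pfn is offlined once its stored count is no longer below the threshold. The names struct cec_model, model_add_elem() and hits_until_offline() are hypothetical and exist only for this illustration; decay, sorting and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ELEMS 8

/* Hypothetical stand-in for the CEC array; not the kernel's struct ce_array. */
struct cec_model {
	uint64_t pfn[MAX_ELEMS];
	unsigned int count[MAX_ELEMS];
	unsigned int n;
	unsigned int threshold;	/* plays the role of count_threshold */
	bool patched;		/* true: apply the count_threshold == 1 shortcut */
};

/* Linear search; returns the slot index or -1 if the pfn is not tracked. */
static int model_find(const struct cec_model *m, uint64_t pfn)
{
	for (unsigned int i = 0; i < m->n; i++)
		if (m->pfn[i] == pfn)
			return (int)i;
	return -1;
}

/* Returns 1 when the page would be soft-offlined, 0 otherwise. */
static int model_add_elem(struct cec_model *m, uint64_t pfn)
{
	int i = model_find(m, pfn);

	if (i < 0) {
		/* The shortcut added by the patch: offline on the first hit. */
		if (m->patched && m->threshold == 1)
			return 1;

		/* First sighting: start tracking with a count of 1. */
		assert(m->n < MAX_ELEMS);
		m->pfn[m->n] = pfn;
		m->count[m->n] = 1;
		m->n++;
		return 0;
	}

	if (m->count[i] < m->threshold) {
		m->count[i]++;
		return 0;
	}

	/* Threshold reached: drop the element and report the offline. */
	m->n--;
	m->pfn[i] = m->pfn[m->n];
	m->count[i] = m->count[m->n];
	return 1;
}

/* Counts how many errors on one pfn it takes until the model offlines it. */
unsigned int hits_until_offline(unsigned int threshold, bool patched)
{
	struct cec_model m = { .threshold = threshold, .patched = patched };
	unsigned int hits = 0;

	do {
		hits++;
	} while (model_add_elem(&m, 0x1234) != 1);

	return hits;
}
```

In this model, hits_until_offline(1, false) returns 2 and hits_until_offline(1, true) returns 1, which mirrors the problem statement: without the shortcut the first error only records the pfn, and the offline happens on the second one.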