* [PATCH] nfit, mce: only handle uncorrectable machine checks
@ 2018-10-24 20:01 ` Vishal Verma
0 siblings, 0 replies; 10+ messages in thread
From: Vishal Verma @ 2018-10-24 20:01 UTC (permalink / raw)
To: linux-nvdimm; +Cc: Tony Luck, stable, Borislav Petkov, linux-edac
We only want a machine check error to be added to libnvdimm's 'badblock'
if it was an uncorrectable error. Currently we insert both corrected and
uncorrectable errors. Add a check in the nfit mce handler to filter out
corrected mce events.
Reported-by: Omar Avelar <omar.avelar@intel.com>
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Cc: stable@vger.kernel.org
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
arch/x86/include/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 3 ++-
drivers/acpi/nfit/mce.c | 4 ++--
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 3a17107594c8..3111b3cee2ee 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s
int mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
+bool mce_is_correctable(struct mce *m);
DECLARE_PER_CPU(unsigned, mce_exception_count);
DECLARE_PER_CPU(unsigned, mce_poll_count);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 953b3ce92dcc..27015948bc41 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -534,7 +534,7 @@ bool mce_is_memory_error(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_is_memory_error);
-static bool mce_is_correctable(struct mce *m)
+bool mce_is_correctable(struct mce *m)
{
if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
return false;
@@ -544,6 +544,7 @@ static bool mce_is_correctable(struct mce *m)
return true;
}
+EXPORT_SYMBOL_GPL(mce_is_correctable);
static bool cec_add_mce(struct mce *m)
{
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index e9626bf6ca29..7a51707f87e9 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -25,8 +25,8 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
struct acpi_nfit_desc *acpi_desc;
struct nfit_spa *nfit_spa;
- /* We only care about memory errors */
- if (!mce_is_memory_error(mce))
+ /* We only care about uncorrectable memory errors */
+ if (!mce_is_memory_error(mce) || mce_is_correctable(mce))
return NOTIFY_DONE;
/*
--
2.17.1
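The condition added to nfit_handle_mce() above can be read as a two-input filter: the handler only continues when the event is a memory error and is not correctable. The following is a minimal userspace sketch of that decision, not the kernel code. The struct sketch_mce type, its two boolean fields, the SKETCH_NOTIFY_* values and the function names are all hypothetical stand-ins introduced for illustration; the real mce_is_memory_error() and mce_is_correctable() decode hardware status bits, which are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's notifier return values. */
enum { SKETCH_NOTIFY_DONE = 0, SKETCH_NOTIFY_OK = 1 };

/* Hypothetical, simplified event; the kernel's struct mce holds raw status bits. */
struct sketch_mce {
	bool memory_error;	/* stand-in for the result of mce_is_memory_error(m) */
	bool correctable;	/* stand-in for the result of mce_is_correctable(m) */
};

/* Mirrors the shape of the patched check: skip unless uncorrectable memory error. */
static int sketch_handle_mce(const struct sketch_mce *m)
{
	if (!m->memory_error || m->correctable)
		return SKETCH_NOTIFY_DONE;

	/* The real handler would continue on to record the affected range here. */
	return SKETCH_NOTIFY_OK;
}

/* Walk all four input combinations; only one of them passes the filter. */
static void sketch_self_check(void)
{
	struct sketch_mce cases[] = {
		{ .memory_error = false, .correctable = false },
		{ .memory_error = false, .correctable = true  },
		{ .memory_error = true,  .correctable = true  },
		{ .memory_error = true,  .correctable = false },
	};

	assert(sketch_handle_mce(&cases[0]) == SKETCH_NOTIFY_DONE);
	assert(sketch_handle_mce(&cases[1]) == SKETCH_NOTIFY_DONE);
	assert(sketch_handle_mce(&cases[2]) == SKETCH_NOTIFY_DONE);
	assert(sketch_handle_mce(&cases[3]) == SKETCH_NOTIFY_OK);
}
```

Calling sketch_self_check() from a test main() exercises the table. Before the patch, the third case (a corrected memory error) would also have passed the filter, which is the behaviour the changelog describes as unwanted.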
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] nfit, mce: only handle uncorrectable machine checks
@ 2018-10-25 0:25 ` Dan Williams
0 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-25 0:25 UTC (permalink / raw)
To: Vishal L Verma
Cc: linux-edac, Borislav Petkov, "Luck, Tony" <tony.luck@intel.com>, stable <stable@vger.kernel.org>, linux-nvdimm
On Wed, Oct 24, 2018 at 1:03 PM Vishal Verma <vishal.l.verma@intel.com> wrote:
>
> We only want a machine check error to be added to libnvdimm's 'badblock'
> if it was an uncorrectable error. Currently we insert both corrected and
> uncorrectable errors. Add a check in the nfit mce handler to filter out
> corrected mce events.
>
> Reported-by: Omar Avelar <omar.avelar@intel.com>
> Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
> Cc: stable@vger.kernel.org
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Looks good, will let this sit in -next until the back half of the merge window.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] nfit, mce: only handle uncorrectable machine checks
@ 2018-10-25 10:04 ` Borislav Petkov
0 siblings, 0 replies; 10+ messages in thread
From: Borislav Petkov @ 2018-10-25 10:04 UTC (permalink / raw)
To: Vishal Verma; +Cc: linux-edac, Tony Luck, linux-nvdimm
Drop stable@ from CC.
On Wed, Oct 24, 2018 at 02:01:48PM -0600, Vishal Verma wrote:
> We only want a machine check error to be added to libnvdimm's 'badblock'
> if it was an uncorrectable error.
What is libnvdimm's 'badblock' ?
Also, pls write in the commit message *why* you want only UE errors.
Also, write your commit message in impartial tone, without the "we".
Thx.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] nfit, mce: only handle uncorrectable machine checks
@ 2018-10-25 17:55 ` Vishal Verma
0 siblings, 0 replies; 10+ messages in thread
From: Verma, Vishal L @ 2018-10-25 17:55 UTC (permalink / raw)
To: bp; +Cc: Luck, Tony, linux-edac, linux-nvdimm
On Thu, 2018-10-25 at 12:04 +0200, Borislav Petkov wrote:
> Drop stable@ from CC.
>
> On Wed, Oct 24, 2018 at 02:01:48PM -0600, Vishal Verma wrote:
> > We only want a machine check error to be added to libnvdimm's 'badblock'
> > if it was an uncorrectable error.
>
> What is libnvdimm's 'badblock' ?
>
> Also, pls write in the commit message *why* you want only UE errors.
>
> Also, write your commit message in impartial tone, without the "we".
Hi Borislav,
Thanks for the feedback, I'll send a new revision with better
explanations in the changelog.
Thanks,
-Vishal
^ permalink raw reply [flat|nested] 10+ messages in thread