* [PATCH v4] x86/mce: retrieve poison range from hardware
@ 2022-07-27 18:46 Jane Chu
2022-07-27 18:56 ` Dan Williams
0 siblings, 1 reply; 9+ messages in thread
From: Jane Chu @ 2022-07-27 18:46 UTC
To: tony.luck, bp, tglx, mingo, dave.hansen, x86, linux-edac,
dan.j.williams, linux-kernel, hch, nvdimm
Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
poison granularity") changed the nfit_handle_mce() callback to report
badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc). It has since
been discovered that the mce->misc LSB field describes a 0x1000-byte range,
so injecting 2 back-to-back poisons makes the driver log 8 badblocks,
because 0x1000 bytes is eight 512-byte blocks.
Dan Williams noticed that apei_mce_report_mem_error() hardcodes
the LSB field to PAGE_SHIFT instead of consulting the input
struct cper_sec_mem_err record. So change it to rely on the granularity
reported by hardware whenever it is available.
Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 717192915f28..26d63818b2de 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -29,15 +29,27 @@
 void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
 {
 	struct mce m;
+	int grain = PAGE_SHIFT;
 
 	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
 		return;
 
+	/*
+	 * Even if the ->validation_bits are set for address mask,
+	 * to be extra safe, check and reject an error radius '0',
+	 * and fallback to the default page size.
+	 */
+	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
+		grain = ~mem_err->physical_addr_mask + 1;
+		if (grain == 1)
+			grain = PAGE_SHIFT;
+	}
+
 	mce_setup(&m);
 	m.bank = -1;
 	/* Fake a memory read error with unknown channel */
 	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | MCI_STATUS_MISCV | 0x9f;
-	m.misc = (MCI_MISC_ADDR_PHYS << 6) | PAGE_SHIFT;
+	m.misc = (MCI_MISC_ADDR_PHYS << 6) | grain;
 
 	if (severity >= GHES_SEV_RECOVERABLE)
 		m.status |= MCI_STATUS_UC;
--
2.18.4
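
The badblocks arithmetic in the commit message can be checked with a
small userspace sketch. It is only an illustration, not kernel code:
sectors_per_report() and SECTOR_SHIFT are names made up here, and the
sketch assumes the reported range is naturally aligned to its own size.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* assumption: badblocks are 512-byte sectors */

/*
 * Number of 512-byte sectors covered by one poison report whose
 * granularity is (1 << lsb) bytes. A naturally aligned range smaller
 * than a sector still marks one whole sector bad.
 */
static uint64_t sectors_per_report(unsigned int lsb)
{
	uint64_t bytes = 1ULL << lsb;
	uint64_t sector = 1ULL << SECTOR_SHIFT;

	return bytes < sector ? 1 : bytes >> SECTOR_SHIFT;
}
```

With lsb = 12 (PAGE_SHIFT on x86) one report covers 0x1000 bytes, i.e.
8 sectors, whereas a 256-byte granularity (lsb = 8) would cover 1.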
* RE: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Dan Williams @ 2022-07-27 18:56 UTC
To: Jane Chu, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, dan.j.williams, linux-kernel, hch, nvdimm
Jane Chu wrote:
> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
> poison granularity") that changed nfit_handle_mce() callback to report
> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
> because 0x1000 bytes is 8 512-byte.
>
> Dan Williams noticed that apei_mce_report_mem_error() hardcode
> the LSB field to PAGE_SHIFT instead of consulting the input
> struct cper_sec_mem_err record. So change to rely on hardware whenever
> support is available.
>
> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
> index 717192915f28..26d63818b2de 100644
> --- a/arch/x86/kernel/cpu/mce/apei.c
> +++ b/arch/x86/kernel/cpu/mce/apei.c
> @@ -29,15 +29,27 @@
> void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
> {
> struct mce m;
> + int grain = PAGE_SHIFT;
>
> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> return;
>
> + /*
> + * Even if the ->validation_bits are set for address mask,
> + * to be extra safe, check and reject an error radius '0',
> + * and fallback to the default page size.
> + */
> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> + grain = ~mem_err->physical_addr_mask + 1;
> + if (grain == 1)
> + grain = PAGE_SHIFT;
Wait, if @grain is the number of bits to mask off the address, shouldn't
this be something like:
grain = min_not_zero(PAGE_SHIFT, hweight64(~mem_err->physical_addr_mask));
...?
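
For readers following along, the suggested expression can be tried in
userspace. This is a hedged sketch: hweight64() and min_not_zero() below
are local stand-ins written for this example (using the GCC/Clang
popcount builtin), PAGE_SHIFT is assumed to be 12, and grain_from_mask()
is a made-up helper name.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86 */

/* Userspace stand-in for the kernel's hweight64(): count set bits. */
static unsigned int hweight64(uint64_t w)
{
	return (unsigned int)__builtin_popcountll(w);
}

/* Function form of min_not_zero(): the smaller value, ignoring zeros. */
static unsigned int min_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* The expression suggested above, applied to a raw address mask. */
static unsigned int grain_from_mask(uint64_t mask)
{
	return min_not_zero(PAGE_SHIFT, hweight64(~mask));
}
```

For a mask whose cleared bits are contiguous low bits, hweight64(~mask)
is the number of low address bits masked off, so 0xffffffffffffffc0
gives 6 and 0xffffffffffffff00 gives 8; a mask with no cleared bits
falls back to PAGE_SHIFT.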
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Jane Chu @ 2022-07-27 19:24 UTC
To: Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, linux-kernel, hch, nvdimm
On 7/27/2022 11:56 AM, Dan Williams wrote:
> Jane Chu wrote:
>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>> poison granularity") that changed nfit_handle_mce() callback to report
>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>> because 0x1000 bytes is 8 512-byte.
>>
>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>> the LSB field to PAGE_SHIFT instead of consulting the input
>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>> support is available.
>>
>> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>
>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
>> index 717192915f28..26d63818b2de 100644
>> --- a/arch/x86/kernel/cpu/mce/apei.c
>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>> @@ -29,15 +29,27 @@
>> void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
>> {
>> struct mce m;
>> + int grain = PAGE_SHIFT;
>>
>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>> return;
>>
>> + /*
>> + * Even if the ->validation_bits are set for address mask,
>> + * to be extra safe, check and reject an error radius '0',
>> + * and fallback to the default page size.
>> + */
>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>> + grain = ~mem_err->physical_addr_mask + 1;
>> + if (grain == 1)
>> + grain = PAGE_SHIFT;
>
> Wait, if @grain is the number of bits to mask off the address, shouldn't
> this be something like:
>
> grain = min_not_zero(PAGE_SHIFT, hweight64(~mem_err->physical_addr_mask));
I see. I guess what you meant is
grain = min(PAGE_SHIFT, (1 + hweight64(~mem_err->physical_addr_mask)));
so that in the pmem poison case, 'grain' would be 8, not 7.
thanks,
-jane
>
> ...?
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Jane Chu @ 2022-07-27 19:30 UTC
To: Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, linux-kernel, hch, nvdimm
On 7/27/2022 12:24 PM, Jane Chu wrote:
> On 7/27/2022 11:56 AM, Dan Williams wrote:
>> Jane Chu wrote:
>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>> poison granularity") that changed nfit_handle_mce() callback to report
>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>> because 0x1000 bytes is 8 512-byte.
>>>
>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>> support is available.
>>>
>>> Link:
>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>
>>>
>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>> ---
>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
>>> b/arch/x86/kernel/cpu/mce/apei.c
>>> index 717192915f28..26d63818b2de 100644
>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>> @@ -29,15 +29,27 @@
>>> void apei_mce_report_mem_error(int severity, struct
>>> cper_sec_mem_err *mem_err)
>>> {
>>> struct mce m;
>>> + int grain = PAGE_SHIFT;
>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>> return;
>>> + /*
>>> + * Even if the ->validation_bits are set for address mask,
>>> + * to be extra safe, check and reject an error radius '0',
>>> + * and fallback to the default page size.
>>> + */
>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>> + grain = ~mem_err->physical_addr_mask + 1;
>>> + if (grain == 1)
>>> + grain = PAGE_SHIFT;
>>
>> Wait, if @grain is the number of bits to mask off the address, shouldn't
>> this be something like:
>>
>> grain = min_not_zero(PAGE_SHIFT,
>> hweight64(~mem_err->physical_addr_mask));
>
> I see. I guess what you meant is
> grain = min(PAGE_SHIFT, (1 + hweight64(~mem_err->physical_addr_mask)));
Sorry, take that back, it won't work either.
-jane
> so that in the pmem poison case, 'grain' would be 8, not 7.
>
> thanks,
> -jane
>
>>
>> ...?
>
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Jane Chu @ 2022-07-27 19:34 UTC
To: Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, linux-kernel, hch, nvdimm
On 7/27/2022 12:30 PM, Jane Chu wrote:
> On 7/27/2022 12:24 PM, Jane Chu wrote:
>> On 7/27/2022 11:56 AM, Dan Williams wrote:
>>> Jane Chu wrote:
>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>>> poison granularity") that changed nfit_handle_mce() callback to report
>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
>>>> injecting
>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>>> because 0x1000 bytes is 8 512-byte.
>>>>
>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>>> support is available.
>>>>
>>>> Link:
>>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>>
>>>>
>>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>> ---
>>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
>>>> b/arch/x86/kernel/cpu/mce/apei.c
>>>> index 717192915f28..26d63818b2de 100644
>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>>> @@ -29,15 +29,27 @@
>>>> void apei_mce_report_mem_error(int severity, struct
>>>> cper_sec_mem_err *mem_err)
>>>> {
>>>> struct mce m;
>>>> + int grain = PAGE_SHIFT;
>>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>>> return;
>>>> + /*
>>>> + * Even if the ->validation_bits are set for address mask,
>>>> + * to be extra safe, check and reject an error radius '0',
>>>> + * and fallback to the default page size.
>>>> + */
>>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>>> + grain = ~mem_err->physical_addr_mask + 1;
>>>> + if (grain == 1)
>>>> + grain = PAGE_SHIFT;
>>>
>>> Wait, if @grain is the number of bits to mask off the address, shouldn't
>>> this be something like:
>>>
>>> grain = min_not_zero(PAGE_SHIFT,
>>> hweight64(~mem_err->physical_addr_mask));
>>
>> I see. I guess what you meant is
>> grain = min(PAGE_SHIFT, (1 +
>> hweight64(~mem_err->physical_addr_mask)));
>
> Sorry, take that back, it won't work either.
This will work,
grain = min_not_zero(PAGE_SHIFT - 1,
hweight64(~mem_err->physical_addr_mask));
grain++;
but too sophisticated? I guess I prefer the simple "if" expression.
thanks,
-jane
>
> -jane
>
>> so that in the pmem poison case, 'grain' would be 8, not 7.
>>
>> thanks,
>> -jane
>>
>>>
>>> ...?
>>
>
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Dan Williams @ 2022-07-27 20:01 UTC
To: Jane Chu, Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen,
x86, linux-edac, linux-kernel, hch, nvdimm
Jane Chu wrote:
> On 7/27/2022 12:30 PM, Jane Chu wrote:
> > On 7/27/2022 12:24 PM, Jane Chu wrote:
> >> On 7/27/2022 11:56 AM, Dan Williams wrote:
> >>> Jane Chu wrote:
> >>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
> >>>> poison granularity") that changed nfit_handle_mce() callback to report
> >>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> >>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
> >>>> injecting
> >>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
> >>>> because 0x1000 bytes is 8 512-byte.
> >>>>
> >>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
> >>>> the LSB field to PAGE_SHIFT instead of consulting the input
> >>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
> >>>> support is available.
> >>>>
> >>>> Link:
> >>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
> >>>>
> >>>>
> >>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> >>>> ---
> >>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
> >>>> 1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
> >>>> b/arch/x86/kernel/cpu/mce/apei.c
> >>>> index 717192915f28..26d63818b2de 100644
> >>>> --- a/arch/x86/kernel/cpu/mce/apei.c
> >>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
> >>>> @@ -29,15 +29,27 @@
> >>>> void apei_mce_report_mem_error(int severity, struct
> >>>> cper_sec_mem_err *mem_err)
> >>>> {
> >>>> struct mce m;
> >>>> + int grain = PAGE_SHIFT;
> >>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> >>>> return;
> >>>> + /*
> >>>> + * Even if the ->validation_bits are set for address mask,
> >>>> + * to be extra safe, check and reject an error radius '0',
> >>>> + * and fallback to the default page size.
> >>>> + */
> >>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> >>>> + grain = ~mem_err->physical_addr_mask + 1;
> >>>> + if (grain == 1)
> >>>> + grain = PAGE_SHIFT;
> >>>
> >>> Wait, if @grain is the number of bits to mask off the address, shouldn't
> >>> this be something like:
> >>>
> >>> grain = min_not_zero(PAGE_SHIFT,
> >>> hweight64(~mem_err->physical_addr_mask));
> >>
> >> I see. I guess what you meant is
> >> grain = min(PAGE_SHIFT, (1 +
> >> hweight64(~mem_err->physical_addr_mask)));
> >
> > Sorry, take that back, it won't work either.
>
> This will work,
> grain = min_not_zero(PAGE_SHIFT - 1,
> hweight64(~mem_err->physical_addr_mask));
> grain++;
> but too sophisticated? I guess I prefer the simple "if" expression.
An "if" is fine, I was more pointing out that:
hweight64(~mem_err->physical_addr_mask) + 1
...and:
~mem_err->physical_addr_mask + 1;
...give different results.
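
A quick userspace comparison of the two expressions, as a sketch only.
The helper names are invented for this example and
__builtin_popcountll() stands in for hweight64().

```c
#include <stdint.h>

/* hweight64(~mask) + 1: a bit count, plus one. */
static uint64_t popcount_plus_one(uint64_t mask)
{
	return (uint64_t)__builtin_popcountll(~mask) + 1;
}

/* ~mask + 1: the two's-complement negation of the mask. */
static uint64_t negated_mask(uint64_t mask)
{
	return ~mask + 1;
}
```

For a mask of 0xffffffffffffff00 the first expression evaluates to 9
and the second to 0x100 (256), so one counts bits while the other
yields a size in bytes.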
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Jane Chu @ 2022-07-28 18:31 UTC
To: Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, linux-kernel, hch, nvdimm
On 7/27/2022 1:01 PM, Dan Williams wrote:
> Jane Chu wrote:
>> On 7/27/2022 12:30 PM, Jane Chu wrote:
>>> On 7/27/2022 12:24 PM, Jane Chu wrote:
>>>> On 7/27/2022 11:56 AM, Dan Williams wrote:
>>>>> Jane Chu wrote:
>>>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>>>>> poison granularity") that changed nfit_handle_mce() callback to report
>>>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
>>>>>> injecting
>>>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>>>>> because 0x1000 bytes is 8 512-byte.
>>>>>>
>>>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>>>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>>>>> support is available.
>>>>>>
>>>>>> Link:
>>>>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>>>>
>>>>>>
>>>>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>>>> ---
>>>>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
>>>>>> b/arch/x86/kernel/cpu/mce/apei.c
>>>>>> index 717192915f28..26d63818b2de 100644
>>>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>>>>> @@ -29,15 +29,27 @@
>>>>>> void apei_mce_report_mem_error(int severity, struct
>>>>>> cper_sec_mem_err *mem_err)
>>>>>> {
>>>>>> struct mce m;
>>>>>> + int grain = PAGE_SHIFT;
>>>>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>>>>> return;
>>>>>> + /*
>>>>>> + * Even if the ->validation_bits are set for address mask,
>>>>>> + * to be extra safe, check and reject an error radius '0',
>>>>>> + * and fallback to the default page size.
>>>>>> + */
>>>>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>>>>> + grain = ~mem_err->physical_addr_mask + 1;
>>>>>> + if (grain == 1)
>>>>>> + grain = PAGE_SHIFT;
>>>>>
>>>>> Wait, if @grain is the number of bits to mask off the address, shouldn't
>>>>> this be something like:
>>>>>
>>>>> grain = min_not_zero(PAGE_SHIFT,
>>>>> hweight64(~mem_err->physical_addr_mask));
>>>>
>>>> I see. I guess what you meant is
>>>> grain = min(PAGE_SHIFT, (1 +
>>>> hweight64(~mem_err->physical_addr_mask)));
>>>
>>> Sorry, take that back, it won't work either.
>>
>> This will work,
>> grain = min_not_zero(PAGE_SHIFT - 1,
>> hweight64(~mem_err->physical_addr_mask));
>> grain++;
>> but too sophisticated? I guess I prefer the simple "if" expression.
>
> An "if" is fine, I was more pointing out that:
>
> hweight64(~mem_err->physical_addr_mask) + 1
>
> ...and:
>
> ~mem_err->physical_addr_mask + 1;
>
> ...give different results.
They are different indeed. hweight64() returns the count of set bits, while
~mem_err->physical_addr_mask + 1 returns the negated mask value.
According to the definition of "Physical Address Mask" -
https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
Table N-31 Memory Error Record
Physical Address Mask (byte offset 24, byte length 8): Defines the valid
address bits in the Physical Address field. The mask specifies the
granularity of the physical address which is dependent on the
hw/implementation factors such as interleaving.
It appears that "Physical Address Mask" is defined more like PAGE_MASK
than like the bitmaps on which hweight64() is often used to count the set
bits as an indication of (e.g.) how many registers are in use.
And similar to PAGE_MASK, a valid "Physical Address Mask" should
consist of contiguous low 0 bits, not 1's and 0's mixed up.
So far, as far as I can see, the v4 patch still looks correct to me.
Please let me know if I'm missing anything.
thanks!
-jane
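
The "contiguous low 0 bits" property described above can be tested with
a standard bit trick. This is a userspace sketch under one assumption
that goes beyond the text: a PAGE_MASK-like mask also has every bit
above the low zero run set. mask_is_contiguous() is a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if mask is all 1s above a (possibly empty) run of low 0 bits,
 * like PAGE_MASK. An all-zero mask is rejected as meaningless.
 */
static bool mask_is_contiguous(uint64_t mask)
{
	uint64_t low = ~mask;	/* the bits the mask clears */

	/* low must look like 0...0111...1, so low + 1 shares no bit with it. */
	return mask != 0 && (low & (low + 1)) == 0;
}
```

A mask such as 0xffffffffffffffc0 passes, while one with 1's and 0's
mixed up, such as 0xfffffffffffff0f0, does not.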
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Dan Williams @ 2022-07-28 18:46 UTC
To: Jane Chu, Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen,
x86, linux-edac, linux-kernel, hch, nvdimm
Jane Chu wrote:
> On 7/27/2022 1:01 PM, Dan Williams wrote:
> > Jane Chu wrote:
> >> On 7/27/2022 12:30 PM, Jane Chu wrote:
> >>> On 7/27/2022 12:24 PM, Jane Chu wrote:
> >>>> On 7/27/2022 11:56 AM, Dan Williams wrote:
> >>>>> Jane Chu wrote:
> >>>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
> >>>>>> poison granularity") that changed nfit_handle_mce() callback to report
> >>>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> >>>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
> >>>>>> injecting
> >>>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
> >>>>>> because 0x1000 bytes is 8 512-byte.
> >>>>>>
> >>>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
> >>>>>> the LSB field to PAGE_SHIFT instead of consulting the input
> >>>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
> >>>>>> support is available.
> >>>>>>
> >>>>>> Link:
> >>>>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
> >>>>>>
> >>>>>>
> >>>>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >>>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> >>>>>> ---
> >>>>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
> >>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
> >>>>>> b/arch/x86/kernel/cpu/mce/apei.c
> >>>>>> index 717192915f28..26d63818b2de 100644
> >>>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
> >>>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
> >>>>>> @@ -29,15 +29,27 @@
> >>>>>> void apei_mce_report_mem_error(int severity, struct
> >>>>>> cper_sec_mem_err *mem_err)
> >>>>>> {
> >>>>>> struct mce m;
> >>>>>> + int grain = PAGE_SHIFT;
> >>>>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> >>>>>> return;
> >>>>>> + /*
> >>>>>> + * Even if the ->validation_bits are set for address mask,
> >>>>>> + * to be extra safe, check and reject an error radius '0',
> >>>>>> + * and fallback to the default page size.
> >>>>>> + */
> >>>>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
> >>>>>> + grain = ~mem_err->physical_addr_mask + 1;
> >>>>>> + if (grain == 1)
> >>>>>> + grain = PAGE_SHIFT;
> >>>>>
> >>>>> Wait, if @grain is the number of bits to mask off the address, shouldn't
> >>>>> this be something like:
> >>>>>
> >>>>> grain = min_not_zero(PAGE_SHIFT,
> >>>>> hweight64(~mem_err->physical_addr_mask));
> >>>>
> >>>> I see. I guess what you meant is
> >>>> grain = min(PAGE_SHIFT, (1 +
> >>>> hweight64(~mem_err->physical_addr_mask)));
> >>>
> >>> Sorry, take that back, it won't work either.
> >>
> >> This will work,
> >> grain = min_not_zero(PAGE_SHIFT - 1,
> >> hweight64(~mem_err->physical_addr_mask));
> >> grain++;
> >> but too sophisticated? I guess I prefer the simple "if" expression.
> >
> > An "if" is fine, I was more pointing out that:
> >
> > hweight64(~mem_err->physical_addr_mask) + 1
> >
> > ...and:
> >
> > ~mem_err->physical_addr_mask + 1;
> >
> > ...give different results.
>
> They are different indeed. hweight64 returns the count of set bit while
> ~mem_err->physical_addr_mask returns a negated value.
>
> According to the definition of "Physical Address Mask" -
>
> https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
>
> Table N-31 Memory Error Record
>
> Physical Address Mask 24 8 Defines the valid address bits in the
> Physical Address field. The mask specifies the granularity of the
> physical address which is dependent on the hw/ implementation factors
> such as interleaving.
>
> It appears that "Physical Address Mask" is defined more like PAGE_MASK
> rather than in bitops hweight64() often used to count the set bits as
> an indication of (e.g.) how many registers are in use.
>
> And similar to PAGE_MASK, a valid "Physical Address Mask" should
> consist of a contiguous low 0 bits, not 1's and 0's mixed up.
>
> So far, as far as I can see, the v4 patch still looks correct to me.
> Please let me know if I'm missing anything.
The v4 patch looks broken to me. If the address mask is
0xffffffffffffffc0 to indicate a cacheline error then:
~mem_err->physical_addr_mask + 1;
...results in a grain of 64 when it should be 6.
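
To see why 64 is a problem and not merely a different unit, it helps to
push the value through the misc encoding used in the patch. The macros
below are assumed to mirror the kernel's MCi_MISC helpers (a 6-bit LSB
field with a 3-bit address mode above it); encode_misc() is a
hypothetical helper for this sketch.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's MCi_MISC field helpers. */
#define MCI_MISC_ADDR_LSB(m)	((m) & 0x3f)
#define MCI_MISC_ADDR_MODE(m)	(((m) >> 6) & 7)
#define MCI_MISC_ADDR_PHYS	2

/* Same shape as the patch: m.misc = (MCI_MISC_ADDR_PHYS << 6) | grain. */
static uint64_t encode_misc(uint64_t grain)
{
	return ((uint64_t)MCI_MISC_ADDR_PHYS << 6) | grain;
}
```

Under these assumptions a grain of 6 decodes back to LSB 6 with the
physical address mode intact, while a grain of 64 does not fit in six
bits: the LSB field reads back as 0 and the extra bit lands in the
address mode field.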
* Re: [PATCH v4] x86/mce: retrieve poison range from hardware
From: Jane Chu @ 2022-07-28 20:29 UTC
To: Dan Williams, tony.luck, bp, tglx, mingo, dave.hansen, x86,
linux-edac, linux-kernel, hch, nvdimm
On 7/28/2022 11:46 AM, Dan Williams wrote:
> Jane Chu wrote:
>> On 7/27/2022 1:01 PM, Dan Williams wrote:
>>> Jane Chu wrote:
>>>> On 7/27/2022 12:30 PM, Jane Chu wrote:
>>>>> On 7/27/2022 12:24 PM, Jane Chu wrote:
>>>>>> On 7/27/2022 11:56 AM, Dan Williams wrote:
>>>>>>> Jane Chu wrote:
>>>>>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>>>>>>> poison granularity") that changed nfit_handle_mce() callback to report
>>>>>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>>>>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
>>>>>>>> injecting
>>>>>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>>>>>>> because 0x1000 bytes is 8 512-byte.
>>>>>>>>
>>>>>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>>>>>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>>>>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>>>>>>> support is available.
>>>>>>>>
>>>>>>>> Link:
>>>>>>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>>>>>>
>>>>>>>>
>>>>>>>> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>>>>>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>>>>>> ---
>>>>>>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>>>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> b/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> index 717192915f28..26d63818b2de 100644
>>>>>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> @@ -29,15 +29,27 @@
>>>>>>>> void apei_mce_report_mem_error(int severity, struct
>>>>>>>> cper_sec_mem_err *mem_err)
>>>>>>>> {
>>>>>>>> struct mce m;
>>>>>>>> + int grain = PAGE_SHIFT;
>>>>>>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>>>>>>> return;
>>>>>>>> + /*
>>>>>>>> + * Even if the ->validation_bits are set for address mask,
>>>>>>>> + * to be extra safe, check and reject an error radius '0',
>>>>>>>> + * and fallback to the default page size.
>>>>>>>> + */
>>>>>>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>>>>>>> + grain = ~mem_err->physical_addr_mask + 1;
>>>>>>>> + if (grain == 1)
>>>>>>>> + grain = PAGE_SHIFT;
>>>>>>>
>>>>>>> Wait, if @grain is the number of bits to mask off the address, shouldn't
>>>>>>> this be something like:
>>>>>>>
>>>>>>> grain = min_not_zero(PAGE_SHIFT,
>>>>>>> hweight64(~mem_err->physical_addr_mask));
>>>>>>
>>>>>> I see. I guess what you meant is
>>>>>> grain = min(PAGE_SHIFT, (1 +
>>>>>> hweight64(~mem_err->physical_addr_mask)));
>>>>>
>>>>> Sorry, take that back, it won't work either.
>>>>
>>>> This will work,
>>>> grain = min_not_zero(PAGE_SHIFT - 1,
>>>> hweight64(~mem_err->physical_addr_mask));
>>>> grain++;
>>>> but too sophisticated? I guess I prefer the simple "if" expression.
>>>
>>> An "if" is fine, I was more pointing out that:
>>>
>>> hweight64(~mem_err->physical_addr_mask) + 1
>>>
>>> ...and:
>>>
>>> ~mem_err->physical_addr_mask + 1;
>>>
>>> ...give different results.
>>
>> They are different indeed. hweight64 returns the count of set bit while
>> ~mem_err->physical_addr_mask returns a negated value.
>>
>> According to the definition of "Physical Address Mask" -
>>
>> https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
>>
>> Table N-31 Memory Error Record
>>
>> Physical Address Mask 24 8 Defines the valid address bits in the
>> Physical Address field. The mask specifies the granularity of the
>> physical address which is dependent on the hw/ implementation factors
>> such as interleaving.
>>
>> It appears that "Physical Address Mask" is defined more like PAGE_MASK
>> rather than in bitops hweight64() often used to count the set bits as
>> an indication of (e.g.) how many registers are in use.
>>
>> And similar to PAGE_MASK, a valid "Physical Address Mask" should
>> consist of a contiguous low 0 bits, not 1's and 0's mixed up.
>>
>> So far, as far as I can see, the v4 patch still looks correct to me.
>> Please let me know if I'm missing anything.
>
> The v4 patch looks broken to me. If the address mask is
> 0xffffffffffffffc0 to indicate a cacheline error then:
>
> ~mem_err->physical_addr_mask + 1;
>
> ...results in a grain of 64 when it should be 6.
Right, it's the exponent that's needed, so back to __ffs64().
Sorry for the detour. v5 is coming next.
thanks!
-jane
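
A userspace sketch of the "exponent" idea, for illustration only. It is
not the v5 patch, which is not part of this thread. __builtin_ctzll()
is used as a stand-in for __ffs64(), PAGE_SHIFT is assumed to be 12,
lsb_from_mask() is a made-up name, and falling back to PAGE_SHIFT for a
zero result is an assumption carried over from the v4 comment about
rejecting an error radius of '0'.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86 */

/* Index of the lowest set bit of the mask, i.e. log2 of the granularity. */
static unsigned int lsb_from_mask(uint64_t mask)
{
	unsigned int lsb;

	if (mask == 0)		/* count-trailing-zeros is undefined for 0 */
		return PAGE_SHIFT;

	lsb = (unsigned int)__builtin_ctzll(mask);

	/* Assumed fallback: treat a zero exponent as "use the page size". */
	return lsb ? lsb : PAGE_SHIFT;
}
```

With this, a cacheline mask of 0xffffffffffffffc0 yields 6 and a
256-byte mask of 0xffffffffffffff00 yields 8, both of which fit the
6-bit LSB field.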