linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: (EXT) Re: [PATCH v2 1/1] edac: fsl_ddr_edac: fix expected data message
@ 2020-09-04  6:52 Gregor Herburger
  2020-09-04  9:17 ` Borislav Petkov
  0 siblings, 1 reply; 8+ messages in thread
From: Gregor Herburger @ 2020-09-04  6:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: york.sun, mchehab, tony.luck, james.morse, rrichter, linux-edac,
	linux-kernel

> >  drivers/edac/fsl_ddr_edac.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
> > index 6d8ea226010d..4b6989cf1947 100644
> > --- a/drivers/edac/fsl_ddr_edac.c
> > +++ b/drivers/edac/fsl_ddr_edac.c
> > @@ -343,9 +343,9 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
> > 
> >  fsl_mc_printk(mci, KERN_ERR,
> >  "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
> > - cap_high ^ (1 << (bad_data_bit - 32)),
> > - cap_low ^ (1 << bad_data_bit),
> > - syndrome ^ (1 << bad_ecc_bit));
> > + (bad_data_bit > 31) ? cap_high ^ (1 << (bad_data_bit - 32)) : cap_high,
> > + (bad_data_bit <= 31) ? cap_low ^ (1 << (bad_data_bit)) : cap_low,
>
> But if bad_data_bit is -1, this check above will hit and you'd still
> shift by -1, IINM.
You are right. It worked on my machine, but i guess that is again machine-dependent.

> How about you fix it properly, clean it up and make it more readable in
> the process (pasting the code directly instead of a diff because a diff
> is less readable):
>
>         if ((err_detect & DDR_EDE_SBE) && (bus_width == 64)) {
>                 sbe_ecc_decode(cap_high, cap_low, syndrome,
>                                 &bad_data_bit, &bad_ecc_bit);
>
>                 if (bad_data_bit != -1) {
>                         if (bad_data_bit > 31)
>                                 cap_high ^= 1 << (bad_data_bit - 32);
>                         else
>                                 cap_low  ^= 1 << bad_data_bit;
>
>                         fsl_mc_printk(mci, KERN_ERR, "Faulty Data bit: %d\n",
> bad_data_bit);
>                         fsl_mc_printk(mci, KERN_ERR, "Expected Data: %#8.8x_%08x\n",
>                                       cap_high, cap_low);
>                 }
>
>                 if (bad_ecc_bit != -1) {
>                         fsl_mc_printk(mci, KERN_ERR, "Faulty ECC bit: %d\n",
> bad_ecc_bit);
>                         fsl_mc_printk(mci, KERN_ERR, "Expected ECC: %#2.2x\n",
>                                       syndrome ^ (1 << bad_ecc_bit));
>                 }
>         }
>
> This way you print only when the respective faulty bits have been
> properly found and not print anything otherwise.

The cap_low, cap_high and syndrome are used in the printk following the if-Block.
This will make expected data / captured data look the same.

>
> Hmm?

I would prefer printing exptected data and captured data in the same format, making it
easier to compare them directly.

diff --git a/drivers/edac/fsl_ddr_edac.c b/drivers/edac/fsl_ddr_edac.c
index 6d8ea226010d..880cf3f4712b 100644
--- a/drivers/edac/fsl_ddr_edac.c
+++ b/drivers/edac/fsl_ddr_edac.c
@@ -288,6 +288,9 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
        u32 cap_low;
        int bad_data_bit;
        int bad_ecc_bit;
+       u32 exp_high;
+       u32 exp_low;
+       u32 exp_syndrome;

        err_detect = ddr_in32(pdata->mc_vbase + FSL_MC_ERR_DETECT);
        if (!err_detect)
@@ -334,18 +337,32 @@ static void fsl_mc_check(struct mem_ctl_info *mci)
                sbe_ecc_decode(cap_high, cap_low, syndrome,
                                &bad_data_bit, &bad_ecc_bit);

+               exp_high = cap_high;
+               exp_low = cap_low;
+               exp_syndrome = syndrome;
+
                if (bad_data_bit != -1)
+               {
                        fsl_mc_printk(mci, KERN_ERR,
                                "Faulty Data bit: %d\n", bad_data_bit);
+
+                       if (bad_data_bit < 32)
+                               exp_low = cap_low ^ (1 << bad_data_bit);
+                       else
+                               exp_high = cap_high ^ (1 << (bad_data_bit - 32));
+               }
+
                if (bad_ecc_bit != -1)
+               {
                        fsl_mc_printk(mci, KERN_ERR,
                                "Faulty ECC bit: %d\n", bad_ecc_bit);

+                       exp_syndrome = syndrome ^ (1 << bad_ecc_bit);
+               }
+
                fsl_mc_printk(mci, KERN_ERR,
                        "Expected Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
-                       cap_high ^ (1 << (bad_data_bit - 32)),
-                       cap_low ^ (1 << bad_data_bit),
-                       syndrome ^ (1 << bad_ecc_bit));
+                       exp_high, exp_low, exp_syndrome);
        }

          fsl_mc_printk(mci, KERN_ERR,
                          "Captured Data / ECC:\t%#8.8x_%08x / %#2.2x\n",
                          cap_high, cap_low, syndrome);

How about something like this?


^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] edac: fsl_ddr_edac: fix expected data message
@ 2020-08-17  9:53 Borislav Petkov
  2020-08-27  7:56 ` [PATCH v2 " Gregor Herburger
  0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2020-08-17  9:53 UTC (permalink / raw)
  To: Gregor Herburger
  Cc: york.sun, mchehab, tony.luck, james.morse, rrichter, linux-edac

On Fri, Jul 24, 2020 at 01:18:46PM +0200, Gregor Herburger wrote:
> In some cases a wrong 'Expected Data' is calculated and reported.

In some cases? Which cases?

You need to expand that sentence with more details as to what the
problem is because I'm not getting any smarter from it.

> When comparing Expected/Captured Data this looks like dual bit errors when
> only a single bit error occurred.
> 
> On my aarch64 machine it prints something similar to this:
> [  311.103794] EDAC FSL_DDR MC0: Faulty Data bit: 36
> [  311.108490] EDAC FSL_DDR MC0: Expected Data / ECC:   0xffffffef_ffffffff / 0x80000059
> [  311.116135] EDAC FSL_DDR MC0: Captured Data / ECC:   0xffffffff_ffffffef / 0x59

Is that output before or after your change?

0xffffffef is with bit 4 XORed and cap_high was -1 before, cap_low is -1
too. The expected data syndrome has bit 31 set?!

Yeah, I'm confused. Please explain the issue in greater detail, try
structuring it this way:

Problem is A.

It happens because of B.

Fix it by doing C.

(Potentially do D).

For more detailed info, see
Documentation/process/submitting-patches.rst, Section "2) Describe your
changes".

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-09-11 11:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04  6:52 (EXT) Re: [PATCH v2 1/1] edac: fsl_ddr_edac: fix expected data message Gregor Herburger
2020-09-04  9:17 ` Borislav Petkov
2020-09-04 13:32   ` Gregor Herburger
2020-09-08 19:24     ` Borislav Petkov
2020-09-10 15:06       ` (EXT) " Gregor Herburger
2020-09-11 11:06         ` Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2020-08-17  9:53 [PATCH " Borislav Petkov
2020-08-27  7:56 ` [PATCH v2 " Gregor Herburger
2020-09-03 10:58   ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).