Re: [PATCH] EDAC, ghes: Fix grain calculation

From: Robert Richter <rrichter@marvell.com>
To: Borislav Petkov <bp@alien8.de>
Cc: James Morse <james.morse@arm.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Toshi Kani <toshi.kani@hpe.com>
Subject: Re: [PATCH] EDAC, ghes: Fix grain calculation
Date: Fri, 14 Jun 2019 14:40:01 +0000	[thread overview]
Message-ID: <20190614143932.cacobehkuw62frzx@rric.localdomain> (raw)
In-Reply-To: <20190614100918.GA2586@zn.tnic>

On 14.06.19 12:09:18, Borislav Petkov wrote:
> On Fri, Jun 14, 2019 at 07:21:47AM +0000, Robert Richter wrote:
> > On 14.06.19 00:41:30, Borislav Petkov wrote:
> > > On Thu, Jun 13, 2019 at 09:07:38PM +0000, Robert Richter wrote:
> > > >  grain_bits = fls_long(e->grain) + 1;
> > > 
> > > For 4K grain this makes grain_bits == 12.
> > 
> > fls(0) == 0
> > fls(1) == 1
> > fls(2) == 2
> > fls(4) == 3
> > ...
> > fls(0x1000) == 13
> 
> I believe the intent of the "+ 1" above is that e->grain was a mask of
> the type 0xfff so you have to + 1 since we count from bit 0.

No, as you grepped below, the granularity is the number of bytes, and
mostly power of 2. Also note that fls(0xfff) is 12 and fls(0x1000) is
13 ...

> 
> > Correct for edac_mc is:
> > 
> >  grain_bits = fls_long(e->grain ? e->grain - 1 : 0);

... thus the -1 here.

> 
> First of all, you don't wanna do fls(0).

You need to deal with fls(0) as you don't know what drivers fill in
for the grain. fls(0) calculates fine here to 1 grain bit which is a
one byte granularity.

> 
> Then, we don't want to use PAGE_MASK for grain computation because as
> James showed, it causes problems.
> 
> > James' calculation for ghes is correct in this particular case
> > (assuming a bit mask with (power of 2 - 1)).
> 
> And this is the question that needs to be clarified first: from what do
> we compute the grain.
> 
> ghes_edac uses ~physical_addr_mask which can basically be:
> 
> 	grain_bits = hweight_long(physical_addr_mask);
> 
> which, in the 0xfff case gives you the correct 12 bits. It is assuming
> it is a contiguous mask though. It better be...
> 
> Now, if you look at how the grain gets read out in edac_mc_handle_error(), it
> comes from dimm->grain. And that is defined as:
> 
>         u32 grain;              /* granularity of reported error in bytes */
> 
> in bytes!
> 
> Now, if you grep for "grain" in drivers/edac/ you see most (if not all)
> initialized as bytes:
> 
> i82975x_edac.c:415:                     dimm->grain = 1 << 7;   /* 128Byte cache-line resolution */
> pnd2_edac.c:1260:               dimm->grain = 32;
> skx_common.c:313:       dimm->grain = 32;
> highbank_mc_edac.c:223: dimm->grain = 8;
> ...
> 
> and so on.
> 
> So actually I think that
> 
> 	grain_bits = fls_long(e->grain) + 1;
> 
> is wrong in that case.

This calculates correctly:

	grain_bits = fls_long(e->grain ? e->grain - 1 : 0);

and also handles the case if the number of bytes is not power of 2.

> It should simply hand down e->grain as it is
> already the number of bytes.
> Which we should simply dump in the tracepoint too - the number of bytes
> and not that silly shifting there:
> 
> 	1 << __entry->grain_bits,

The tracepoint format for mc_event changes from byte to long for the
grain then. This is not backward compatible and breaks userland.

> 
> And once we've said that the grain is going to *everywhere* be the
> granularity of the error in number of bytes, then we should stick to
> that and fix all those drivers which don't do that.
> 
> Ok?

See my other mail in this thread with a diff. I would go with that
one.

-Robert