linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kamal Dasu <kdasu.kdev@gmail.com>
To: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Brian Norris <computersforpeace@gmail.com>,
	Richard Weinberger <richard@nod.at>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	MTD Maling List <linux-mtd@lists.infradead.org>,
	bcm-kernel-feedback-list <bcm-kernel-feedback-list@broadcom.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mtd: rawnand: brcmnand: Ecc error handling on EDU transfers
Date: Mon, 15 Jun 2020 11:11:00 -0400	[thread overview]
Message-ID: <CAC=U0a14R-ZFtwcjESLn67tOfM=DPtdm_Ljuu6Y6Xyx6mRS-Lw@mail.gmail.com> (raw)
In-Reply-To: <20200615091923.0c3c7aa7@xps13>

On Mon, Jun 15, 2020 at 3:19 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Kamal,
>
> Kamal Dasu <kdasu.kdev@gmail.com> wrote on Fri, 12 Jun 2020 12:34:22
> -0400:
>
> > On Fri, Jun 12, 2020 at 3:07 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >
> > > Hi Kamal,
> > >
> > > Kamal Dasu <kdasu.kdev@gmail.com> wrote on Thu, 11 Jun 2020 12:04:29
> > > -0400:
> > >
> > > > On Thu, Jun 11, 2020 at 3:27 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > > >
> > > > > Hi Kamal,
> > > > >
> > > > > Kamal Dasu <kdasu.kdev@gmail.com> wrote on Thu, 11 Jun 2020 01:44:54
> > > > > -0400:
> > > > >
> > > > > > Implemented ECC correctable and uncorrectable error handling for EDU
> > > > >
> > > > > Implement?
> > > > >
> > > > > > reads. If ECC correctable bitflips are encountered  on EDU transfer,
> > > > >
> > > > > extra space                                         ^
> > > > >
> > > > > > read page again using pio, This is needed due to a controller lmitation
> > > > >
> > > > > s/pio/PIO/
> > > > >
> > > > > > where read and corrected data is not transferred to the DMA buffer on ECC
> > > > > > errors. This holds true for ECC correctable errors beyond set threshold.
> > > > >
> > > > > error.
> > > > >
> > > > > Not sure what the last sentence means?
> > > > >
> > > >
> > > > NAND controller allows for setting a correctable  ECC threshold number
> > > > of bits beyond which it will actually report the error to the driver.
> > > > e.g. for BCH-4 the threshold is 3, so 3-bit and 4-bit errors will
> > > > generate correctable ECC interrupt however 1-bit and 2-bit errors will
> > > > be corrected silently.
> > > > From the above example EDU hardware will not transfer corrected data
> > > > to the DMA buffer for 3-bit and 4-bit errors that get reported. So
> > > > once we detect
> > > > the error duing EDU we read the page again using pio.
> > >
> > > Ok I see what you mean, can't you fake the threshold instead? The NAND
> > > controller in Linux is not supposed to handle this threshold, the NAND
> > > core is in charge. So what the controller driver should do is just:
> > > increase the number of bitflips + return the maximum number or bitflip
> > > or increase the failure counter. Is this already the case?
> > >
> > /* threshold = ceil(BCH-level * 0.75) */
> > brcmnand_wr_corr_thresh(host, DIV_ROUND_UP(chip->ecc.strength * 3, 4));
> > This how the threshold is set, all it means is that for high BCH
> > levels don't interrupt on low number (less than threshold) of
> > bit_flips. Yes the controller driver only increments correctable ECC
> > count. But due the EDU design an EDU operation is disrupted when the
> > controller interrupts on correctable ECC errors during subpage ECC
> > calculations. Hence the driver needs to read the page again with PIO
> > to transfer corrected data.
>
> IIUC, you are doing the job twice: you should just return a number of
> bitflips or an error to the NAND core. So that's why I'm telling that
> you should get rid of this threshold. It would avoid the need for the
> PIO transfer too.

I think you are reading some statements in isolation that probably are
causing some confusion.
EDU design has a flaw in case of reported  ECC error interrupt in that
corrected data is not transferred to the DMA buffer. The PIO is needed
to read corrected data into the NAND data buffer and only for the
reported errors. So there is no need to change the threshold
calculation logic, if we get rid of the threshold then we will have to
do the PIO read on any correctable bit error if it occurs during EDU
reads.

>
> You also say that the controller "only increments correctable ECC
> count", what do you mean exactly?

Maybe that statement was a bit misleading. To be clear when an ECC
error is reported the controller gives the bit_flips count  as well as
updates the ECC error address Register and ecc error status registers.
This logic works as expected in the hardware.

>The controller does not report errors
> when the number of bitflips happens to be above the BCH threshold? This
> would be the only case where what is currently done would be actually
> needed though.

 It's the other way. The controller only reports bit errors beyond
>=threshold value, will not report otherwise and silently correct the
data. There is no problem in  cases where erros are corrected
silently. Now ECC (un)correctable on EDU reads are detected by simply
reading back the ECC Error address register. And in case of reported
uncorrectable ECC errors are treated as usual.  And for reported
correctable ECC errors we need to read the page again using PIO so
that the corrected data is properly transferred. All this applies to
EDU transfer only.

>
> Thanks,
> Miquèl

Kamal

  reply	other threads:[~2020-06-15 15:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-11  5:44 [PATCH 1/2] mtd: rawnand: brcmnand: Don't default to edu transfer Kamal Dasu
2020-06-11  5:44 ` [PATCH 2/2] mtd: rawnand: brcmnand: Ecc error handling on EDU transfers Kamal Dasu
2020-06-11  7:27   ` Miquel Raynal
2020-06-11 16:04     ` Kamal Dasu
2020-06-12  7:07       ` Miquel Raynal
2020-06-12 16:34         ` Kamal Dasu
2020-06-15  7:19           ` Miquel Raynal
2020-06-15 15:11             ` Kamal Dasu [this message]
2020-06-15 17:37               ` Miquel Raynal
2020-06-11  7:16 ` [PATCH 1/2] mtd: rawnand: brcmnand: Don't default to edu transfer Miquel Raynal
2020-06-11 15:22   ` Kamal Dasu
2020-06-11 15:24     ` Miquel Raynal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAC=U0a14R-ZFtwcjESLn67tOfM=DPtdm_Ljuu6Y6Xyx6mRS-Lw@mail.gmail.com' \
    --to=kdasu.kdev@gmail.com \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=computersforpeace@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=miquel.raynal@bootlin.com \
    --cc=richard@nod.at \
    --cc=vigneshr@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).