* [PATCH] mtd: rawnand: micron: Fix support for on-die ECC
From: Boris Brezillon @ 2018-05-03 7:49 UTC
To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
Cc: David Woodhouse, Brian Norris, Marek Vasut, Cyrille Pitchen,
stable, Thomas Petazzoni, Bean Huo, Peter Pan
It looks like the NAND_STATUS_FAIL bit is sticky after an ECC failure,
which leads all READ operations following the failing one to report
an ECC failure. Reset the chip to clear the NAND_STATUS_FAIL bit.
Note that this behavior is not documented in the datasheet, but resetting
the chip is the only solution we found to fix the problem.
Fixes: 9748e1d87573 ("mtd: nand: add support for Micron on-die ECC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Peter Pan <peterpandong@micron.com>
---
Peter, Bean,
Can you confirm this behavior, or ask someone at Micron who can confirm
it? Also, if a RESET is actually needed, it would be good to update the
datasheet accordingly. If it is not, can you explain why the
NAND_STATUS_FAIL bit is stuck and how to clear it? I tried a 0x00
command (a.k.a. READ STATUS EXIT), but it does not clear this bit. ERASE
and PROGRAM seem to clear it, but those are clearly not operations I can
issue when the user asks for a READ.
Thanks,
Boris
---
drivers/mtd/nand/raw/nand_micron.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index 0af45b134c0c..a915f568f6a3 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -153,6 +153,23 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
 					false);
+	/*
+	 * Looks like the NAND_STATUS_FAIL bit is sticky after an ECC failure,
+	 * which leads all READ operations following the failing one to report
+	 * an ECC failure.
+	 * Reset the chip to clear it.
+	 *
+	 * Note that this behavior is not documented in the datasheet, but
+	 * resetting the chip is the only solution we found to clear this bit.
+	 */
+	if (status & NAND_STATUS_FAIL) {
+		int cs = page >> (chip->chip_shift - chip->page_shift);
+
+		chip->select_chip(mtd, -1);
+		nand_reset(chip, cs);
+		chip->select_chip(mtd, cs);
+	}
+
 out:
 	micron_nand_on_die_ecc_setup(chip, false);
--
2.14.1
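The `cs` computation in the hunk above can be checked in isolation. This is a minimal sketch, assuming `chip_shift` and `page_shift` are the log2 of the die size and of the page size in bytes, as the corresponding `struct nand_chip` fields are. `page_to_die()` and the geometry used in the note below are hypothetical and not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the expression used in the patch:
 *   cs = page >> (chip_shift - page_shift)
 * chip_shift - page_shift is log2(pages per die), so the shift drops
 * the page-within-die bits and leaves the die (chip select) index.
 */
static int page_to_die(unsigned int page, unsigned int chip_shift,
		       unsigned int page_shift)
{
	/* A die cannot be smaller than one page. */
	assert(chip_shift >= page_shift);
	/* Keep the shift count within the width of 'page'. */
	assert(chip_shift - page_shift < sizeof(page) * 8);

	return (int)(page >> (chip_shift - page_shift));
}
```

With 1 GiB dies (chip_shift = 30) and 4 KiB pages (page_shift = 12), each die holds 2^18 = 262144 pages, so page 262143 maps to die 0 and page 262144 maps to die 1.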
* Re: [PATCH] mtd: rawnand: micron: Fix support for on-die ECC
From: Miquel Raynal @ 2018-05-04 9:58 UTC
To: Boris Brezillon
Cc: Richard Weinberger, linux-mtd, David Woodhouse, Brian Norris,
Marek Vasut, Cyrille Pitchen, stable, Thomas Petazzoni, Bean Huo,
Peter Pan
Hi Boris,
On Thu, 3 May 2018 09:49:08 +0200, Boris Brezillon
<boris.brezillon@bootlin.com> wrote:
> It looks like the NAND_STATUS_FAIL bit is sticky after an ECC failure,
> which leads all READ operations following the failing one to report
> an ECC failure. Reset the chip to clear the NAND_STATUS_FAIL bit.
>
> Note that this behavior is not document in the datasheet, but resetting
> the chip is the only solution we found to fix the problem.
>
> Fixes: 9748e1d87573 ("mtd: nand: add support for Micron on-die ECC")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> Cc: Bean Huo <beanhuo@micron.com>
> Cc: Peter Pan <peterpandong@micron.com>
> ---
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
* Re: [PATCH] mtd: rawnand: micron: Fix support for on-die ECC
From: Boris Brezillon @ 2018-05-08 21:12 UTC
To: Miquel Raynal
Cc: Richard Weinberger, stable, Peter Pan, Marek Vasut, linux-mtd,
Thomas Petazzoni, Cyrille Pitchen, Brian Norris, David Woodhouse,
Bean Huo
On Fri, 4 May 2018 11:58:35 +0200
Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> Hi Boris,
>
> On Thu, 3 May 2018 09:49:08 +0200, Boris Brezillon
> <boris.brezillon@bootlin.com> wrote:
>
> > It looks like the NAND_STATUS_FAIL bit is sticky after an ECC failure,
> > which leads all READ operations following the failing one to report
> > an ECC failure. Reset the chip to clear the NAND_STATUS_FAIL bit.
> >
> > Note that this behavior is not document in the datasheet, but resetting
> > the chip is the only solution we found to fix the problem.
> >
> > Fixes: 9748e1d87573 ("mtd: nand: add support for Micron on-die ECC")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> > Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> > Cc: Bean Huo <beanhuo@micron.com>
> > Cc: Peter Pan <peterpandong@micron.com>
> > ---
>
> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Queued to mtd/master.
* Re: [PATCH] mtd: rawnand: micron: Fix support for on-die ECC
From: Boris Brezillon @ 2018-05-10 6:46 UTC
To: Miquel Raynal
Cc: Richard Weinberger, stable, Marek Vasut, linux-mtd,
Thomas Petazzoni, Cyrille Pitchen, Bean Huo, Brian Norris,
David Woodhouse, Peter Pan
On Tue, 8 May 2018 23:12:59 +0200
Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> On Fri, 4 May 2018 11:58:35 +0200
> Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> > Hi Boris,
> >
> > On Thu, 3 May 2018 09:49:08 +0200, Boris Brezillon
> > <boris.brezillon@bootlin.com> wrote:
> >
> > > It looks like the NAND_STATUS_FAIL bit is sticky after an ECC failure,
> > > which leads all READ operations following the failing one to report
> > > an ECC failure. Reset the chip to clear the NAND_STATUS_FAIL bit.
> > >
> > > Note that this behavior is not document in the datasheet, but resetting
> > > the chip is the only solution we found to fix the problem.
> > >
> > > Fixes: 9748e1d87573 ("mtd: nand: add support for Micron on-die ECC")
> > > Cc: <stable@vger.kernel.org>
> > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> > > Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> > > Cc: Bean Huo <beanhuo@micron.com>
> > > Cc: Peter Pan <peterpandong@micron.com>
> > > ---
> >
> > Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
>
> Queued to mtd/master.
I'm dropping this patch because I'm no longer sure this is the correct
way to fix the bug. It seems that nand_set_features_op() is checking the
FAIL bit while the ONFI spec clearly says that the FAIL bit is only valid
after a PROGRAM, ERASE or READ-with-on-die-ECC-enabled op. That might
explain why ->set_features() fails with -EIO after an ECC failure
(apparently Micron only clears the FAIL bit when launching a PROGRAM,
ERASE or READ-with-on-die-ECC-enabled op, not on a SET_FEATURES op).
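The distinction drawn here can be sketched as a status check that only trusts the FAIL bit after operations for which it is defined. This is an illustration under stated assumptions, not the kernel's code: `enum nand_op_kind`, `fail_bit_is_valid()` and `check_status()` are hypothetical names, and `SR_FAIL` stands in for `NAND_STATUS_FAIL` (status bit 0).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SR_FAIL 0x01 /* stand-in for NAND_STATUS_FAIL (status bit 0) */

/* Hypothetical operation kinds, for illustration only. */
enum nand_op_kind {
	OP_PROGRAM,
	OP_ERASE,
	OP_READ_ON_DIE_ECC,
	OP_SET_FEATURES,
};

/* FAIL is only meaningful after PROGRAM, ERASE or READ with on-die ECC. */
static bool fail_bit_is_valid(enum nand_op_kind op)
{
	switch (op) {
	case OP_PROGRAM:
	case OP_ERASE:
	case OP_READ_ON_DIE_ECC:
		return true;
	default:
		return false;
	}
}

/* Return -EIO only when FAIL is set and valid for the operation. */
static int check_status(enum nand_op_kind op, uint8_t status)
{
	if (fail_bit_is_valid(op) && (status & SR_FAIL))
		return -EIO;

	return 0;
}
```

Under this sketch, a stale FAIL bit left over from an earlier ECC failure does not turn a SET_FEATURES into an error, while the same bit after a READ with on-die ECC still does.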
* RE: [PATCH] mtd: rawnand: micron: Fix support for on-die ECC
From: Bean Huo (beanhuo) @ 2018-05-21 22:17 UTC
To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
Cc: David Woodhouse, Brian Norris, Marek Vasut, Cyrille Pitchen,
stable, Thomas Petazzoni,
Peter Pan 潘栋 (peterpandong)
Hi, Boris
Sorry for the late reply, as I have been on a long vacation.
Here is how the status register (SR) should behave:
The status register is updated after each array operation and can be
cleared with a RESET command.
After a read operation, SR bit0 reports the ECC status of that read
until a different array operation (erase/program/read) is performed or a
reset occurs.
SR bit1 reports the status of the operation before the last one. So this
bit can report a fail (value 1) even if the very last operation was
successful (bit0=0 bit1=1).
//beanhuo
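The two bits described above can be decoded as follows. This is a minimal sketch based only on the description in this message: `SR_FAIL` and `SR_FAILC` are local names for status register bit 0 and bit 1, and `struct sr_result` and `decode_sr()` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_FAIL  (1u << 0) /* bit 0: status of the last array operation */
#define SR_FAILC (1u << 1) /* bit 1: status of the operation before it */

struct sr_result {
	bool last_failed;     /* from bit 0 */
	bool previous_failed; /* from bit 1 */
};

/* Split a raw status byte into the two pass/fail indications. */
static struct sr_result decode_sr(uint8_t sr)
{
	struct sr_result res = {
		.last_failed = (sr & SR_FAIL) != 0,
		.previous_failed = (sr & SR_FAILC) != 0,
	};

	return res;
}
```

For the case mentioned above (bit0=0 bit1=1), `decode_sr(0x02)` reports a successful last operation and a failed previous one, so code that looks only at bit 0 would not treat the last read as failed.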
>
>---
>Peter, Bean,
>
>Can you confirm this behavior, or ask someone in Micron who can confirm it?
>Also, if a RESET is actually needed, it would be good to update the datasheet
>accordingly. And if that's not the case, can you explain why the
>NAND_STATUS_FAIL bit is stuck and how to clear it (I tried a 0x00 command,
>A.K.A. READ STATUS EXIT, but it does not clear this bit, ERASE and PROGRAM
>seem to clear the bit, but that's clearly not the kind of operation I can do
>when the user asks for a READ)?
>
>Thanks,
>
>Boris