All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
       [not found] <538805ebf8e64015a8b833de755652b3@SIWEX5A.sing.micron.com>
@ 2017-03-22 13:20     ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 13:20 UTC (permalink / raw)
  To: Thomas Petazzoni, richard-/L3Ra7n9ekc, Brezillon
  Cc: marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Campbell,
	pawel.moll-5wv7dgnIgG8, Mark Rutland,
	galak-sgV2jX0FEOL9JmXXK+q4OQ

>+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip
>*chip,
>+                                                         uint8_t *buf, int oob_required,
>+                                                         int page)
>+{
>+             int status;
>+             int max_bitflips = 0;
>+
>+             micron_nand_on_die_ecc_setup(chip, true);
>+
>+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
>+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
>+             status = chip->read_byte(mtd);
>+             if (status & NAND_STATUS_FAIL)
>+                           mtd->ecc_stats.failed++;
>+             /*
>+             * The internal ECC doesn't tell us the number of bitflips
>+             * that have been corrected, but tells us if it recommends to
>+             * rewrite the block. If it's the case, then we pretend we had
>+             * a number of bitflips equal to the ECC strength, which will
>+             * hint the NAND core to rewrite the block.
>+             */
>+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
>+                           max_bitflips = chip->ecc.strength;
>+
>+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
>+
>+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
>+
>+             micron_nand_on_die_ecc_setup(chip, false);
>+
>+             return max_bitflips;
>+}


Hi, 
Let me give you some information, hopefully you can do some modification based on above codes.

I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is our 60s 34nm SLC NAND.
So far, we have 2 series SLC NAND with implementations of on die ECC.
1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A, and future design M70A)
2. M60A for all 34nm (60series) SLC NAND with on-die ECC

NAND_STATUS_FAIL:
For the both of series SLC NAND with on-die ECC, SR bit 0 (NAND_STATUS_FAIL) indicates an uncorrectable read fail,
data is lost, no recovery possible, unless we have software additional protection, the block is not necessarily
bad but the data is lost.

NAND_STATUS_WRITE_RECOMMENDED:

For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it is 4 bit ECC, the status register only
indicates if there is 0 or 1-4 correctable error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get failed bit count trough read status register. 
SW workaround proposal:
1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
2. Read out the page with ECC ON
3. Read out the page with ECC OFF
4. Compare the data
5. Count the number of bitflips for the sectors (there are 4 ECC sectors)
6. if 3 or more fail bits, trigger fresh. 
I know this is not good solution, but if as long as NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh,
this will definitely increase NAND PE cycle.

For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips (refresh recommended), 4-6 bitflips and 1-3 bitflips.
So we can trigger refresh according to its bitflips status.

Thanks.
//beanhuo


--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-22 13:20     ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 13:20 UTC (permalink / raw)
  To: Thomas Petazzoni, richard, Brezillon
  Cc: marek.vasut, Cyrille Pitchen, computersforpeace, linux-mtd,
	devicetree, Rob Herring, Campbell, pawel.moll, Mark Rutland,
	galak, Campbell

>+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip
>*chip,
>+                                                         uint8_t *buf, int oob_required,
>+                                                         int page)
>+{
>+             int status;
>+             int max_bitflips = 0;
>+
>+             micron_nand_on_die_ecc_setup(chip, true);
>+
>+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
>+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
>+             status = chip->read_byte(mtd);
>+             if (status & NAND_STATUS_FAIL)
>+                           mtd->ecc_stats.failed++;
>+             /*
>+             * The internal ECC doesn't tell us the number of bitflips
>+             * that have been corrected, but tells us if it recommends to
>+             * rewrite the block. If it's the case, then we pretend we had
>+             * a number of bitflips equal to the ECC strength, which will
>+             * hint the NAND core to rewrite the block.
>+             */
>+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
>+                           max_bitflips = chip->ecc.strength;
>+
>+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
>+
>+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
>+
>+             micron_nand_on_die_ecc_setup(chip, false);
>+
>+             return max_bitflips;
>+}


Hi, 
Let me give you some information, hopefully you can do some modification based on above codes.

I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is our 60s 34nm SLC NAND.
So far, we have 2 series SLC NAND with implementations of on die ECC.
1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A, and future design M70A)
2. M60A for all 34nm (60series) SLC NAND with on-die ECC

NAND_STATUS_FAIL:
For the both of series SLC NAND with on-die ECC, SR bit 0 (NAND_STATUS_FAIL) indicates an uncorrectable read fail,
data is lost, no recovery possible, unless we have software additional protection, the block is not necessarily
bad but the data is lost.

NAND_STATUS_WRITE_RECOMMENDED:

For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it is 4 bit ECC, the status register only
indicates if there is 0 or 1-4 correctable error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get failed bit count trough read status register. 
SW workaround proposal:
1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
2. Read out the page with ECC ON
3. Read out the page with ECC OFF
4. Compare the data
5. Count the number of bitflips for the sectors (there are 4 ECC sectors)
6. if 3 or more fail bits, trigger fresh. 
I know this is not good solution, but if as long as NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh,
this will definitely increase NAND PE cycle.

For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips (refresh recommended), 4-6 bitflips and 1-3 bitflips.
So we can trigger refresh according to its bitflips status.

Thanks.
//beanhuo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 13:20     ` Bean Huo (beanhuo)
@ 2017-03-22 13:45         ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-03-22 13:45 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, richard-/L3Ra7n9ekc,
	marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Campbell,
	pawel.moll-5wv7dgnIgG8, Mark Rutland,
	galak-sgV2jX0FEOL9JmXXK+q4OQ

Hi Bean,

On Wed, 22 Mar 2017 13:20:04 +0000
"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:

> >+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip
> >*chip,
> >+                                                         uint8_t *buf, int oob_required,
> >+                                                         int page)
> >+{
> >+             int status;
> >+             int max_bitflips = 0;
> >+
> >+             micron_nand_on_die_ecc_setup(chip, true);
> >+
> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
> >+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
> >+             status = chip->read_byte(mtd);
> >+             if (status & NAND_STATUS_FAIL)
> >+                           mtd->ecc_stats.failed++;
> >+             /*
> >+             * The internal ECC doesn't tell us the number of bitflips
> >+             * that have been corrected, but tells us if it recommends to
> >+             * rewrite the block. If it's the case, then we pretend we had
> >+             * a number of bitflips equal to the ECC strength, which will
> >+             * hint the NAND core to rewrite the block.
> >+             */
> >+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
> >+                           max_bitflips = chip->ecc.strength;
> >+
> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
> >+
> >+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
> >+
> >+             micron_nand_on_die_ecc_setup(chip, false);
> >+
> >+             return max_bitflips;
> >+}  
> 
> 
> Hi, 
> Let me give you some information, hopefully you can do some modification based on above codes.
> 
> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is our 60s 34nm SLC NAND.
> So far, we have 2 series SLC NAND with implementations of on die ECC.
> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A, and future design M70A)
> 2. M60A for all 34nm (60series) SLC NAND with on-die ECC

Do you have an easy way to differentiate those 2 generations of chip,
or should we base our detection on the model name provided in the ONFI
parameter page?

> 
> NAND_STATUS_FAIL:
> For the both of series SLC NAND with on-die ECC, SR bit 0 (NAND_STATUS_FAIL) indicates an uncorrectable read fail,
> data is lost, no recovery possible, unless we have software additional protection, the block is not necessarily
> bad but the data is lost.
> 
> NAND_STATUS_WRITE_RECOMMENDED:
> 
> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it is 4 bit ECC, the status register only
> indicates if there is 0 or 1-4 correctable error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get failed bit count trough read status register. 
> SW workaround proposal:
> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
> 2. Read out the page with ECC ON
> 3. Read out the page with ECC OFF
> 4. Compare the data
> 5. Count the number of bitflips for the sectors (there are 4 ECC sectors)
> 6. if 3 or more fail bits, trigger fresh. 
> I know this is not good solution, but if as long as NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh,
> this will definitely increase NAND PE cycle.

We discussed that with Thomas when developing the solution. I suggested
to first go for a simple solution even if it implies unneeded PE
cycles when bitflips are detected, but maybe I was wrong. In any case,
it shouldn't be to hard to do what you suggest.

> 
> For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips (refresh recommended), 4-6 bitflips and 1-3 bitflips.
> So we can trigger refresh according to its bitflips status.

That's good news!

Thanks for your feedback.

Boris
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-22 13:45         ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-03-22 13:45 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, richard, marek.vasut, Cyrille Pitchen,
	computersforpeace, linux-mtd, devicetree, Rob Herring, Campbell,
	pawel.moll, Mark Rutland, galak

Hi Bean,

On Wed, 22 Mar 2017 13:20:04 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> >+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip
> >*chip,
> >+                                                         uint8_t *buf, int oob_required,
> >+                                                         int page)
> >+{
> >+             int status;
> >+             int max_bitflips = 0;
> >+
> >+             micron_nand_on_die_ecc_setup(chip, true);
> >+
> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
> >+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
> >+             status = chip->read_byte(mtd);
> >+             if (status & NAND_STATUS_FAIL)
> >+                           mtd->ecc_stats.failed++;
> >+             /*
> >+             * The internal ECC doesn't tell us the number of bitflips
> >+             * that have been corrected, but tells us if it recommends to
> >+             * rewrite the block. If it's the case, then we pretend we had
> >+             * a number of bitflips equal to the ECC strength, which will
> >+             * hint the NAND core to rewrite the block.
> >+             */
> >+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
> >+                           max_bitflips = chip->ecc.strength;
> >+
> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
> >+
> >+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
> >+
> >+             micron_nand_on_die_ecc_setup(chip, false);
> >+
> >+             return max_bitflips;
> >+}  
> 
> 
> Hi, 
> Let me give you some information, hopefully you can do some modification based on above codes.
> 
> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is our 60s 34nm SLC NAND.
> So far, we have 2 series SLC NAND with implementations of on die ECC.
> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A, and future design M70A)
> 2. M60A for all 34nm (60series) SLC NAND with on-die ECC

Do you have an easy way to differentiate those 2 generations of chip,
or should we base our detection on the model name provided in the ONFI
parameter page?

> 
> NAND_STATUS_FAIL:
> For the both of series SLC NAND with on-die ECC, SR bit 0 (NAND_STATUS_FAIL) indicates an uncorrectable read fail,
> data is lost, no recovery possible, unless we have software additional protection, the block is not necessarily
> bad but the data is lost.
> 
> NAND_STATUS_WRITE_RECOMMENDED:
> 
> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it is 4 bit ECC, the status register only
> indicates if there is 0 or 1-4 correctable error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get failed bit count trough read status register. 
> SW workaround proposal:
> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
> 2. Read out the page with ECC ON
> 3. Read out the page with ECC OFF
> 4. Compare the data
> 5. Count the number of bitflips for the sectors (there are 4 ECC sectors)
> 6. if 3 or more fail bits, trigger fresh. 
> I know this is not good solution, but if as long as NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh,
> this will definitely increase NAND PE cycle.

We discussed that with Thomas when developing the solution. I suggested
to first go for a simple solution even if it implies unneeded PE
cycles when bitflips are detected, but maybe I was wrong. In any case,
it shouldn't be to hard to do what you suggest.

> 
> For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips (refresh recommended), 4-6 bitflips and 1-3 bitflips.
> So we can trigger refresh according to its bitflips status.

That's good news!

Thanks for your feedback.

Boris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 13:45         ` Boris Brezillon
  (?)
@ 2017-03-22 14:01         ` Arnaud Mouiche
  -1 siblings, 0 replies; 35+ messages in thread
From: Arnaud Mouiche @ 2017-03-22 14:01 UTC (permalink / raw)
  To: linux-mtd

Hi

On 22/03/2017 14:45, Boris Brezillon wrote:
> Hi Bean,
>
> On Wed, 22 Mar 2017 13:20:04 +0000
> "Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
> [...]
>> NAND_STATUS_FAIL:
>> For the both of series SLC NAND with on-die ECC, SR bit 0 (NAND_STATUS_FAIL) indicates an uncorrectable read fail,
>> data is lost, no recovery possible, unless we have software additional protection, the block is not necessarily
>> bad but the data is lost.
>>
>> NAND_STATUS_WRITE_RECOMMENDED:
>>
>> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it is 4 bit ECC, the status register only
>> indicates if there is 0 or 1-4 correctable error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
>> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get failed bit count trough read status register.
>> SW workaround proposal:
>> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
>> 2. Read out the page with ECC ON
>> 3. Read out the page with ECC OFF
>> 4. Compare the data
>> 5. Count the number of bitflips for the sectors (there are 4 ECC sectors)
>> 6. if 3 or more fail bits, trigger fresh.
>> I know this is not good solution, but if as long as NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh,
>> this will definitely increase NAND PE cycle.
> We discussed that with Thomas when developing the solution. I suggested
> to first go for a simple solution even if it implies unneeded PE
> cycles when bitflips are detected, but maybe I was wrong. In any case,
> it shouldn't be to hard to do what you suggest.

Just to share my experience with MX35LF1GE4AB device (a spinand one).

This device is able to fix up to 4 bits per 512 bytes page, but is not 
able to tell how many bit were fixed.
My first option was to say "and so, erasing/re-write will not be an issue !"
Well, quickly some blocks were scrubbed more than 100K by UBI... This 
was an issue !
(see 
http://lists.infradead.org/pipermail/linux-mtd/2016-April/066628.html 
for details)

So, for this particular device, I need to do something suggested by 
Bean: read the page with and without ECC, and count errors one by one.
This kind of patch is in my queue waiting for spinand integration, but 
it seems that something at nand level may be required.

My 2 cents
Arnaud

>
>> For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips (refresh recommended), 4-6 bitflips and 1-3 bitflips.
>> So we can trigger refresh according to its bitflips status.
> That's good news!
>
> Thanks for your feedback.
>
> Boris
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 13:45         ` Boris Brezillon
@ 2017-03-22 14:39           ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 14:39 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

 Hi Boris

>Hi Bean,
>
>On Wed, 22 Mar 2017 13:20:04 +0000
>"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
>> >+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct
>> >+nand_chip
>> >*chip,
>> >+                                                         uint8_t
>> >+*buf, int oob_required,
>> >+                                                         int page) {
>> >+             int status;
>> >+             int max_bitflips = 0;
>> >+
>> >+             micron_nand_on_die_ecc_setup(chip, true);
>> >+
>> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
>> >+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
>> >+             status = chip->read_byte(mtd);
>> >+             if (status & NAND_STATUS_FAIL)
>> >+                           mtd->ecc_stats.failed++;
>> >+             /*
>> >+             * The internal ECC doesn't tell us the number of
>> >+bitflips
>> >+             * that have been corrected, but tells us if it
>> >+recommends to
>> >+             * rewrite the block. If it's the case, then we pretend
>> >+we had
>> >+             * a number of bitflips equal to the ECC strength, which
>> >+will
>> >+             * hint the NAND core to rewrite the block.
>> >+             */
>> >+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
>> >+                           max_bitflips = chip->ecc.strength;
>> >+
>> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
>> >+
>> >+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
>> >+
>> >+             micron_nand_on_die_ecc_setup(chip, false);
>> >+
>> >+             return max_bitflips;
>> >+}
>>
>>
>> Hi,
>> Let me give you some information, hopefully you can do some modification
>based on above codes.
>>
>> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is
>our 60s 34nm SLC NAND.
>> So far, we have 2 series SLC NAND with implementations of on die ECC.
>> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A,
>> and future design M70A) 2. M60A for all 34nm (60series) SLC NAND with
>> on-die ECC
>
>Do you have an easy way to differentiate those 2 generations of chip, or should
>we base our detection on the model name provided in the ONFI parameter page?
>
Of course, you can use model name, but I think we will keep a big table to include every NAND information.
Also, it doesn't look nice and always changes.

The better solution is:

For the Micron SLC NAND with on Die ECC, please note only for the "SLC NAND with on Die ECC",
You can always differentiate these two generation NAND by ONFI table byte 112 "Number of bits
ECC correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent method for both
60s and 70s "SLC NAND with on Die ECC".

>>
>> NAND_STATUS_FAIL:
>> For the both of series SLC NAND with on-die ECC, SR bit 0
>> (NAND_STATUS_FAIL) indicates an uncorrectable read fail, data is lost,
>> no recovery possible, unless we have software additional protection, the block
>is not necessarily bad but the data is lost.
>>
>> NAND_STATUS_WRITE_RECOMMENDED:
>>
>> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it
>> is 4 bit ECC, the status register only indicates if there is 0 or 1-4 correctable
>error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
>> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get
>failed bit count trough read status register.
>> SW workaround proposal:
>> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
>> 2. Read out the page with ECC ON
>> 3. Read out the page with ECC OFF
>> 4. Compare the data
>> 5. Count the number of bitflips for the sectors (there are 4 ECC
>> sectors) 6. if 3 or more fail bits, trigger fresh.
>> I know this is not good solution, but if as long as
>> NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh, this will
>definitely increase NAND PE cycle.
>
>We discussed that with Thomas when developing the solution. I suggested to first
>go for a simple solution even if it implies unneeded PE cycles when bitflips are
>detected, but maybe I was wrong. In any case, it shouldn't be to hard to do what
>you suggest.
>

Ok, but I recommend that 70s should be the first choice on this single solution,
it doesn't need to read twice to detect its bitflips count. 

>>
>> For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips
>(refresh recommended), 4-6 bitflips and 1-3 bitflips.
>> So we can trigger refresh according to its bitflips status.
>
>That's good news!
>
>Thanks for your feedback.
>
>Boris


//Beanhuo
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-22 14:39           ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 14:39 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, richard, marek.vasut, Cyrille Pitchen,
	computersforpeace, linux-mtd, devicetree, Rob Herring, Campbell,
	pawel.moll, Mark Rutland, galak

 Hi Boris

>Hi Bean,
>
>On Wed, 22 Mar 2017 13:20:04 +0000
>"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
>> >+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct
>> >+nand_chip
>> >*chip,
>> >+                                                         uint8_t
>> >+*buf, int oob_required,
>> >+                                                         int page) {
>> >+             int status;
>> >+             int max_bitflips = 0;
>> >+
>> >+             micron_nand_on_die_ecc_setup(chip, true);
>> >+
>> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
>> >+             chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
>> >+             status = chip->read_byte(mtd);
>> >+             if (status & NAND_STATUS_FAIL)
>> >+                           mtd->ecc_stats.failed++;
>> >+             /*
>> >+             * The internal ECC doesn't tell us the number of
>> >+bitflips
>> >+             * that have been corrected, but tells us if it
>> >+recommends to
>> >+             * rewrite the block. If it's the case, then we pretend
>> >+we had
>> >+             * a number of bitflips equal to the ECC strength, which
>> >+will
>> >+             * hint the NAND core to rewrite the block.
>> >+             */
>> >+             else if (status & NAND_STATUS_WRITE_RECOMMENDED)
>> >+                           max_bitflips = chip->ecc.strength;
>> >+
>> >+             chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
>> >+
>> >+             nand_read_page_raw(mtd, chip, buf, oob_required, page);
>> >+
>> >+             micron_nand_on_die_ecc_setup(chip, false);
>> >+
>> >+             return max_bitflips;
>> >+}
>>
>>
>> Hi,
>> Let me give you some information, hopefully you can do some modification
>based on above codes.
>>
>> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is
>our 60s 34nm SLC NAND.
>> So far, we have 2 series SLC NAND with implementations of on die ECC.
>> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A,
>> and future design M70A) 2. M60A for all 34nm (60series) SLC NAND with
>> on-die ECC
>
>Do you have an easy way to differentiate those 2 generations of chip, or should
>we base our detection on the model name provided in the ONFI parameter page?
>
Of course, you can use model name, but I think we will keep a big table to include every NAND information.
Also, it doesn't look nice and always changes.

The better solution is:

For the Micron SLC NAND with on Die ECC, please note only for the "SLC NAND with on Die ECC",
You can always differentiate these two generation NAND by ONFI table byte 112 "Number of bits
ECC correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent method for both
60s and 70s "SLC NAND with on Die ECC".

>>
>> NAND_STATUS_FAIL:
>> For the both of series SLC NAND with on-die ECC, SR bit 0
>> (NAND_STATUS_FAIL) indicates an uncorrectable read fail, data is lost,
>> no recovery possible, unless we have software additional protection, the block
>is not necessarily bad but the data is lost.
>>
>> NAND_STATUS_WRITE_RECOMMENDED:
>>
>> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it
>> is 4 bit ECC, the status register only indicates if there is 0 or 1-4 correctable
>error bits. We don't want to trigger refresh if only 1 or 2 bits fail.
>> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get
>failed bit count trough read status register.
>> SW workaround proposal:
>> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
>> 2. Read out the page with ECC ON
>> 3. Read out the page with ECC OFF
>> 4. Compare the data
>> 5. Count the number of bitflips for the sectors (there are 4 ECC
>> sectors) 6. if 3 or more fail bits, trigger fresh.
>> I know this is not good solution, but if as long as
>> NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh, this will
>definitely increase NAND PE cycle.
>
>We discussed that with Thomas when developing the solution. I suggested to first
>go for a simple solution even if it implies unneeded PE cycles when bitflips are
>detected, but maybe I was wrong. In any case, it shouldn't be to hard to do what
>you suggest.
>

Ok, but I recommend that 70s should be the first choice on this single solution,
it doesn't need to read twice to detect its bitflips count. 

>>
>> For the 70s, it is 8 bits on-die ECC, the status register can report 7-8 bitflips
>(refresh recommended), 4-6 bitflips and 1-3 bitflips.
>> So we can trigger refresh according to its bitflips status.
>
>That's good news!
>
>Thanks for your feedback.
>
>Boris


//Beanhuo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 14:39           ` Bean Huo (beanhuo)
@ 2017-03-22 14:52               ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-03-22 14:52 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, richard-/L3Ra7n9ekc,
	marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Campbell,
	pawel.moll-5wv7dgnIgG8, Mark Rutland,
	galak-sgV2jX0FEOL9JmXXK+q4OQ


> >> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is  
> >our 60s 34nm SLC NAND.  
> >> So far, we have 2 series SLC NAND with implementations of on die ECC.
> >> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A,
> >> and future design M70A) 2. M60A for all 34nm (60series) SLC NAND with
> >> on-die ECC  
> >
> >Do you have an easy way to differentiate those 2 generations of chip, or should
> >we base our detection on the model name provided in the ONFI parameter page?
> >  
> Of course, you can use model name, but I think we will keep a big table to include every NAND information.
> Also, it doesn't look nice and always changes.
> 
> The better solution is:
> 
> For the Micron SLC NAND with on Die ECC, please note only for the "SLC NAND with on Die ECC",
> You can always differentiate these two generation NAND by ONFI table byte 112 "Number of bits
> ECC correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent method for both
> 60s and 70s "SLC NAND with on Die ECC".

The question is, how can we know if the NAND supports on-die ECC?

We were basing our detection on the "Internal ECC" bit in READ_ID, but
it seems this bit is actually reflecting the current ECC engine status
(enabled/disabled), which is disturbing, since information returned by
the READ_ID are supposed to be static :-(.

> 
> >>
> >> NAND_STATUS_FAIL:
> >> For the both of series SLC NAND with on-die ECC, SR bit 0
> >> (NAND_STATUS_FAIL) indicates an uncorrectable read fail, data is lost,
> >> no recovery possible, unless we have software additional protection, the block  
> >is not necessarily bad but the data is lost.  
> >>
> >> NAND_STATUS_WRITE_RECOMMENDED:
> >>
> >> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it
> >> is 4 bit ECC, the status register only indicates if there is 0 or 1-4 correctable  
> >error bits. We don't want to trigger refresh if only 1 or 2 bits fail.  
> >> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get  
> >failed bit count trough read status register.  
> >> SW workaround proposal:
> >> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
> >> 2. Read out the page with ECC ON
> >> 3. Read out the page with ECC OFF
> >> 4. Compare the data
> >> 5. Count the number of bitflips for the sectors (there are 4 ECC
> >> sectors) 6. if 3 or more fail bits, trigger fresh.
> >> I know this is not good solution, but if as long as
> >> NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh, this will  
> >definitely increase NAND PE cycle.
> >
> >We discussed that with Thomas when developing the solution. I suggested to first
> >go for a simple solution even if it implies unneeded PE cycles when bitflips are
> >detected, but maybe I was wrong. In any case, it shouldn't be to hard to do what
> >you suggest.
> >  
> 
> Ok, but I recommend that 70s should be the first choice on this single solution,
> it doesn't need to read twice to detect its bitflips count.

That's exactly why we need to differentiate the 2 chips.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-22 14:52               ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-03-22 14:52 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, richard, marek.vasut, Cyrille Pitchen,
	computersforpeace, linux-mtd, devicetree, Rob Herring, Campbell,
	pawel.moll, Mark Rutland, galak


> >> I noticed that this patches are based on MT29F1G08ABADAWP SLC NAND, it is  
> >our 60s 34nm SLC NAND.  
> >> So far, we have 2 series SLC NAND with implementations of on die ECC.
> >> 1. M79A for all 25nm (70series) SLC NAND with on-die ECC (M78A, M79A,
> >> and future design M70A) 2. M60A for all 34nm (60series) SLC NAND with
> >> on-die ECC  
> >
> >Do you have an easy way to differentiate those 2 generations of chip, or should
> >we base our detection on the model name provided in the ONFI parameter page?
> >  
> Of course, you can use model name, but I think we will keep a big table to include every NAND information.
> Also, it doesn't look nice and always changes.
> 
> The better solution is:
> 
> For the Micron SLC NAND with on Die ECC, please note only for the "SLC NAND with on Die ECC",
> You can always differentiate these two generation NAND by ONFI table byte 112 "Number of bits
> ECC correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent method for both
> 60s and 70s "SLC NAND with on Die ECC".

The question is, how can we know if the NAND supports on-die ECC?

We were basing our detection on the "Internal ECC" bit in READ_ID, but
it seems this bit is actually reflecting the current ECC engine status
(enabled/disabled), which is disturbing, since information returned by
the READ_ID are supposed to be static :-(.

> 
> >>
> >> NAND_STATUS_FAIL:
> >> For the both of series SLC NAND with on-die ECC, SR bit 0
> >> (NAND_STATUS_FAIL) indicates an uncorrectable read fail, data is lost,
> >> no recovery possible, unless we have software additional protection, the block  
> >is not necessarily bad but the data is lost.  
> >>
> >> NAND_STATUS_WRITE_RECOMMENDED:
> >>
> >> For the NAND_STATUS_WRITE_RECOMMENDED, it only works on 60s NAND, it
> >> is 4 bit ECC, the status register only indicates if there is 0 or 1-4 correctable  
> >error bits. We don't want to trigger refresh if only 1 or 2 bits fail.  
> >> the base refresh is that if there 3 or 4 bitflips. But unfortunately we can't get  
> >failed bit count trough read status register.  
> >> SW workaround proposal:
> >> 1. If SR bit 3 is set to 1 it means 1~4 bitflips and correctable.
> >> 2. Read out the page with ECC ON
> >> 3. Read out the page with ECC OFF
> >> 4. Compare the data
> >> 5. Count the number of bitflips for the sectors (there are 4 ECC
> >> sectors) 6. if 3 or more fail bits, trigger fresh.
> >> I know this is not good solution, but if as long as
> >> NAND_STATUS_WRITE_RECOMMENDED is set, and trigger refresh, this will  
> >definitely increase NAND PE cycle.
> >
> >We discussed that with Thomas when developing the solution. I suggested to first
> >go for a simple solution even if it implies unneeded PE cycles when bitflips are
> >detected, but maybe I was wrong. In any case, it shouldn't be to hard to do what
> >you suggest.
> >  
> 
> Ok, but I recommend that 70s should be the first choice on this single solution,
> it doesn't need to read twice to detect its bitflips count.

That's exactly why we need to differentiate the 2 chips.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 14:52               ` Boris Brezillon
@ 2017-03-22 17:11                 ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 17:11 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, richard-/L3Ra7n9ekc,
	marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Campbell,
	pawel.moll-5wv7dgnIgG8, Mark Rutland,
	galak-sgV2jX0FEOL9JmXXK+q4OQ

Hi Boris

>>
>> For the Micron SLC NAND with on Die ECC, please note only for the "SLC
>> NAND with on Die ECC", You can always differentiate these two
>> generation NAND by ONFI table byte 112 "Number of bits ECC
>> correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent
>method for both 60s and 70s "SLC NAND with on Die ECC".
>
>The question is, how can we know if the NAND supports on-die ECC?
>
>We were basing our detection on the "Internal ECC" bit in READ_ID, but it seems
>this bit is actually reflecting the current ECC engine status (enabled/disabled),
>which is disturbing, since information returned by the READ_ID are supposed to
>be static :-(.
>

This is very difficult question, as a Micron worker, this is also a headache to me.
Device ID byte 4 can't be fully trusted, it always changes. Anyway, I will give you
A better answer in 2 days. Hopefully.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-22 17:11                 ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-03-22 17:11 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, richard, marek.vasut, Cyrille Pitchen,
	computersforpeace, linux-mtd, devicetree, Rob Herring, Campbell,
	pawel.moll, Mark Rutland, galak

Hi Boris

>>
>> For the Micron SLC NAND with on Die ECC, please note only for the "SLC
>> NAND with on Die ECC", You can always differentiate these two
>> generation NAND by ONFI table byte 112 "Number of bits ECC
>> correctability ", if its value is 4, it is 60s; if it's 8, it is 70s. this is a permanent
>method for both 60s and 70s "SLC NAND with on Die ECC".
>
>The question is, how can we know if the NAND supports on-die ECC?
>
>We were basing our detection on the "Internal ECC" bit in READ_ID, but it seems
>this bit is actually reflecting the current ECC engine status (enabled/disabled),
>which is disturbing, since information returned by the READ_ID are supposed to
>be static :-(.
>

This is very difficult question, as a Micron worker, this is also a headache to me.
Device ID byte 4 can't be fully trusted, it always changes. Anyway, I will give you
A better answer in 2 days. Hopefully.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-22 14:52               ` Boris Brezillon
@ 2017-04-03 11:31                 ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-03 11:31 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, richard-/L3Ra7n9ekc,
	marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Campbell,
	pawel.moll-5wv7dgnIgG8, Mark Rutland,
	galak-sgV2jX0FEOL9JmXXK+q4OQ

Hi, Boris and Thomas

>>
>> Ok, but I recommend that 70s should be the first choice on this single
>> solution, it doesn't need to read twice to detect its bitflips count.
>
>That's exactly why we need to differentiate the 2 chips.

Sorry for later this response. 
Below is the pseudo codes about how to differentiate these 2 series parallel
NAND with on-die ECC:

if (NAND == SLC ) { // on-die ECC only exists in SLC
//check device ID byte 4
     if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
	if (ID.byte4 & 0x80) {//on-Die ECC enabled
                    if (ONFI.byte112 == 4)
		 60s SLC NAND with on-die ECC
	    else if (ONFI.byte112 == 8)
     	              70s SLC NAND with on-die ECC
	    else
                          Doesn't support on-die ECC
	}	
	else
	  On-die ECC not enabled
     }
   else 
        Doesn't support on-die ECC
}
else
   Doesn't support on-die ECC.

//beanhuo

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-03 11:31                 ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-03 11:31 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, richard, marek.vasut, Cyrille Pitchen,
	computersforpeace, linux-mtd, devicetree, Rob Herring, Campbell,
	pawel.moll, Mark Rutland, galak

Hi, Boris and Thomas

>>
>> Ok, but I recommend that 70s should be the first choice on this single
>> solution, it doesn't need to read twice to detect its bitflips count.
>
>That's exactly why we need to differentiate the 2 chips.

Sorry for later this response. 
Below is the pseudo codes about how to differentiate these 2 series parallel
NAND with on-die ECC:

if (NAND == SLC ) { // on-die ECC only exists in SLC
//check device ID byte 4
     if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
	if (ID.byte4 & 0x80) {//on-Die ECC enabled
                    if (ONFI.byte112 == 4)
		 60s SLC NAND with on-die ECC
	    else if (ONFI.byte112 == 8)
     	              70s SLC NAND with on-die ECC
	    else
                          Doesn't support on-die ECC
	}	
	else
	  On-die ECC not enabled
     }
   else 
        Doesn't support on-die ECC
}
else
   Doesn't support on-die ECC.

//beanhuo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-03 11:31                 ` Bean Huo (beanhuo)
@ 2017-04-11 12:51                     ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 12:51 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

Hi Bean,

On Mon, 3 Apr 2017 11:31:05 +0000
"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:

> Hi, Boris and Thomas
> 
> >>
> >> Ok, but I recommend that 70s should be the first choice on this single
> >> solution, it doesn't need to read twice to detect its bitflips count.  
> >
> >That's exactly why we need to differentiate the 2 chips.  
> 
> Sorry for later this response. 
> Below is the pseudo codes about how to differentiate these 2 series parallel
> NAND with on-die ECC:
> 
> if (NAND == SLC ) { // on-die ECC only exists in SLC
> //check device ID byte 4
>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b

So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.

> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled

Did you read my last reply?
Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine
state (1 if the engine is enabled, 0 if it's disabled), not whether the
NAND supports on-die ECC or not, so no this test is not reliable.

>                     if (ONFI.byte112 == 4)
> 		 60s SLC NAND with on-die ECC
> 	    else if (ONFI.byte112 == 8)
>      	              70s SLC NAND with on-die ECC

This is completely fucked up! Now the ONFI param page says the NAND
requires 8bits/512bytes, while the ID bytes advertised an on-die ECC
providing 4bits/512bytes correctability.
So either your algorithm is wrong, or the ID and ONFI param page are
contracting (not sure what solution I'd prefer...).

> 	    else
>                           Doesn't support on-die ECC

Sorry to say that, but I find it worrisome that even someone from Micron
is not able to get it right.

I think we'll stick to the model name to detect whether on-die ECC is
supported.

Regards,

Boris
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 12:51                     ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 12:51 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

Hi Bean,

On Mon, 3 Apr 2017 11:31:05 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> Hi, Boris and Thomas
> 
> >>
> >> Ok, but I recommend that 70s should be the first choice on this single
> >> solution, it doesn't need to read twice to detect its bitflips count.  
> >
> >That's exactly why we need to differentiate the 2 chips.  
> 
> Sorry for later this response. 
> Below is the pseudo codes about how to differentiate these 2 series parallel
> NAND with on-die ECC:
> 
> if (NAND == SLC ) { // on-die ECC only exists in SLC
> //check device ID byte 4
>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b

So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.

> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled

Did you read my last reply?
Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine
state (1 if the engine is enabled, 0 if it's disabled), not whether the
NAND supports on-die ECC or not, so no this test is not reliable.

>                     if (ONFI.byte112 == 4)
> 		 60s SLC NAND with on-die ECC
> 	    else if (ONFI.byte112 == 8)
>      	              70s SLC NAND with on-die ECC

This is completely fucked up! Now the ONFI param page says the NAND
requires 8bits/512bytes, while the ID bytes advertised an on-die ECC
providing 4bits/512bytes correctability.
So either your algorithm is wrong, or the ID and ONFI param page are
contracting (not sure what solution I'd prefer...).

> 	    else
>                           Doesn't support on-die ECC

Sorry to say that, but I find it worrisome that even someone from Micron
is not able to get it right.

I think we'll stick to the model name to detect whether on-die ECC is
supported.

Regards,

Boris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 12:51                     ` Boris Brezillon
@ 2017-04-11 14:26                       ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 14:26 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	marek.vasut, Rob Herring, linux-mtd, galak, Mark Rutland,
	computersforpeace, Cyrille Pitchen

>
>Hi Bean,
>
>On Mon, 3 Apr 2017 11:31:05 +0000
>"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
>> Hi, Boris and Thomas
>>
>> >>
>> >> Ok, but I recommend that 70s should be the first choice on this
>> >> single solution, it doesn't need to read twice to detect its bitflips count.
>> >
>> >That's exactly why we need to differentiate the 2 chips.
>>
>> Sorry for later this response.
>> Below is the pseudo codes about how to differentiate these 2 series
>> parallel NAND with on-die ECC:
>>
>> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
>> byte 4
>>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>
>So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>
>> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>
>Did you read my last reply?
>Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
>the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
>ECC or not, so no this test is not reliable.
>
>>                     if (ONFI.byte112 == 4)
>> 		 60s SLC NAND with on-die ECC
>> 	    else if (ONFI.byte112 == 8)
>>      	              70s SLC NAND with on-die ECC
>
>This is completely fucked up! Now the ONFI param page says the NAND requires
>8bits/512bytes, while the ID bytes advertised an on-die ECC providing
>4bits/512bytes correctability.
>So either your algorithm is wrong, or the ID and ONFI param page are contracting
>(not sure what solution I'd prefer...).
>
>> 	    else
>>                           Doesn't support on-die ECC
>
>Sorry to say that, but I find it worrisome that even someone from Micron is not
>able to get it right.
>

Sorry, would you please specify which one is wrong or confuse you?

>I think we'll stick to the model name to detect whether on-die ECC is supported.
>
You want one solution that can clearly differentiate two serial SLC NAND, but NAND ONFI table
and device Id are always changing. It is easy to draw a perfect solution to do that.
OK, if you like maintain a huge/ugly table in MTD, please do that.

>Regards,
>
>Boris

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 14:26                       ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 14:26 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

>
>Hi Bean,
>
>On Mon, 3 Apr 2017 11:31:05 +0000
>"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
>> Hi, Boris and Thomas
>>
>> >>
>> >> Ok, but I recommend that 70s should be the first choice on this
>> >> single solution, it doesn't need to read twice to detect its bitflips count.
>> >
>> >That's exactly why we need to differentiate the 2 chips.
>>
>> Sorry for later this response.
>> Below is the pseudo codes about how to differentiate these 2 series
>> parallel NAND with on-die ECC:
>>
>> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
>> byte 4
>>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>
>So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>
>> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>
>Did you read my last reply?
>Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
>the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
>ECC or not, so no this test is not reliable.
>
>>                     if (ONFI.byte112 == 4)
>> 		 60s SLC NAND with on-die ECC
>> 	    else if (ONFI.byte112 == 8)
>>      	              70s SLC NAND with on-die ECC
>
>This is completely fucked up! Now the ONFI param page says the NAND requires
>8bits/512bytes, while the ID bytes advertised an on-die ECC providing
>4bits/512bytes correctability.
>So either your algorithm is wrong, or the ID and ONFI param page are contracting
>(not sure what solution I'd prefer...).
>
>> 	    else
>>                           Doesn't support on-die ECC
>
>Sorry to say that, but I find it worrisome that even someone from Micron is not
>able to get it right.
>

Sorry, would you please specify which one is wrong or confuse you?

>I think we'll stick to the model name to detect whether on-die ECC is supported.
>
You want one solution that can clearly differentiate two serial SLC NAND, but NAND ONFI table
and device Id are always changing. It is easy to draw a perfect solution to do that.
OK, if you like maintain a huge/ugly table in MTD, please do that.

>Regards,
>
>Boris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 14:26                       ` Bean Huo (beanhuo)
@ 2017-04-11 14:49                           ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 14:49 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

On Tue, 11 Apr 2017 14:26:02 +0000
"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:

> >
> >Hi Bean,
> >
> >On Mon, 3 Apr 2017 11:31:05 +0000
> >"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:
> >  
> >> Hi, Boris and Thomas
> >>  
> >> >>
> >> >> Ok, but I recommend that 70s should be the first choice on this
> >> >> single solution, it doesn't need to read twice to detect its bitflips count.  
> >> >
> >> >That's exactly why we need to differentiate the 2 chips.  
> >>
> >> Sorry for later this response.
> >> Below is the pseudo codes about how to differentiate these 2 series
> >> parallel NAND with on-die ECC:
> >>
> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> >> byte 4
> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >
> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >  
> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >
> >Did you read my last reply?
> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> >ECC or not, so no this test is not reliable.
> >  
> >>                     if (ONFI.byte112 == 4)
> >> 		 60s SLC NAND with on-die ECC
> >> 	    else if (ONFI.byte112 == 8)
> >>      	              70s SLC NAND with on-die ECC  
> >
> >This is completely fucked up! Now the ONFI param page says the NAND requires
> >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> >4bits/512bytes correctability.
> >So either your algorithm is wrong, or the ID and ONFI param page are contracting
> >(not sure what solution I'd prefer...).
> >  
> >> 	    else
> >>                           Doesn't support on-die ECC  
> >
> >Sorry to say that, but I find it worrisome that even someone from Micron is not
> >able to get it right.
> >  
> 
> Sorry, would you please specify which one is wrong or confuse you?

The initial 'if (ID.byte4 & 0x80)' is wrong, because this bit is only
set when someone enabled the ECC engine using the SET_FEATURE command
(this has been verified by Thomas who tried to disable the feature in
the bootloader and noticed that on-die ECC was reported as
'unsupported' by the kernel).

Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
because you apparently only mask bit 1 and not bits 0 and 1.
Anyway, I can't tell if this is valid because I don't have access to
the M79A datasheets you're referring to.


> 
> >I think we'll stick to the model name to detect whether on-die ECC is supported.
> >  
> You want one solution that can clearly differentiate two serial SLC NAND, but NAND ONFI table
> and device Id are always changing. It is easy to draw a perfect solution to do that.
> OK, if you like maintain a huge/ugly table in MTD, please do that.

I'm not happy with the big ID table either, but unless I'm missing
something, what you propose does not work for the MT29F1G08ABADAWP, so
I prefer to rely on something I can trust.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 14:49                           ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 14:49 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

On Tue, 11 Apr 2017 14:26:02 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> >
> >Hi Bean,
> >
> >On Mon, 3 Apr 2017 11:31:05 +0000
> >"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
> >  
> >> Hi, Boris and Thomas
> >>  
> >> >>
> >> >> Ok, but I recommend that 70s should be the first choice on this
> >> >> single solution, it doesn't need to read twice to detect its bitflips count.  
> >> >
> >> >That's exactly why we need to differentiate the 2 chips.  
> >>
> >> Sorry for later this response.
> >> Below is the pseudo codes about how to differentiate these 2 series
> >> parallel NAND with on-die ECC:
> >>
> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> >> byte 4
> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >
> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >  
> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >
> >Did you read my last reply?
> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> >ECC or not, so no this test is not reliable.
> >  
> >>                     if (ONFI.byte112 == 4)
> >> 		 60s SLC NAND with on-die ECC
> >> 	    else if (ONFI.byte112 == 8)
> >>      	              70s SLC NAND with on-die ECC  
> >
> >This is completely fucked up! Now the ONFI param page says the NAND requires
> >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> >4bits/512bytes correctability.
> >So either your algorithm is wrong, or the ID and ONFI param page are contracting
> >(not sure what solution I'd prefer...).
> >  
> >> 	    else
> >>                           Doesn't support on-die ECC  
> >
> >Sorry to say that, but I find it worrisome that even someone from Micron is not
> >able to get it right.
> >  
> 
> Sorry, would you please specify which one is wrong or confuse you?

The initial 'if (ID.byte4 & 0x80)' is wrong, because this bit is only
set when someone enabled the ECC engine using the SET_FEATURE command
(this has been verified by Thomas who tried to disable the feature in
the bootloader and noticed that on-die ECC was reported as
'unsupported' by the kernel).

Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
because you apparently only mask bit 1 and not bits 0 and 1.
Anyway, I can't tell if this is valid because I don't have access to
the M79A datasheets you're referring to.


> 
> >I think we'll stick to the model name to detect whether on-die ECC is supported.
> >  
> You want one solution that can clearly differentiate two serial SLC NAND, but NAND ONFI table
> and device Id are always changing. It is easy to draw a perfect solution to do that.
> OK, if you like maintain a huge/ugly table in MTD, please do that.

I'm not happy with the big ID table either, but unless I'm missing
something, what you propose does not work for the MT29F1G08ABADAWP, so
I prefer to rely on something I can trust.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 12:51                     ` Boris Brezillon
@ 2017-04-11 15:02                       ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 15:02 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

Hi, Boris and Thomas
Let me do some explanation.

>> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
>> byte 4
>>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>
>So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>

If the NAND supports on-die ECC, here should be 10b, not matter it is 8bit or 4bit,
You are correct, MT29F1G08ABADAWP is 0x2, its explanation is 4bit/512bytes ECC.
But for the 70s, it is 8bit on-die ECC, but it is still 10b. 
So that why here using these two bits to determine if exist on-die ECC.
What's more, for some old products, they don't support on-die ECC,
Sometimes, here is still 01b, so still need following codes to do further
determinations.

>> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>
>Did you read my last reply?
>Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
>the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
>ECC or not, so no this test is not reliable.
>
For the on-die ECC, it is not always default enabled. It depends on requirement from costumers.
If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature Operations".

>>                     if (ONFI.byte112 == 4)
>> 		 60s SLC NAND with on-die ECC
>> 	    else if (ONFI.byte112 == 8)
>>      	              70s SLC NAND with on-die ECC
>
>This is completely fucked up! Now the ONFI param page says the NAND requires
>8bits/512bytes, while the ID bytes advertised an on-die ECC providing
>4bits/512bytes correctability.

I think, my previous answers can answer this confusion.

>So either your algorithm is wrong, or the ID and ONFI param page are contracting
>(not sure what solution I'd prefer...).
>
>> 	    else
>>                           Doesn't support on-die ECC
>
>Sorry to say that, but I find it worrisome that even someone from Micron is not
>able to get it right.
>
I am ashamed. I have been in Micron for two years, for some old products, I am also not very clear.
But I checked all the SLC-NAND datasheet with on-die ECC with our AE, and had this final pseudo code.

>I think we'll stick to the model name to detect whether on-die ECC is supported.
>
>Regards,
>
>Boris

beanhuo
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 15:02                       ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 15:02 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

Hi, Boris and Thomas
Let me do some explanation.

>> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
>> byte 4
>>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>
>So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>

If the NAND supports on-die ECC, here should be 10b, not matter it is 8bit or 4bit,
You are correct, MT29F1G08ABADAWP is 0x2, its explanation is 4bit/512bytes ECC.
But for the 70s, it is 8bit on-die ECC, but it is still 10b. 
So that why here using these two bits to determine if exist on-die ECC.
What's more, for some old products, they don't support on-die ECC,
Sometimes, here is still 01b, so still need following codes to do further
determinations.

>> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>
>Did you read my last reply?
>Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
>the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
>ECC or not, so no this test is not reliable.
>
For the on-die ECC, it is not always default enabled. It depends on requirement from costumers.
If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature Operations".

>>                     if (ONFI.byte112 == 4)
>> 		 60s SLC NAND with on-die ECC
>> 	    else if (ONFI.byte112 == 8)
>>      	              70s SLC NAND with on-die ECC
>
>This is completely fucked up! Now the ONFI param page says the NAND requires
>8bits/512bytes, while the ID bytes advertised an on-die ECC providing
>4bits/512bytes correctability.

I think, my previous answers can answer this confusion.

>So either your algorithm is wrong, or the ID and ONFI param page are contracting
>(not sure what solution I'd prefer...).
>
>> 	    else
>>                           Doesn't support on-die ECC
>
>Sorry to say that, but I find it worrisome that even someone from Micron is not
>able to get it right.
>
I am ashamed. I have been in Micron for two years, for some old products, I am also not very clear.
But I checked all the SLC-NAND datasheet with on-die ECC with our AE, and had this final pseudo code.

>I think we'll stick to the model name to detect whether on-die ECC is supported.
>
>Regards,
>
>Boris

beanhuo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 14:49                           ` Boris Brezillon
@ 2017-04-11 15:10                             ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 15:10 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

On Tue, 11 Apr 2017 16:49:52 +0200
Boris Brezillon <boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org> wrote:

> On Tue, 11 Apr 2017 14:26:02 +0000
> "Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:
> 
> > >
> > >Hi Bean,
> > >
> > >On Mon, 3 Apr 2017 11:31:05 +0000
> > >"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:
> > >    
> > >> Hi, Boris and Thomas
> > >>    
> > >> >>
> > >> >> Ok, but I recommend that 70s should be the first choice on this
> > >> >> single solution, it doesn't need to read twice to detect its bitflips count.    
> > >> >
> > >> >That's exactly why we need to differentiate the 2 chips.    
> > >>
> > >> Sorry for later this response.
> > >> Below is the pseudo codes about how to differentiate these 2 series
> > >> parallel NAND with on-die ECC:
> > >>
> > >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> > >> byte 4
> > >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b    
> > >
> > >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> > >    
> > >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled    
> > >
> > >Did you read my last reply?
> > >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> > >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> > >ECC or not, so no this test is not reliable.
> > >    
> > >>                     if (ONFI.byte112 == 4)
> > >> 		 60s SLC NAND with on-die ECC
> > >> 	    else if (ONFI.byte112 == 8)
> > >>      	              70s SLC NAND with on-die ECC    
> > >
> > >This is completely fucked up! Now the ONFI param page says the NAND requires
> > >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> > >4bits/512bytes correctability.
> > >So either your algorithm is wrong, or the ID and ONFI param page are contracting
> > >(not sure what solution I'd prefer...).
> > >    
> > >> 	    else
> > >>                           Doesn't support on-die ECC    
> > >
> > >Sorry to say that, but I find it worrisome that even someone from Micron is not
> > >able to get it right.
> > >    
> > 
> > Sorry, would you please specify which one is wrong or confuse you?  
> 
> The initial 'if (ID.byte4 & 0x80)' is wrong, because this bit is only
> set when someone enabled the ECC engine using the SET_FEATURE command
> (this has been verified by Thomas who tried to disable the feature in
> the bootloader and noticed that on-die ECC was reported as
> 'unsupported' by the kernel).
> 
> Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
> because you apparently only mask bit 1 and not bits 0 and 1.
> Anyway, I can't tell if this is valid because I don't have access to
> the M79A datasheets you're referring to.

Okay, I managed to download the MT29F2G08ABAGAWP datasheet (from the
MT79A family), and it seems that the test should be

	if ((ID.byte4 & 0x03) == 0x02)

and not

	if ((ID.byte4 & 0x02) == 0x02)

Also, this field named "Internal ECC level" clearly does not reflect
the on-die ECC strength because it's set to the same value on both
parts (0x2) while MT29F2G08ABAGAWP provides 8bits/512bytes and
MT29F1G08ABADAWP 4bits/512bytes.

See why I say we can't rely on READ_ID information. It's changing all
the time, and nothing clearly say how to differentiate the scheme used
in a specific NAND part.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 15:10                             ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 15:10 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

On Tue, 11 Apr 2017 16:49:52 +0200
Boris Brezillon <boris.brezillon@free-electrons.com> wrote:

> On Tue, 11 Apr 2017 14:26:02 +0000
> "Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
> 
> > >
> > >Hi Bean,
> > >
> > >On Mon, 3 Apr 2017 11:31:05 +0000
> > >"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
> > >    
> > >> Hi, Boris and Thomas
> > >>    
> > >> >>
> > >> >> Ok, but I recommend that 70s should be the first choice on this
> > >> >> single solution, it doesn't need to read twice to detect its bitflips count.    
> > >> >
> > >> >That's exactly why we need to differentiate the 2 chips.    
> > >>
> > >> Sorry for later this response.
> > >> Below is the pseudo codes about how to differentiate these 2 series
> > >> parallel NAND with on-die ECC:
> > >>
> > >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> > >> byte 4
> > >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b    
> > >
> > >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> > >    
> > >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled    
> > >
> > >Did you read my last reply?
> > >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> > >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> > >ECC or not, so no this test is not reliable.
> > >    
> > >>                     if (ONFI.byte112 == 4)
> > >> 		 60s SLC NAND with on-die ECC
> > >> 	    else if (ONFI.byte112 == 8)
> > >>      	              70s SLC NAND with on-die ECC    
> > >
> > >This is completely fucked up! Now the ONFI param page says the NAND requires
> > >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> > >4bits/512bytes correctability.
> > >So either your algorithm is wrong, or the ID and ONFI param page are contracting
> > >(not sure what solution I'd prefer...).
> > >    
> > >> 	    else
> > >>                           Doesn't support on-die ECC    
> > >
> > >Sorry to say that, but I find it worrisome that even someone from Micron is not
> > >able to get it right.
> > >    
> > 
> > Sorry, would you please specify which one is wrong or confuse you?  
> 
> The initial 'if (ID.byte4 & 0x80)' is wrong, because this bit is only
> set when someone enabled the ECC engine using the SET_FEATURE command
> (this has been verified by Thomas who tried to disable the feature in
> the bootloader and noticed that on-die ECC was reported as
> 'unsupported' by the kernel).
> 
> Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
> because you apparently only mask bit 1 and not bits 0 and 1.
> Anyway, I can't tell if this is valid because I don't have access to
> the M79A datasheets you're referring to.

Okay, I managed to download the MT29F2G08ABAGAWP datasheet (from the
MT79A family), and it seems that the test should be

	if ((ID.byte4 & 0x03) == 0x02)

and not

	if ((ID.byte4 & 0x02) == 0x02)

Also, this field named "Internal ECC level" clearly does not reflect
the on-die ECC strength because it's set to the same value on both
parts (0x2) while MT29F2G08ABAGAWP provides 8bits/512bytes and
MT29F1G08ABADAWP 4bits/512bytes.

See why I say we can't rely on READ_ID information. It's changing all
the time, and nothing clearly say how to differentiate the scheme used
in a specific NAND part.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 15:10                             ` Boris Brezillon
@ 2017-04-11 15:28                               ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 15:28 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

Hi, Boris 

>> Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
>> because you apparently only mask bit 1 and not bits 0 and 1.

Sorry, here is my wrong, it should be masked with 0x3, not 0x02. 

>> Anyway, I can't tell if this is valid because I don't have access to
>> the M79A datasheets you're referring to.
>
>Okay, I managed to download the MT29F2G08ABAGAWP datasheet (from the
>MT79A family), and it seems that the test should be
>
>	if ((ID.byte4 & 0x03) == 0x02)
This is correct, should be 0x03, not 0x02.
>
>and not
>
>	if ((ID.byte4 & 0x02) == 0x02)
>
>Also, this field named "Internal ECC level" clearly does not reflect the on-die ECC
>strength because it's set to the same value on both parts (0x2) while
>MT29F2G08ABAGAWP provides 8bits/512bytes and MT29F1G08ABADAWP
>4bits/512bytes.
>
>See why I say we can't rely on READ_ID information. It's changing all the time,
>and nothing clearly say how to differentiate the scheme used in a specific NAND
>part.

Correct, so far there is no standard to define ID.byte4, every vendor with their own definition.
Even with the same vendor, for the different products, this is also changing.

Maintain one table in MTD, it is a simple and convenient way to differentiate, but, in my opinion,
This is not linux trend. We have method to automatically detect, why still maintain an additional table.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 15:28                               ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 15:28 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

Hi, Boris 

>> Maybe I was wrong about your 'if ((ID.byte4 & 0x02) == 0x02)' test,
>> because you apparently only mask bit 1 and not bits 0 and 1.

Sorry, here is my wrong, it should be masked with 0x3, not 0x02. 

>> Anyway, I can't tell if this is valid because I don't have access to
>> the M79A datasheets you're referring to.
>
>Okay, I managed to download the MT29F2G08ABAGAWP datasheet (from the
>MT79A family), and it seems that the test should be
>
>	if ((ID.byte4 & 0x03) == 0x02)
This is correct, should be 0x03, not 0x02.
>
>and not
>
>	if ((ID.byte4 & 0x02) == 0x02)
>
>Also, this field named "Internal ECC level" clearly does not reflect the on-die ECC
>strength because it's set to the same value on both parts (0x2) while
>MT29F2G08ABAGAWP provides 8bits/512bytes and MT29F1G08ABADAWP
>4bits/512bytes.
>
>See why I say we can't rely on READ_ID information. It's changing all the time,
>and nothing clearly say how to differentiate the scheme used in a specific NAND
>part.

Correct, so far there is no standard to define ID.byte4, every vendor with their own definition.
Even with the same vendor, for the different products, this is also changing.

Maintain one table in MTD, it is a simple and convenient way to differentiate, but, in my opinion,
This is not linux trend. We have method to automatically detect, why still maintain an additional table.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 15:02                       ` Bean Huo (beanhuo)
@ 2017-04-11 15:30                           ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 15:30 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

On Tue, 11 Apr 2017 15:02:22 +0000
"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:

> Hi, Boris and Thomas
> Let me do some explanation.
> 
> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> >> byte 4
> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >
> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >  
> 
> If the NAND supports on-die ECC, here should be 10b, not matter it is 8bit or 4bit,
> You are correct, MT29F1G08ABADAWP is 0x2, its explanation is 4bit/512bytes ECC.
> But for the 70s, it is 8bit on-die ECC, but it is still 10b. 
> So that why here using these two bits to determine if exist on-die ECC.
> What's more, for some old products, they don't support on-die ECC,
> Sometimes, here is still 01b, so still need following codes to do further
> determinations.

Okay, then here is the differentiator. Did you check that on SLC NANDs
there's no collision on ID[4].bits[1:0]. I've seen NAND vendors
changing their ID scheme in incompatible ways (old fields were
replaced by new ones with completely different meanings).

I'd really like to make sure we're not mis-interpreting READ_ID
information, so maybe we should restrict the test on ONFI NANDs if all
NANDs supporting on-die ECC are ONFI compliant. We should probably also
check that chip->id.len >= 5.


> 
> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >
> >Did you read my last reply?
> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> >ECC or not, so no this test is not reliable.
> >  
> For the on-die ECC, it is not always default enabled. It depends on requirement from costumers.
> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature Operations".

So this check is not needed, right?
BTW, do you have NANDs where the on-die ECC is always enabled, and if
this is the case, what happens when you call
SET_FEATURE(disable/enable-ECC) on these NANDs?

> 
> >>                     if (ONFI.byte112 == 4)
> >> 		 60s SLC NAND with on-die ECC
> >> 	    else if (ONFI.byte112 == 8)
> >>      	              70s SLC NAND with on-die ECC  
> >
> >This is completely fucked up! Now the ONFI param page says the NAND requires
> >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> >4bits/512bytes correctability.  
> 
> I think, my previous answers can answer this confusion.

Yep. BTW, sorry for being so harsh in my previous reply.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 15:30                           ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-11 15:30 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

On Tue, 11 Apr 2017 15:02:22 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> Hi, Boris and Thomas
> Let me do some explanation.
> 
> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device ID
> >> byte 4
> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >
> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >  
> 
> If the NAND supports on-die ECC, here should be 10b, not matter it is 8bit or 4bit,
> You are correct, MT29F1G08ABADAWP is 0x2, its explanation is 4bit/512bytes ECC.
> But for the 70s, it is 8bit on-die ECC, but it is still 10b. 
> So that why here using these two bits to determine if exist on-die ECC.
> What's more, for some old products, they don't support on-die ECC,
> Sometimes, here is still 01b, so still need following codes to do further
> determinations.

Okay, then here is the differentiator. Did you check that on SLC NANDs
there's no collision on ID[4].bits[1:0]. I've seen NAND vendors
changing their ID scheme in incompatible ways (old fields were
replaced by new ones with completely different meanings).

I'd really like to make sure we're not mis-interpreting READ_ID
information, so maybe we should restrict the test on ONFI NANDs if all
NANDs supporting on-die ECC are ONFI compliant. We should probably also
check that chip->id.len >= 5.


> 
> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >
> >Did you read my last reply?
> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC engine state (1 if
> >the engine is enabled, 0 if it's disabled), not whether the NAND supports on-die
> >ECC or not, so no this test is not reliable.
> >  
> For the on-die ECC, it is not always default enabled. It depends on requirement from costumers.
> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature Operations".

So this check is not needed, right?
BTW, do you have NANDs where the on-die ECC is always enabled, and if
this is the case, what happens when you call
SET_FEATURE(disable/enable-ECC) on these NANDs?

> 
> >>                     if (ONFI.byte112 == 4)
> >> 		 60s SLC NAND with on-die ECC
> >> 	    else if (ONFI.byte112 == 8)
> >>      	              70s SLC NAND with on-die ECC  
> >
> >This is completely fucked up! Now the ONFI param page says the NAND requires
> >8bits/512bytes, while the ID bytes advertised an on-die ECC providing
> >4bits/512bytes correctability.  
> 
> I think, my previous answers can answer this confusion.

Yep. BTW, sorry for being so harsh in my previous reply.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 15:30                           ` Boris Brezillon
@ 2017-04-11 17:01                             ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 17:01 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

>On Tue, 11 Apr 2017 15:02:22 +0000
>"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:
>
>> Hi, Boris and Thomas
>> Let me do some explanation.
>>
>> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device
>> >> ID byte 4
>> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>> >
>> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>> >
>>
>> If the NAND supports on-die ECC, here should be 10b, not matter it is
>> 8bit or 4bit, You are correct, MT29F1G08ABADAWP is 0x2, its explanation is
>4bit/512bytes ECC.
>> But for the 70s, it is 8bit on-die ECC, but it is still 10b.
>> So that why here using these two bits to determine if exist on-die ECC.
>> What's more, for some old products, they don't support on-die ECC,
>> Sometimes, here is still 01b, so still need following codes to do
>> further determinations.
>
>Okay, then here is the differentiator. Did you check that on SLC NANDs there's no
>collision on ID[4].bits[1:0]. I've seen NAND vendors changing their ID scheme in
>incompatible ways (old fields were replaced by new ones with completely
>different meanings).


Yes, this is true, there is no one standard to define and formalize ID.byte4,
It is always changing. Also, sometimes it definitely conflicts with other NAND without
On-die ECC. For the Micron both serials SLC NAND with on-die ECC, bits[1:0] is defined
Internal ECC level. 

>I'd really like to make sure we're not mis-interpreting READ_ID information, so
>maybe we should restrict the test on ONFI NANDs if all NANDs supporting on-die
>ECC are ONFI compliant. We should probably also check that chip->id.len >= 5.
>
>
>>
>> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>> >
>> >Did you read my last reply?
>> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC
>> >engine state (1 if the engine is enabled, 0 if it's disabled), not
>> >whether the NAND supports on-die ECC or not, so no this test is not reliable.
>> >
>> For the on-die ECC, it is not always default enabled. It depends on requirement
>from costumers.
>> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature
>Operations".
>
>So this check is not needed, right?

Here is much complicated. One question is that what main purpose of on-die ECC.
there are two types of usage model:
1.  on-die ECC default enabled:
Normally before bootloader and kernel, there is no any ECC to correct and maintain
Bootloader reliability.  For this kind of customer, I think, they mainly want to have reliable booting.
Rather than for store user data. Per this kind of condition, we don't check, because on-die ECC
Always be enabled, cannot be disabled.

2. on-die ECC default disabled:
I think this is used for some important user data. Unless the bootrom of CPU can issue 
SET_FEATURE to enable on-die ECC, and until Linux running, on-die ECC is still enabled.
Otherwise, we need to check if it enables or not.

>BTW, do you have NANDs where the on-die ECC is always enabled, and if this is
>the case, what happens when you call
>SET_FEATURE(disable/enable-ECC) on these NANDs?

If this NAND is on-die ECC defaulted enabled, the on-die ECC cannot
Disabled later. Why? This is related to specific user model.
We have one PPT on Micron domain website, it is "on die ECC training",
It opens and can freely download. It clearly describes this.
If you cannot download, please let me know, I send to you.

As for the on-die ECC default disabled, you can freely switch on or off
by SET_FEATURE(disable/enable-ECC).

>
>>
>> >>                     if (ONFI.byte112 == 4)
>> >> 		 60s SLC NAND with on-die ECC
>> >> 	    else if (ONFI.byte112 == 8)
>> >>      	              70s SLC NAND with on-die ECC
>> >
>> >This is completely fucked up! Now the ONFI param page says the NAND
>> >requires 8bits/512bytes, while the ID bytes advertised an on-die ECC
>> >providing 4bits/512bytes correctability.
>>
>> I think, my previous answers can answer this confusion.
>
>Yep. BTW, sorry for being so harsh in my previous reply.
Don't sorry, open source community should like this. And if you have any confusion and
Something fucked up about Micron NAND, please freely speak out and let me know.
If we can give any support, we are very happy. 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-11 17:01                             ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-11 17:01 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

>On Tue, 11 Apr 2017 15:02:22 +0000
>"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
>
>> Hi, Boris and Thomas
>> Let me do some explanation.
>>
>> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device
>> >> ID byte 4
>> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b
>> >
>> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
>> >
>>
>> If the NAND supports on-die ECC, here should be 10b, not matter it is
>> 8bit or 4bit, You are correct, MT29F1G08ABADAWP is 0x2, its explanation is
>4bit/512bytes ECC.
>> But for the 70s, it is 8bit on-die ECC, but it is still 10b.
>> So that why here using these two bits to determine if exist on-die ECC.
>> What's more, for some old products, they don't support on-die ECC,
>> Sometimes, here is still 01b, so still need following codes to do
>> further determinations.
>
>Okay, then here is the differentiator. Did you check that on SLC NANDs there's no
>collision on ID[4].bits[1:0]. I've seen NAND vendors changing their ID scheme in
>incompatible ways (old fields were replaced by new ones with completely
>different meanings).


Yes, this is true, there is no one standard to define and formalize ID.byte4,
It is always changing. Also, sometimes it definitely conflicts with other NAND without
On-die ECC. For the Micron both serials SLC NAND with on-die ECC, bits[1:0] is defined
Internal ECC level. 

>I'd really like to make sure we're not mis-interpreting READ_ID information, so
>maybe we should restrict the test on ONFI NANDs if all NANDs supporting on-die
>ECC are ONFI compliant. We should probably also check that chip->id.len >= 5.
>
>
>>
>> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled
>> >
>> >Did you read my last reply?
>> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC
>> >engine state (1 if the engine is enabled, 0 if it's disabled), not
>> >whether the NAND supports on-die ECC or not, so no this test is not reliable.
>> >
>> For the on-die ECC, it is not always default enabled. It depends on requirement
>from costumers.
>> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature
>Operations".
>
>So this check is not needed, right?

Here is much complicated. One question is that what main purpose of on-die ECC.
there are two types of usage model:
1.  on-die ECC default enabled:
Normally before bootloader and kernel, there is no any ECC to correct and maintain
Bootloader reliability.  For this kind of customer, I think, they mainly want to have reliable booting.
Rather than for store user data. Per this kind of condition, we don't check, because on-die ECC
Always be enabled, cannot be disabled.

2. on-die ECC default disabled:
I think this is used for some important user data. Unless the bootrom of CPU can issue 
SET_FEATURE to enable on-die ECC, and until Linux running, on-die ECC is still enabled.
Otherwise, we need to check if it enables or not.

>BTW, do you have NANDs where the on-die ECC is always enabled, and if this is
>the case, what happens when you call
>SET_FEATURE(disable/enable-ECC) on these NANDs?

If this NAND is on-die ECC defaulted enabled, the on-die ECC cannot
Disabled later. Why? This is related to specific user model.
We have one PPT on Micron domain website, it is "on die ECC training",
It opens and can freely download. It clearly describes this.
If you cannot download, please let me know, I send to you.

As for the on-die ECC default disabled, you can freely switch on or off
by SET_FEATURE(disable/enable-ECC).

>
>>
>> >>                     if (ONFI.byte112 == 4)
>> >> 		 60s SLC NAND with on-die ECC
>> >> 	    else if (ONFI.byte112 == 8)
>> >>      	              70s SLC NAND with on-die ECC
>> >
>> >This is completely fucked up! Now the ONFI param page says the NAND
>> >requires 8bits/512bytes, while the ID bytes advertised an on-die ECC
>> >providing 4bits/512bytes correctability.
>>
>> I think, my previous answers can answer this confusion.
>
>Yep. BTW, sorry for being so harsh in my previous reply.
Don't sorry, open source community should like this. And if you have any confusion and
Something fucked up about Micron NAND, please freely speak out and let me know.
If we can give any support, we are very happy. 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-11 17:01                             ` Bean Huo (beanhuo)
@ 2017-04-12  7:03                                 ` Boris Brezillon
  -1 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-12  7:03 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

On Tue, 11 Apr 2017 17:01:51 +0000
"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:

> >On Tue, 11 Apr 2017 15:02:22 +0000
> >"Bean Huo (beanhuo)" <beanhuo-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org> wrote:
> >  
> >> Hi, Boris and Thomas
> >> Let me do some explanation.
> >>  
> >> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device
> >> >> ID byte 4
> >> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >> >
> >> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >> >  
> >>
> >> If the NAND supports on-die ECC, here should be 10b, not matter it is
> >> 8bit or 4bit, You are correct, MT29F1G08ABADAWP is 0x2, its explanation is  
> >4bit/512bytes ECC.  
> >> But for the 70s, it is 8bit on-die ECC, but it is still 10b.
> >> So that why here using these two bits to determine if exist on-die ECC.
> >> What's more, for some old products, they don't support on-die ECC,
> >> Sometimes, here is still 01b, so still need following codes to do
> >> further determinations.  
> >
> >Okay, then here is the differentiator. Did you check that on SLC NANDs there's no
> >collision on ID[4].bits[1:0]. I've seen NAND vendors changing their ID scheme in
> >incompatible ways (old fields were replaced by new ones with completely
> >different meanings).  
> 
> 
> Yes, this is true, there is no one standard to define and formalize ID.byte4,
> It is always changing. Also, sometimes it definitely conflicts with other NAND without
> On-die ECC. For the Micron both serials SLC NAND with on-die ECC, bits[1:0] is defined
> Internal ECC level. 
> 
> >I'd really like to make sure we're not mis-interpreting READ_ID information, so
> >maybe we should restrict the test on ONFI NANDs if all NANDs supporting on-die
> >ECC are ONFI compliant. We should probably also check that chip->id.len >= 5.
> >
> >  
> >>  
> >> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >> >
> >> >Did you read my last reply?
> >> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC
> >> >engine state (1 if the engine is enabled, 0 if it's disabled), not
> >> >whether the NAND supports on-die ECC or not, so no this test is not reliable.
> >> >  
> >> For the on-die ECC, it is not always default enabled. It depends on requirement  
> >from costumers.  
> >> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature  
> >Operations".
> >
> >So this check is not needed, right?  
> 
> Here is much complicated. One question is that what main purpose of on-die ECC.
> there are two types of usage model:
> 1.  on-die ECC default enabled:
> Normally before bootloader and kernel, there is no any ECC to correct and maintain
> Bootloader reliability.  For this kind of customer, I think, they mainly want to have reliable booting.
> Rather than for store user data. Per this kind of condition, we don't check, because on-die ECC
> Always be enabled, cannot be disabled.
> 
> 2. on-die ECC default disabled:
> I think this is used for some important user data. Unless the bootrom of CPU can issue 
> SET_FEATURE to enable on-die ECC, and until Linux running, on-die ECC is still enabled.
> Otherwise, we need to check if it enables or not.

Well, knowing whether the NAND has on-die ECC or not and determining if
it's enabled by default are 2 different things. Until now, we were
trying to detect the former.

> 
> >BTW, do you have NANDs where the on-die ECC is always enabled, and if this is
> >the case, what happens when you call
> >SET_FEATURE(disable/enable-ECC) on these NANDs?  
> 
> If this NAND is on-die ECC defaulted enabled, the on-die ECC cannot
> Disabled later. Why? This is related to specific user model.

Erf, this changes a bit what Thomas and I had in mind, because that
means read/write_page_raw() are not supported in this case, and more
importantly, that means users should by no mean enable external ECC
engines.

> We have one PPT on Micron domain website, it is "on die ECC training",
> It opens and can freely download. It clearly describes this.

Okay, I'll try to download this document.

One last question. Is it dangerous to call
SET_FEATURE(disable/enable-ECC) on a NAND that has ECC enabled by
default? We could use that to detect whether on-die ECC can be turned
off or not and adjust the chip->ecc init steps accordingly.

Thanks,

Boris
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-12  7:03                                 ` Boris Brezillon
  0 siblings, 0 replies; 35+ messages in thread
From: Boris Brezillon @ 2017-04-12  7:03 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

On Tue, 11 Apr 2017 17:01:51 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> >On Tue, 11 Apr 2017 15:02:22 +0000
> >"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:
> >  
> >> Hi, Boris and Thomas
> >> Let me do some explanation.
> >>  
> >> >> if (NAND == SLC ) { // on-die ECC only exists in SLC //check device
> >> >> ID byte 4
> >> >>      if ((ID.byte4 & 0x02) == 0x02) {// internal ECC level ==10b  
> >> >
> >> >So here the MT29F1G08ABADAWP datasheet says 0x2 <=> 4bit/512bytes ECC.
> >> >  
> >>
> >> If the NAND supports on-die ECC, here should be 10b, not matter it is
> >> 8bit or 4bit, You are correct, MT29F1G08ABADAWP is 0x2, its explanation is  
> >4bit/512bytes ECC.  
> >> But for the 70s, it is 8bit on-die ECC, but it is still 10b.
> >> So that why here using these two bits to determine if exist on-die ECC.
> >> What's more, for some old products, they don't support on-die ECC,
> >> Sometimes, here is still 01b, so still need following codes to do
> >> further determinations.  
> >
> >Okay, then here is the differentiator. Did you check that on SLC NANDs there's no
> >collision on ID[4].bits[1:0]. I've seen NAND vendors changing their ID scheme in
> >incompatible ways (old fields were replaced by new ones with completely
> >different meanings).  
> 
> 
> Yes, this is true, there is no one standard to define and formalize ID.byte4,
> It is always changing. Also, sometimes it definitely conflicts with other NAND without
> On-die ECC. For the Micron both serials SLC NAND with on-die ECC, bits[1:0] is defined
> Internal ECC level. 
> 
> >I'd really like to make sure we're not mis-interpreting READ_ID information, so
> >maybe we should restrict the test on ONFI NANDs if all NANDs supporting on-die
> >ECC are ONFI compliant. We should probably also check that chip->id.len >= 5.
> >
> >  
> >>  
> >> >> 	if (ID.byte4 & 0x80) {//on-Die ECC enabled  
> >> >
> >> >Did you read my last reply?
> >> >Thomas discovered that ID[4].bit7 is actually reflecting the ECC
> >> >engine state (1 if the engine is enabled, 0 if it's disabled), not
> >> >whether the NAND supports on-die ECC or not, so no this test is not reliable.
> >> >  
> >> For the on-die ECC, it is not always default enabled. It depends on requirement  
> >from costumers.  
> >> If on-die ECC is not enabled, bit7 is 0. It can be switched through "Feature  
> >Operations".
> >
> >So this check is not needed, right?  
> 
> Here is much complicated. One question is that what main purpose of on-die ECC.
> there are two types of usage model:
> 1.  on-die ECC default enabled:
> Normally before bootloader and kernel, there is no any ECC to correct and maintain
> Bootloader reliability.  For this kind of customer, I think, they mainly want to have reliable booting.
> Rather than for store user data. Per this kind of condition, we don't check, because on-die ECC
> Always be enabled, cannot be disabled.
> 
> 2. on-die ECC default disabled:
> I think this is used for some important user data. Unless the bootrom of CPU can issue 
> SET_FEATURE to enable on-die ECC, and until Linux running, on-die ECC is still enabled.
> Otherwise, we need to check if it enables or not.

Well, knowing whether the NAND has on-die ECC or not and determining if
it's enabled by default are 2 different things. Until now, we were
trying to detect the former.

> 
> >BTW, do you have NANDs where the on-die ECC is always enabled, and if this is
> >the case, what happens when you call
> >SET_FEATURE(disable/enable-ECC) on these NANDs?  
> 
> If this NAND is on-die ECC defaulted enabled, the on-die ECC cannot
> Disabled later. Why? This is related to specific user model.

Erf, this changes a bit what Thomas and I had in mind, because that
means read/write_page_raw() are not supported in this case, and more
importantly, that means users should by no mean enable external ECC
engines.

> We have one PPT on Micron domain website, it is "on die ECC training",
> It opens and can freely download. It clearly describes this.

Okay, I'll try to download this document.

One last question. Is it dangerous to call
SET_FEATURE(disable/enable-ECC) on a NAND that has ECC enabled by
default? We could use that to detect whether on-die ECC can be turned
off or not and adjust the chip->ecc init steps accordingly.

Thanks,

Boris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-04-12  7:03                                 ` Boris Brezillon
@ 2017-04-13 14:08                                   ` Bean Huo (beanhuo)
  -1 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-13 14:08 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Campbell, richard-/L3Ra7n9ekc,
	Mark Rutland, marek.vasut-Re5JQEeQqe8AvxtiuMwx3w, Rob Herring,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Cyrille Pitchen,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w

Hi, Boris

>One last question. Is it dangerous to call
>SET_FEATURE(disable/enable-ECC) on a NAND that has ECC enabled by default?
>We could use that to detect whether on-die ECC can be turned off or not and
>adjust the chip->ecc init steps accordingly.
>
No any danger exist, NAND just gives response that set feature failed.
Also, it is a good solution to detect whether on-die ECC support or not through
SET_FEATURE(disable/enable-ECC) successful or not.

>Thanks,
>
>Boris

Happy Easter!
//Bean
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-04-13 14:08                                   ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 35+ messages in thread
From: Bean Huo (beanhuo) @ 2017-04-13 14:08 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Petazzoni, devicetree, pawel.moll, Campbell, richard,
	Mark Rutland, marek.vasut, Rob Herring, linux-mtd, galak,
	Cyrille Pitchen, computersforpeace

Hi, Boris

>One last question. Is it dangerous to call
>SET_FEATURE(disable/enable-ECC) on a NAND that has ECC enabled by default?
>We could use that to detect whether on-die ECC can be turned off or not and
>adjust the chip->ecc init steps accordingly.
>
No any danger exist, NAND just gives response that set feature failed.
Also, it is a good solution to detect whether on-die ECC support or not through
SET_FEATURE(disable/enable-ECC) successful or not.

>Thanks,
>
>Boris

Happy Easter!
//Bean

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
  2017-03-21 10:38 [PATCH 0/5] mtd: nand: add support for " Thomas Petazzoni
@ 2017-03-21 10:38     ` Thomas Petazzoni
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Petazzoni @ 2017-03-21 10:38 UTC (permalink / raw)
  To: Boris Brezillon, Marek Vasut, Richard Weinberger,
	Cyrille Pitchen, David Woodhouse, Brian Norris,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Ian Campbell,
	Pawel Moll, Mark Rutland, Kumar Gala
  Cc: Thomas Petazzoni

Now that the core NAND subsystem has support for on-die ECC, this commit
brings the necessary code to support on-die ECC on Micron NANDs.

In micron_nand_init(), if the Device Tree indicates an on-die ECC mode,
we verify if the NAND chip actually has ECC support and if so, fill in
the appropriate fields of nand_chip->ecc and defines the OOB layout
according to the NAND datasheet.

We then provide an implementation of the ->read_page(),
->read_page_raw(), ->write_page() and ->write_page_raw() operation to
properly handle the on-die ECC.

In the non-raw functions, we need to enable the internal ECC engine
before issuing the NAND_CMD_READ0 or NAND_CMD_SEQIN commands, which is
why we set the NAND_ECC_CUSTOM_PAGE_ACCESS option at initialization
time (it asks the NAND core to let the NAND driver issue those
commands).

Signed-off-by: Thomas Petazzoni <thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
---
 drivers/mtd/nand/nand_micron.c | 147 +++++++++++++++++++++++++++++++++++++++++
 include/linux/mtd/nand.h       |   2 +
 2 files changed, 149 insertions(+)

diff --git a/drivers/mtd/nand/nand_micron.c b/drivers/mtd/nand/nand_micron.c
index 8770110..e9a445d 100644
--- a/drivers/mtd/nand/nand_micron.c
+++ b/drivers/mtd/nand/nand_micron.c
@@ -17,6 +17,18 @@
 
 #include <linux/mtd/nand.h>
 
+/*
+ * Special Micron status bit that indicates when the block has been
+ * corrected by on-die ECC and should be rewritten
+ */
+#define NAND_STATUS_WRITE_RECOMMENDED	BIT(3)
+
+/*
+ * READ ID bit that indicates if the NAND has on-die ECC support,
+ * located in byte 4.
+ */
+#define NAND_MICRON_ON_DIE_ECC_CAPABLE	BIT(7)
+
 struct nand_onfi_vendor_micron {
 	u8 two_plane_read;
 	u8 read_cache;
@@ -66,6 +78,121 @@ static int micron_nand_onfi_init(struct nand_chip *chip)
 	return 0;
 }
 
+static int micron_nand_on_die_ooblayout_ecc(struct mtd_info *mtd, int section,
+					    struct mtd_oob_region *oobregion)
+{
+	if (section >= 4)
+		return -ERANGE;
+
+	oobregion->offset = (section * 16) + 8;
+	oobregion->length = 8;
+
+	return 0;
+}
+
+static int micron_nand_on_die_ooblayout_free(struct mtd_info *mtd, int section,
+					     struct mtd_oob_region *oobregion)
+{
+	if (section >= 4)
+		return -ERANGE;
+
+	oobregion->offset = (section * 16) + 2;
+	oobregion->length = 6;
+
+	return 0;
+}
+
+static const struct mtd_ooblayout_ops micron_nand_on_die_ooblayout_ops = {
+	.ecc = micron_nand_on_die_ooblayout_ecc,
+	.free = micron_nand_on_die_ooblayout_free,
+};
+
+static void micron_nand_on_die_ecc_setup(struct nand_chip *chip, bool enable)
+{
+	u8 feature[ONFI_SUBFEATURE_PARAM_LEN] = { 0, };
+
+	if (enable)
+		feature[0] |= ONFI_FEATURE_ON_DIE_ECC_EN;
+
+	chip->onfi_set_features(nand_to_mtd(chip), chip,
+				ONFI_FEATURE_ON_DIE_ECC, feature);
+}
+
+static int
+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
+				 uint8_t *buf, int oob_required,
+				 int page)
+{
+	int status;
+	int max_bitflips = 0;
+
+	micron_nand_on_die_ecc_setup(chip, true);
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
+	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+	status = chip->read_byte(mtd);
+	if (status & NAND_STATUS_FAIL)
+		mtd->ecc_stats.failed++;
+	/*
+	 * The internal ECC doesn't tell us the number of bitflips
+	 * that have been corrected, but tells us if it recommends to
+	 * rewrite the block. If it's the case, then we pretend we had
+	 * a number of bitflips equal to the ECC strength, which will
+	 * hint the NAND core to rewrite the block.
+	 */
+	else if (status & NAND_STATUS_WRITE_RECOMMENDED)
+		max_bitflips = chip->ecc.strength;
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
+
+	nand_read_page_raw(mtd, chip, buf, oob_required, page);
+
+	micron_nand_on_die_ecc_setup(chip, false);
+
+	return max_bitflips;
+}
+
+static int
+micron_nand_write_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
+				  const uint8_t *buf, int oob_required,
+				  int page)
+{
+	micron_nand_on_die_ecc_setup(chip, true);
+
+	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
+	nand_write_page_raw(mtd, chip, buf, oob_required, page);
+	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+
+	micron_nand_on_die_ecc_setup(chip, false);
+
+	return 0;
+}
+
+static int
+micron_nand_read_page_raw_on_die_ecc(struct mtd_info *mtd,
+				     struct nand_chip *chip,
+				     uint8_t *buf, int oob_required,
+				     int page)
+{
+	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
+	nand_read_page_raw(mtd, chip, buf, oob_required, page);
+
+	return 0;
+}
+
+static int
+micron_nand_write_page_raw_on_die_ecc(struct mtd_info *mtd,
+				      struct nand_chip *chip,
+				      const uint8_t *buf, int oob_required,
+				      int page)
+{
+	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
+	nand_write_page_raw(mtd, chip, buf, oob_required, page);
+	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+
+	return 0;
+}
+
 static int micron_nand_init(struct nand_chip *chip)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
@@ -78,6 +205,26 @@ static int micron_nand_init(struct nand_chip *chip)
 	if (mtd->writesize == 2048)
 		chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
 
+	if (chip->ecc.mode == NAND_ECC_ON_DIE) {
+		if ((chip->id.data[4] & NAND_MICRON_ON_DIE_ECC_CAPABLE) == 0) {
+			pr_err("On-die ECC selected but not supported\n");
+			return -EINVAL;
+		}
+
+		chip->ecc.options = NAND_ECC_CUSTOM_PAGE_ACCESS;
+		chip->ecc.bytes = 32;
+		chip->ecc.strength = 4;
+		chip->ecc.algo = NAND_ECC_BCH;
+		chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
+		chip->ecc.write_page = micron_nand_write_page_on_die_ecc;
+		chip->ecc.read_page_raw =
+			micron_nand_read_page_raw_on_die_ecc;
+		chip->ecc.write_page_raw =
+			micron_nand_write_page_raw_on_die_ecc;
+
+		mtd_set_ooblayout(mtd, &micron_nand_on_die_ooblayout_ops);
+	}
+
 	return 0;
 }
 
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 5fc705d..3320ffd 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -258,6 +258,8 @@ struct nand_chip;
 
 /* Vendor-specific feature address (Micron) */
 #define ONFI_FEATURE_ADDR_READ_RETRY	0x89
+#define ONFI_FEATURE_ON_DIE_ECC		0x90
+#define   ONFI_FEATURE_ON_DIE_ECC_EN	BIT(3)
 
 /* ONFI subfeature parameters length */
 #define ONFI_SUBFEATURE_PARAM_LEN	4
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/5] mtd: nand: add support for Micron on-die ECC
@ 2017-03-21 10:38     ` Thomas Petazzoni
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Petazzoni @ 2017-03-21 10:38 UTC (permalink / raw)
  To: Boris Brezillon, Marek Vasut, Richard Weinberger,
	Cyrille Pitchen, David Woodhouse, Brian Norris, linux-mtd,
	devicetree, Rob Herring, Ian Campbell, Pawel Moll, Mark Rutland,
	Kumar Gala
  Cc: Thomas Petazzoni

Now that the core NAND subsystem has support for on-die ECC, this commit
brings the necessary code to support on-die ECC on Micron NANDs.

In micron_nand_init(), if the Device Tree indicates an on-die ECC mode,
we verify if the NAND chip actually has ECC support and if so, fill in
the appropriate fields of nand_chip->ecc and defines the OOB layout
according to the NAND datasheet.

We then provide an implementation of the ->read_page(),
->read_page_raw(), ->write_page() and ->write_page_raw() operation to
properly handle the on-die ECC.

In the non-raw functions, we need to enable the internal ECC engine
before issuing the NAND_CMD_READ0 or NAND_CMD_SEQIN commands, which is
why we set the NAND_ECC_CUSTOM_PAGE_ACCESS option at initialization
time (it asks the NAND core to let the NAND driver issue those
commands).

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 drivers/mtd/nand/nand_micron.c | 147 +++++++++++++++++++++++++++++++++++++++++
 include/linux/mtd/nand.h       |   2 +
 2 files changed, 149 insertions(+)

diff --git a/drivers/mtd/nand/nand_micron.c b/drivers/mtd/nand/nand_micron.c
index 8770110..e9a445d 100644
--- a/drivers/mtd/nand/nand_micron.c
+++ b/drivers/mtd/nand/nand_micron.c
@@ -17,6 +17,18 @@
 
 #include <linux/mtd/nand.h>
 
+/*
+ * Special Micron status bit that indicates when the block has been
+ * corrected by on-die ECC and should be rewritten
+ */
+#define NAND_STATUS_WRITE_RECOMMENDED	BIT(3)
+
+/*
+ * READ ID bit that indicates if the NAND has on-die ECC support,
+ * located in byte 4.
+ */
+#define NAND_MICRON_ON_DIE_ECC_CAPABLE	BIT(7)
+
 struct nand_onfi_vendor_micron {
 	u8 two_plane_read;
 	u8 read_cache;
@@ -66,6 +78,121 @@ static int micron_nand_onfi_init(struct nand_chip *chip)
 	return 0;
 }
 
+static int micron_nand_on_die_ooblayout_ecc(struct mtd_info *mtd, int section,
+					    struct mtd_oob_region *oobregion)
+{
+	if (section >= 4)
+		return -ERANGE;
+
+	oobregion->offset = (section * 16) + 8;
+	oobregion->length = 8;
+
+	return 0;
+}
+
+static int micron_nand_on_die_ooblayout_free(struct mtd_info *mtd, int section,
+					     struct mtd_oob_region *oobregion)
+{
+	if (section >= 4)
+		return -ERANGE;
+
+	oobregion->offset = (section * 16) + 2;
+	oobregion->length = 6;
+
+	return 0;
+}
+
+static const struct mtd_ooblayout_ops micron_nand_on_die_ooblayout_ops = {
+	.ecc = micron_nand_on_die_ooblayout_ecc,
+	.free = micron_nand_on_die_ooblayout_free,
+};
+
+static void micron_nand_on_die_ecc_setup(struct nand_chip *chip, bool enable)
+{
+	u8 feature[ONFI_SUBFEATURE_PARAM_LEN] = { 0, };
+
+	if (enable)
+		feature[0] |= ONFI_FEATURE_ON_DIE_ECC_EN;
+
+	chip->onfi_set_features(nand_to_mtd(chip), chip,
+				ONFI_FEATURE_ON_DIE_ECC, feature);
+}
+
+static int
+micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
+				 uint8_t *buf, int oob_required,
+				 int page)
+{
+	int status;
+	int max_bitflips = 0;
+
+	micron_nand_on_die_ecc_setup(chip, true);
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
+	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+	status = chip->read_byte(mtd);
+	if (status & NAND_STATUS_FAIL)
+		mtd->ecc_stats.failed++;
+	/*
+	 * The internal ECC doesn't tell us the number of bitflips
+	 * that have been corrected, but tells us if it recommends to
+	 * rewrite the block. If it's the case, then we pretend we had
+	 * a number of bitflips equal to the ECC strength, which will
+	 * hint the NAND core to rewrite the block.
+	 */
+	else if (status & NAND_STATUS_WRITE_RECOMMENDED)
+		max_bitflips = chip->ecc.strength;
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
+
+	nand_read_page_raw(mtd, chip, buf, oob_required, page);
+
+	micron_nand_on_die_ecc_setup(chip, false);
+
+	return max_bitflips;
+}
+
+static int
+micron_nand_write_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
+				  const uint8_t *buf, int oob_required,
+				  int page)
+{
+	micron_nand_on_die_ecc_setup(chip, true);
+
+	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
+	nand_write_page_raw(mtd, chip, buf, oob_required, page);
+	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+
+	micron_nand_on_die_ecc_setup(chip, false);
+
+	return 0;
+}
+
+static int
+micron_nand_read_page_raw_on_die_ecc(struct mtd_info *mtd,
+				     struct nand_chip *chip,
+				     uint8_t *buf, int oob_required,
+				     int page)
+{
+	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
+	nand_read_page_raw(mtd, chip, buf, oob_required, page);
+
+	return 0;
+}
+
+static int
+micron_nand_write_page_raw_on_die_ecc(struct mtd_info *mtd,
+				      struct nand_chip *chip,
+				      const uint8_t *buf, int oob_required,
+				      int page)
+{
+	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
+	nand_write_page_raw(mtd, chip, buf, oob_required, page);
+	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+
+	return 0;
+}
+
 static int micron_nand_init(struct nand_chip *chip)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
@@ -78,6 +205,26 @@ static int micron_nand_init(struct nand_chip *chip)
 	if (mtd->writesize == 2048)
 		chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
 
+	if (chip->ecc.mode == NAND_ECC_ON_DIE) {
+		if ((chip->id.data[4] & NAND_MICRON_ON_DIE_ECC_CAPABLE) == 0) {
+			pr_err("On-die ECC selected but not supported\n");
+			return -EINVAL;
+		}
+
+		chip->ecc.options = NAND_ECC_CUSTOM_PAGE_ACCESS;
+		chip->ecc.bytes = 32;
+		chip->ecc.strength = 4;
+		chip->ecc.algo = NAND_ECC_BCH;
+		chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
+		chip->ecc.write_page = micron_nand_write_page_on_die_ecc;
+		chip->ecc.read_page_raw =
+			micron_nand_read_page_raw_on_die_ecc;
+		chip->ecc.write_page_raw =
+			micron_nand_write_page_raw_on_die_ecc;
+
+		mtd_set_ooblayout(mtd, &micron_nand_on_die_ooblayout_ops);
+	}
+
 	return 0;
 }
 
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 5fc705d..3320ffd 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -258,6 +258,8 @@ struct nand_chip;
 
 /* Vendor-specific feature address (Micron) */
 #define ONFI_FEATURE_ADDR_READ_RETRY	0x89
+#define ONFI_FEATURE_ON_DIE_ECC		0x90
+#define   ONFI_FEATURE_ON_DIE_ECC_EN	BIT(3)
 
 /* ONFI subfeature parameters length */
 #define ONFI_SUBFEATURE_PARAM_LEN	4
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2017-04-13 14:09 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <538805ebf8e64015a8b833de755652b3@SIWEX5A.sing.micron.com>
     [not found] ` <538805ebf8e64015a8b833de755652b3-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-03-22 13:20   ` [PATCH 4/5] mtd: nand: add support for Micron on-die ECC Bean Huo (beanhuo)
2017-03-22 13:20     ` Bean Huo (beanhuo)
     [not found]     ` <8a171dacd20c45bd8285ecc5dbe8854a-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-03-22 13:45       ` Boris Brezillon
2017-03-22 13:45         ` Boris Brezillon
2017-03-22 14:01         ` Arnaud Mouiche
2017-03-22 14:39         ` Bean Huo (beanhuo)
2017-03-22 14:39           ` Bean Huo (beanhuo)
     [not found]           ` <0dccc0abcf234e98be6d340027cf1a30-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-03-22 14:52             ` Boris Brezillon
2017-03-22 14:52               ` Boris Brezillon
2017-03-22 17:11               ` Bean Huo (beanhuo)
2017-03-22 17:11                 ` Bean Huo (beanhuo)
2017-04-03 11:31               ` Bean Huo (beanhuo)
2017-04-03 11:31                 ` Bean Huo (beanhuo)
     [not found]                 ` <414dd35931814ce38381a251917ad79f-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-04-11 12:51                   ` Boris Brezillon
2017-04-11 12:51                     ` Boris Brezillon
2017-04-11 14:26                     ` Bean Huo (beanhuo)
2017-04-11 14:26                       ` Bean Huo (beanhuo)
     [not found]                       ` <106593e04c494120b323836b8bc54f7f-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-04-11 14:49                         ` Boris Brezillon
2017-04-11 14:49                           ` Boris Brezillon
2017-04-11 15:10                           ` Boris Brezillon
2017-04-11 15:10                             ` Boris Brezillon
2017-04-11 15:28                             ` Bean Huo (beanhuo)
2017-04-11 15:28                               ` Bean Huo (beanhuo)
2017-04-11 15:02                     ` Bean Huo (beanhuo)
2017-04-11 15:02                       ` Bean Huo (beanhuo)
     [not found]                       ` <90300f14cd2a4ae6967d8be0f7dff4e9-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-04-11 15:30                         ` Boris Brezillon
2017-04-11 15:30                           ` Boris Brezillon
2017-04-11 17:01                           ` Bean Huo (beanhuo)
2017-04-11 17:01                             ` Bean Huo (beanhuo)
     [not found]                             ` <5a96e73ef951414a82c01b67088b24d3-aBoyCxvc2dBaXkNJqdKpEhSpLNRU/VIH@public.gmane.org>
2017-04-12  7:03                               ` Boris Brezillon
2017-04-12  7:03                                 ` Boris Brezillon
2017-04-13 14:08                                 ` Bean Huo (beanhuo)
2017-04-13 14:08                                   ` Bean Huo (beanhuo)
2017-03-21 10:38 [PATCH 0/5] mtd: nand: add support for " Thomas Petazzoni
     [not found] ` <1490092686-16509-1-git-send-email-thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
2017-03-21 10:38   ` [PATCH 4/5] mtd: nand: add support for Micron " Thomas Petazzoni
2017-03-21 10:38     ` Thomas Petazzoni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.