* enhance ONFI table reliability/stable
@ 2015-07-21 14:42 Bean Huo 霍斌斌 (beanhuo)
  2015-11-18  2:50 ` Brian Norris
  0 siblings, 1 reply; 7+ messages in thread
From: Bean Huo 霍斌斌 (beanhuo) @ 2015-07-21 14:42 UTC (permalink / raw)
  To: linux-kernel, linux-mtd

Hi, 

Recently I ran into some cases concerning ONFI parameter table reliability; at present it relies on a CRC.
If there are bit flips in an ONFI parameter page, a backup copy of the parameter page is used instead.
In the latest Linux, three copies are read by default:

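	/* Issue READ PARAMETER PAGE, then read up to three redundant copies. */
	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
	for (i = 0; i < 3; i++) {
		for (j = 0; j < sizeof(*p); j++)
			((uint8_t *)p)[j] = chip->read_byte(mtd);
		/* The CRC covers bytes 0-253; stop at the first copy that passes. */
		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
				le16_to_cpu(p->crc)) {
			break;
		}
	}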

However, as process technology advances, for TLC and new-generation MLC, I think three copies of
the parameter table are not robust enough. My question is whether there is a good method to protect and correct the parameter page. For example, we could use the Linux software BCH ECC.
Any suggestions and input are welcome; if you have any concerns about this, please feel free to tell me.

* Re: enhance ONFI table reliability/stable
  2015-07-21 14:42 enhance ONFI table reliability/stable Bean Huo 霍斌斌 (beanhuo)
@ 2015-11-18  2:50 ` Brian Norris
  2015-11-19  4:21     ` Bean Huo 霍斌斌 (beanhuo)
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Norris @ 2015-11-18  2:50 UTC (permalink / raw)
  To: Bean Huo 霍斌斌 (beanhuo)
  Cc: linux-kernel, linux-mtd, Boris Brezillon

Hi Bean,

I was sorting through old email and I found this.

On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> Hi, 
> 
> Recently, I faced some case about ONFI table reliability, now it used CRC.
> If there is bit flips in ONFI parameter pages, parameter backup page will be taken. 
> For latest linux,default read three copys.
> 
> 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> 	for (i = 0; i < 3; i++) {
> 		for (j = 0; j < sizeof(*p); j++)
> 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> 				le16_to_cpu(p->crc)) {
> 			break;
> 		}
> 	}
> 
> However ,with technoogy improvement,for TLC and new generatin MLC,I
> think, three copys of 

Ha, "improvement" :)

> Parameter tables is not powerful enough.my question is that if there
> is a good method to protect and corrent parameter page. For example,we
> can use linux software BCH ecc. Any suggections and input be
> welcomed,if you having any concerns about this,don't free tell me.

I recall this being brought up at my old job, and all I can say is...
(please pardon my censored language)

...that is complete and utter bulls***. An ONFI standard that can't
guarantee "reliable enough" parameter pages is no standard at all.

To step back a bit: How would one expect to store and retrieve ECC
parity data? ...on the NAND flash? But to do that, we have to know the
geometry parameters of said NAND flash. How do we figure out the
geometry? From the ONFI parameter pages! Nice Catch 22 you have there.

Please encourage your employer never to produce "ONFI-compliant" flash
that is this bad.

Regards,
Brian

* RE: enhance ONFI table reliability/stable
  2015-11-18  2:50 ` Brian Norris
@ 2015-11-19  4:21     ` Bean Huo 霍斌斌 (beanhuo)
  0 siblings, 0 replies; 7+ messages in thread
From: Bean Huo 霍斌斌 (beanhuo) @ 2015-11-19  4:21 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-kernel, linux-mtd, Boris Brezillon

> 
> Hi Bean,
> 
> I was sorting through old email and I found this.
> 
> On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo)
> wrote:
> > Hi,
> >
> > Recently, I faced some case about ONFI table reliability, now it used CRC.
> > If there is bit flips in ONFI parameter pages, parameter backup page will be
> taken.
> > For latest linux,default read three copys.
> >
> > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > 	for (i = 0; i < 3; i++) {
> > 		for (j = 0; j < sizeof(*p); j++)
> > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > 				le16_to_cpu(p->crc)) {
> > 			break;
> > 		}
> > 	}
> >
> > However ,with technoogy improvement,for TLC and new generatin MLC,I
> > think, three copys of
> 
> Ha, "improvement" :)
> 
> > Parameter tables is not powerful enough.my question is that if there
> > is a good method to protect and corrent parameter page. For example,we
> > can use linux software BCH ecc. Any suggections and input be
> > welcomed,if you having any concerns about this,don't free tell me.
> 
> I recall this being brought up at my old job, and I all I can say is...
> (please pardon my censored language)


Yes, you have talked about this before; I am just following up.
Sorry if my follow-up came across as rude.
I only want to share one suggestion: using software ECC to protect the
ONFI table that is read from NAND. I would like to hear every MTD expert's
valuable feedback on this. If it is OK, I can implement it.

> ...that is complete and utter bulls***. An ONFI standard that can't guarantee
> "reliable enough" parameter pages is no standard at all.
> 
> To step back a bit: How would one expect to store and retrieve ECC parity
> data? ...on the NAND flash? But to do that, we have to know the geometry
> parameters of said NAND flash. How do we figure out the geometry? From the
> ONFI parameter pages! Nice Catch 22 you have there.
> 
> Please encourage your employer never to produce "ONFI-compliant" flash that
> are this bad.
> 
> Regards,
> Brian

* Re: enhance ONFI table reliability/stable
  2015-11-19  4:21     ` Bean Huo 霍斌斌 (beanhuo)
  (?)
@ 2015-11-20 23:59     ` Brian Norris
  2015-11-21  7:46       ` Boris Brezillon
  -1 siblings, 1 reply; 7+ messages in thread
From: Brian Norris @ 2015-11-20 23:59 UTC (permalink / raw)
  To: Bean Huo 霍斌斌 (beanhuo)
  Cc: linux-kernel, linux-mtd, Boris Brezillon

On Thu, Nov 19, 2015 at 04:21:01AM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo)
> > wrote:
> > > Hi,
> > >
> > > Recently, I faced some case about ONFI table reliability, now it used CRC.
> > > If there is bit flips in ONFI parameter pages, parameter backup page will be
> > taken.
> > > For latest linux,default read three copys.
> > >
> > > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > > 	for (i = 0; i < 3; i++) {
> > > 		for (j = 0; j < sizeof(*p); j++)
> > > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > > 				le16_to_cpu(p->crc)) {
> > > 			break;
> > > 		}
> > > 	}
> > >
> > > However ,with technoogy improvement,for TLC and new generatin MLC,I
> > > think, three copys of
> > 
> > Ha, "improvement" :)
> > 
> > > Parameter tables is not powerful enough.my question is that if there
> > > is a good method to protect and corrent parameter page. For example,we
> > > can use linux software BCH ecc. Any suggections and input be
> > > welcomed,if you having any concerns about this,don't free tell me.
> > 
> > I recall this being brought up at my old job, and I all I can say is...
> > (please pardon my censored language)
> 
> 
> Yes , you ever told about this. I just follow.
> Sorry for my rude following.
> I only want to share my one suggestion about using software ECC to protect 
> ONFI table that read from NAND. I want to hear every MTD expert 's valuable 
> Feedback on this. if OK, I can do it. 

Perhaps I'm misunderstanding you, but I don't understand how you could
possibly "do it" if it is a circular dependency. You have nowhere to
store ECC/parity data for a parameter page, because you can't actually
read/write the NAND flash until after you know its geometry.

> > ...that is complete and utter bulls***. An ONFI standard that can't guarantee
> > "reliable enough" parameter pages is no standard at all.
> > 
> > To step back a bit: How would one expect to store and retrieve ECC parity
> > data? ...on the NAND flash? But to do that, we have to know the geometry
> > parameters of said NAND flash. How do we figure out the geometry? From the
> > ONFI parameter pages! Nice Catch 22 you have there.

I realize a non-native English speaker might not understand the "Catch
22" reference. Wikipedia has a nice summary:

  https://en.wikipedia.org/wiki/Catch-22_(logic)

Essentially, it's a circular argument, or a contradiction. An
impossibility.

> > Please encourage your employer never to produce "ONFI-compliant" flash that
> > are this bad.

I still stand by the above statement.

But now that I'm in a slightly more charitable mood, there are ways to
improve our ability to recover from slightly corrupted parameter pages
(ECC is not one of them).

For one, you could do some kind of bit majority. e.g.:

 (1) try pages 1-3
 (2) if none pass the CRC check, then compute bit majority of all 3; if
     the CRC of this combined page passes, then use it
 (3) ???
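
The bit-majority fallback in steps (1) and (2) can be sketched in plain C
as follows. This is only an illustrative userspace sketch, not kernel
code: recover_param_page(), param_page_crc_ok() and bit_majority3() are
hypothetical helper names, the three copies are assumed to be already
read into 256-byte buffers, and the CRC routine is the ONFI CRC-16
(polynomial 0x8005, initial value 0x4F4E) over bytes 0-253.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONFI_CRC_BASE   0x4F4E
#define PARAM_PAGE_SIZE 256   /* assumed size of one parameter page copy */

/* ONFI CRC-16: polynomial 0x8005, MSB first, no final XOR. */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
	int i;

	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (i = 0; i < 8; i++)
			crc = (uint16_t)((crc << 1) ^ ((crc & 0x8000) ? 0x8005 : 0));
	}
	return crc;
}

/* The CRC over bytes 0-253 is stored little-endian in bytes 254-255. */
static int param_page_crc_ok(const uint8_t *page)
{
	uint16_t stored = (uint16_t)(page[254] | (page[255] << 8));

	return onfi_crc16(ONFI_CRC_BASE, page, PARAM_PAGE_SIZE - 2) == stored;
}

/* Each output bit takes the value held by at least two of the three inputs. */
static void bit_majority3(const uint8_t *a, const uint8_t *b,
			  const uint8_t *c, uint8_t *out, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		out[i] = (uint8_t)((a[i] & b[i]) | (a[i] & c[i]) | (b[i] & c[i]));
}

/* Hypothetical helper: returns 0 and fills 'out' on success, -1 otherwise. */
static int recover_param_page(uint8_t copies[3][PARAM_PAGE_SIZE], uint8_t *out)
{
	int i;

	/* Step (1): use the first copy whose CRC passes. */
	for (i = 0; i < 3; i++) {
		if (param_page_crc_ok(copies[i])) {
			memcpy(out, copies[i], PARAM_PAGE_SIZE);
			return 0;
		}
	}

	/* Step (2): combine all three copies and re-check the CRC. */
	bit_majority3(copies[0], copies[1], copies[2], out, PARAM_PAGE_SIZE);
	return param_page_crc_ok(out) ? 0 : -1;
}
```

The vote repairs any bit that is flipped in at most one copy. A bit
corrupted at the same position in two copies still defeats it, which is
why the combined page is only accepted if its CRC passes.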

Brian

* Re: enhance ONFI table reliability/stable
  2015-11-20 23:59     ` Brian Norris
@ 2015-11-21  7:46       ` Boris Brezillon
  2015-11-21  8:27         ` Brian Norris
  0 siblings, 1 reply; 7+ messages in thread
From: Boris Brezillon @ 2015-11-21  7:46 UTC (permalink / raw)
  To: Brian Norris
  Cc: Bean Huo 霍斌斌 (beanhuo), linux-kernel, linux-mtd

On Fri, 20 Nov 2015 15:59:27 -0800
Brian Norris <computersforpeace@gmail.com> wrote:

> On Thu, Nov 19, 2015 at 04:21:01AM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > > On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo)
> > > wrote:
> > > > Hi,
> > > >
> > > > Recently, I faced some case about ONFI table reliability, now it used CRC.
> > > > If there is bit flips in ONFI parameter pages, parameter backup page will be
> > > taken.
> > > > For latest linux,default read three copys.
> > > >
> > > > 	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > > > 	for (i = 0; i < 3; i++) {
> > > > 		for (j = 0; j < sizeof(*p); j++)
> > > > 			((uint8_t *)p)[j] = chip->read_byte(mtd);
> > > > 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > > > 				le16_to_cpu(p->crc)) {
> > > > 			break;
> > > > 		}
> > > > 	}
> > > >
> > > > However ,with technoogy improvement,for TLC and new generatin MLC,I
> > > > think, three copys of
> > > 
> > > Ha, "improvement" :)
> > > 
> > > > Parameter tables is not powerful enough.my question is that if there
> > > > is a good method to protect and corrent parameter page. For example,we
> > > > can use linux software BCH ecc. Any suggections and input be
> > > > welcomed,if you having any concerns about this,don't free tell me.
> > > 
> > > I recall this being brought up at my old job, and I all I can say is...
> > > (please pardon my censored language)
> > 
> > 
> > Yes , you ever told about this. I just follow.
> > Sorry for my rude following.
> > I only want to share my one suggestion about using software ECC to protect 
> > ONFI table that read from NAND. I want to hear every MTD expert 's valuable 
> > Feedback on this. if OK, I can do it. 
> 
> Perhaps I'm misunderstanding you, I don't understand how you could
> possibly "do it" if it is a circular dependency. You have nowhere to
> store ECC/parity data for a parameter page, because you can't actually
> read/write the NAND flash until after you know its geometry.

Well, while I agree with most of your answer (why the hell are NAND
vendors storing the ONFI parameter page, and other sensitive information
in normal NAND pages, especially when we're talking about TLC/MLC
NANDs???), it's perfectly possible to have ECC in this case, as long as
the geometry is known in advance (at least this is true for BCH).

Say you have only 3 copies of the parameter page and the ECC bytes are
stored after them. You can define the following layout:

|3 x parameter page size|3 x ECC bytes|

Of course this implies reserving the space after the 3 parameter pages
for the ECC bytes, which the current ONFI spec does not guarantee
(there must be at least 3 copies, but there can be more).

And we would choose the ECC geometry with this logic:

ECC chunk size = sizeof(struct nand_onfi_params)
ECC strength = iteratively tested with different pre-defined values
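
A rough sketch of the arithmetic behind such a layout, in plain C. It
assumes a binary BCH code in which correcting t bits costs m * t parity
bits over GF(2^m), a chunk size of 256 bytes (the assumed size of
struct nand_onfi_params), and the 3-copy layout drawn above. The names
bch_min_order(), bch_ecc_bytes() and ecc_offset() are hypothetical and
not part of any existing API.

```c
#include <stddef.h>

#define PARAM_PAGE_SIZE 256u  /* assumed sizeof(struct nand_onfi_params) */
#define PARAM_COPIES    3u

/*
 * Smallest field order m for which one chunk plus its parity fits in a
 * single codeword: 2^m - 1 >= data bits + m * t. Returns 0 if no order
 * in the range 5..15 fits.
 */
static unsigned int bch_min_order(unsigned int data_bytes, unsigned int t)
{
	unsigned int m;

	for (m = 5; m <= 15; m++)
		if (((1u << m) - 1) >= data_bytes * 8 + m * t)
			return m;
	return 0;
}

/* Parity size in bytes for one chunk at strength t, or 0 if it cannot fit. */
static unsigned int bch_ecc_bytes(unsigned int data_bytes, unsigned int t)
{
	unsigned int m = bch_min_order(data_bytes, t);

	return m ? (m * t + 7) / 8 : 0;
}

/* Byte offset of the ECC for a given copy in the proposed layout. */
static size_t ecc_offset(unsigned int copy, unsigned int ecc_bytes)
{
	return (size_t)PARAM_COPIES * PARAM_PAGE_SIZE + (size_t)copy * ecc_bytes;
}
```

Under these assumptions a strength of 8 bits per 256-byte copy costs 12
ECC bytes, so the ECC for the three copies would occupy bytes 768 to 803
of the parameter area. A reader that does not know the strength in
advance would have to try each pre-defined value in turn, since the
strength changes where each copy's ECC begins.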

This being said, I don't know how you would change the ONFI spec and
keep it compatible with the previous version. As I said, the current
version of the spec does not reserve any area after the mandatory
parameter pages...
You'll probably have to add a NAND_CMD_ALT_PARAM to support this kind
of thing.

> 
> > > ...that is complete and utter bulls***. An ONFI standard that can't guarantee
> > > "reliable enough" parameter pages is no standard at all.
> > > 
> > > To step back a bit: How would one expect to store and retrieve ECC parity
> > > data? ...on the NAND flash? But to do that, we have to know the geometry
> > > parameters of said NAND flash. How do we figure out the geometry? From the
> > > ONFI parameter pages! Nice Catch 22 you have there.
> 
> I realize a non-native English speaker might not understand the "Catch
> 22" reference. Wikipedia has a nice summary:
> 
>   https://en.wikipedia.org/wiki/Catch-22_(logic)
> 
> Essentially, it's a circular argument, or a contradiction. An
> impossibility.
> 
> > > Please encourage your employer never to produce "ONFI-compliant" flash that
> > > are this bad.
> 
> I still stand by the above statement.
> 
> But now that I'm in a slightly more charitable mood, there are ways to
> improve our ability to recover from slightly corrupted parameter pages
> (ECC is not one of them).
> 
> For one, you could do some kind of bit majority. e.g.:
> 
>  (1) try pages 1-3
>  (2) if none pass the CRC check, then compute bit majority of all 3; if
>      the CRC of this combined page passes, then use it
>  (3) ???

Should work too, but it's probably less reliable than BCH ECC (we only
have 3 copies :-/).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* Re: enhance ONFI table reliability/stable
  2015-11-21  7:46       ` Boris Brezillon
@ 2015-11-21  8:27         ` Brian Norris
  0 siblings, 0 replies; 7+ messages in thread
From: Brian Norris @ 2015-11-21  8:27 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Bean Huo 霍斌斌 (beanhuo), linux-kernel, linux-mtd

On Sat, Nov 21, 2015 at 08:46:04AM +0100, Boris Brezillon wrote:
> This being said, I don't know how you would change the ONFI spec and
> keep it compatible with the previous version. As I said, the current
> version of the spec does not reserve any area after the mandatory
> parameter pages...
> You'll probably have to add a NAND_CMD_ALT_PARAM to support this kind
> of thing.

I was interpreting Bean's comments to mean only a change in software,
not in the actual NAND flash (HW) implementation. If he is suggesting a
change in the flash spec, then that's a completely different discussion.

Brian
