All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jeff Lauruhn (jlauruhn)" <jlauruhn@micron.com>
To: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Andrea Scian <rnd4@dave-tech.it>,
	Richard Weinberger <richard@nod.at>,
	mtd_mailinglist <linux-mtd@lists.infradead.org>,
	"dedekind1@gmail.com" <dedekind1@gmail.com>
Subject: RE: RFC: detect and manage power cut on MLC NAND
Date: Fri, 13 Mar 2015 23:51:53 +0000	[thread overview]
Message-ID: <0D23F1ECC880A74392D56535BCADD7354973DF0B@NTXBOIMBX03.micron.com> (raw)
In-Reply-To: <20150313213134.1b53430b@bbrezillon>




Jeff Lauruhn

-----Original Message-----
From: linux-mtd [mailto:linux-mtd-bounces@lists.infradead.org] On Behalf Of Boris Brezillon
Sent: Friday, March 13, 2015 1:32 PM
To: Jeff Lauruhn (jlauruhn)
Cc: Richard Weinberger; dedekind1@gmail.com; mtd_mailinglist; Andrea Scian
Subject: Re: RFC: detect and manage power cut on MLC NAND

Hello Jeff,

I'm joining the discussion to ask more questions about MLC NANDs ;-).

Could you tell us more about how block wear impact the voltage level stored in NAND cells.

1/ Are all pages in a block impacted the same way ?
	Yes, because of block erase, P/E cycles affect all the pages in a block.    
2/ Is wear more likely to induce voltage increase, voltage decrease
   or is it unpredictable ?   Wear is a very well known a NAND characteristic.   During P/E cycling there is a potential for electrons to get permanently trapped in the oxide.  The more P/E cycles the more electrons get trapped.  Over many P/E cycles cells well get to a point where they look permanent programmed and can't be erased or programmed.  As cells begin to fail, ECC can be used to recover the data.  If too many bits fail in page the device will respond with a FAIL status after a P/E cycle.
	
3/ Is it possible to have more than one working voltage threshold
   (read-retry mode): I did some testing on my Hynix chip (I know you
   work for Micron but that's the only MLC chip I have :-)), and I
   managed to get less bitflips by trying another read-retry mode even
   if the previous one was allowing me to successfully fix existing
   bitflips.
Read Retry is available on some newer  products.  RR was introduced to help maintain and improve data retention and P/E cycles as geometry shrinks and bit/cell increase.  If the device supports RR, we have predefined RR Options, based on the most  likely chance of success.  Start with option 1 and step through the options until you get a successful read.  The DS usually has pretty good information.  

4/ Do you have any numbers/statistics that could
   help us choose the more appropriate read-retry mode according to the
   number of P/E cycles ?  I don't have numbers or statistics, but I can tell you that the RR steps are generally defined based on known NAND behavior.  Go to the Micron website and put in this PN MT29F128G08CBCCB and you will find good information on RR.
   
5/ Any other things you'd like to share regarding read-retry ? 
RR isn't available on all devices.   From your prospective I would give them the option to use RR if it's available.  

Apart from that, we're currently trying to find the most appropriate way to deal with paired pages, and this sounds rather complicated.
The current idea is to expose paired pages information up to the UBIFS layer, and let UBIFS decide when it should stop writing on pages paired with already written pages.
Moreover, we have a few pages we need to protect (UBI metadata: EC and VID headers) in order to keep UBI/UBIFS consistent.
Do you have anything to share on this topic (ideas, solutions we should consider, constraints we're not aware of, ...)

This is one of the reasons I came to this site.  I have a great deal of device knowledge and I need to know more about how end users use the device.  

Most designs today employ power loss detection and employ elegant shutdown to the NAND.  In addition, we provide Write Protect, which provides an extra layer of protection against power loss.  There is still a chance that if the power event happens during a program to a page, the previously programmed shared page can also be corrupted.  It's not clear to me how to keep track of shared pages for every device out there.  It's not like a parameter page that you can read.  It's an interesting problem.    

Thanks for your valuable information.

Best Regards,

Boris

--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

  reply	other threads:[~2015-03-13 23:52 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-10 11:57 RFC: detect and manage power cut on MLC NAND Andrea Scian
2015-03-10 12:51 ` Richard Weinberger
2015-03-11  7:20   ` Artem Bityutskiy
2015-03-11  8:57     ` Richard Weinberger
2015-03-11  9:05       ` Artem Bityutskiy
2015-03-11  9:09         ` Richard Weinberger
2015-03-11 17:01           ` Andrea Scian
2015-03-11 17:23             ` Jeff Lauruhn (jlauruhn)
2015-03-11 17:29               ` Richard Weinberger
2015-03-11 21:16                 ` Jeff Lauruhn (jlauruhn)
2015-03-12 10:28                   ` Richard Weinberger
2015-03-12 22:57                     ` Jeff Lauruhn (jlauruhn)
2015-03-13 20:31                       ` Boris Brezillon
2015-03-13 23:51                         ` Jeff Lauruhn (jlauruhn) [this message]
2015-03-14  9:46                           ` Andrea Marson - DAVE Embedded Systems
2015-03-16 16:02                             ` Jeff Lauruhn (jlauruhn)
2015-03-17  8:00                               ` Andrea Scian
2015-03-14 10:32                           ` Boris Brezillon
2015-03-16 21:11                             ` Jeff Lauruhn (jlauruhn)
2015-03-17  9:30                               ` Andrea Scian
2015-03-17 10:02                                 ` Boris Brezillon
2015-03-17 16:42                                   ` Jeff Lauruhn (jlauruhn)
2015-03-18  8:45                                     ` RFC: detect and manage power cut on MLC NAND (linux-mtd Digest, Vol 144, Issue 70) Andrea Marson
2015-03-18  9:07                                       ` Boris Brezillon
2015-03-18  9:56                                         ` Andrea Marson
2015-03-18 10:03                                           ` Boris Brezillon
2015-03-18 12:07                                         ` Richard Weinberger
2015-03-18 17:11                                           ` Jeff Lauruhn (jlauruhn)
2015-03-18 16:12                                       ` Jeff Lauruhn (jlauruhn)
2015-03-19  8:47                                         ` RFC: detect and manage power cut on MLC NAND Andrea Marson
2015-03-19  9:12                                           ` Boris Brezillon
2015-03-19 17:45                                             ` Jeff Lauruhn (jlauruhn)
2015-03-20  0:25                                             ` Iwo Mergler
2015-03-20  3:38                                               ` nick
2015-03-20  5:40                                                 ` Iwo Mergler
2015-03-20  8:26                                               ` Boris Brezillon
2015-03-20 17:15                                                 ` Nick Krause
2015-03-22 23:45                                                 ` Iwo Mergler
2015-03-23  2:18                                                 ` Iwo Mergler
2015-03-23  7:06                                                   ` Artem Bityutskiy
2015-03-23 19:05                                                     ` Boris Brezillon
2015-03-24  7:05                                                       ` Artem Bityutskiy
2015-03-19 18:00                                           ` Jeff Lauruhn (jlauruhn)
2015-03-20  8:07                                             ` Andrea Marson
2015-03-17 17:04                                 ` Jeff Lauruhn (jlauruhn)
2015-03-16  9:01                         ` Ricard Wanderlof
2015-03-16 17:27                           ` Jeff Lauruhn (jlauruhn)
2015-03-14 10:03                       ` Richard Weinberger
2015-03-12  9:32               ` Ricard Wanderlof
2015-03-23  4:08           ` Iwo Mergler
2015-03-23 21:15             ` Jeff Lauruhn (jlauruhn)
2015-03-24  1:17               ` Iwo Mergler
2015-03-24 16:50                 ` Jeff Lauruhn (jlauruhn)
2015-03-25  3:38                   ` Iwo Mergler
2015-03-25  8:33                     ` Ricard Wanderlof
2015-03-26  1:57                       ` Jeff Lauruhn (jlauruhn)
2015-03-26  8:55                         ` Ricard Wanderlof
2015-03-11  7:21 ` Artem Bityutskiy
  -- strict thread matches above, loose matches on Subject: below --
2015-03-12 10:31 Andrea Marson - DAVE Embedded Systems
     [not found] <mailman.37176.1426610573.22890.linux-mtd@lists.infradead.org>

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0D23F1ECC880A74392D56535BCADD7354973DF0B@NTXBOIMBX03.micron.com \
    --to=jlauruhn@micron.com \
    --cc=boris.brezillon@free-electrons.com \
    --cc=dedekind1@gmail.com \
    --cc=linux-mtd@lists.infradead.org \
    --cc=richard@nod.at \
    --cc=rnd4@dave-tech.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.