From: "Jeff Lauruhn (jlauruhn)"
To: Iwo Mergler, Richard Weinberger, "dedekind1@gmail.com"
Subject: RE: RFC: detect and manage power cut on MLC NAND
Date: Mon, 23 Mar 2015 21:15:25 +0000
Message-ID: <0D23F1ECC880A74392D56535BCADD7356B6C8603@NTXBOIMBX03.micron.com>
References: <54FEDC42.2060407@dave-tech.it> <1426058414.1567.2.camel@sauron.fi.intel.com> <5500037A.9010509@nod.at> <1426064733.1567.6.camel@sauron.fi.intel.com>,<55000637.1030702@nod.at>
Cc: Andrea Scian, Qi Wang 王起 (qiwang), mtd_mailinglist
List-Id: Linux MTD discussion mailing list

This is a very simplified description, but actually it's more like this:

First pass, program the lower page. If the lower page is 1, do nothing. If the lower page is 0, subtract 0.7 V to reach 0.3. The lower page is SLC-like: two distributions spread apart by 0.7 V.

Lvl    LH
===========
1.0 => 1u
0.3 => 0u

Now, program the upper page. First, read the lower page. If lower page is 1 and upper page is 1, do nothing (11). If lower page is 1 and upper page is 0, subtract 0.3 V and call that 01. Next, if lower page is 0 and upper page is 1, do nothing (10), and if lower page is 0 and upper page is 0, subtract 0.3 V and call it 00. Notice that the state of the lower page is the right-hand digit in 11, 01, 10, 00.

Lvl    HL
===========
1.0 => 11
0.7 => 01
0.3 => 10
0.0 => 00

Now what happens if there's a power loss during the programming of the upper page?
The upper page data will most likely be lost, and the lower page may be changed, but there's a good chance of recovery, because the lower page will still be in the range of SLC. It is highly recommended to read and refresh data after a power loss.

Jeff Lauruhn
NAND Application Engineer
Embedded Business Unit

-----Original Message-----
From: Iwo Mergler [mailto:Iwo.Mergler@netcommwireless.com]
Sent: Sunday, March 22, 2015 9:09 PM
To: Richard Weinberger; dedekind1@gmail.com
Cc: Andrea Scian; mtd_mailinglist; Jeff Lauruhn (jlauruhn); Qi Wang 王起 (qiwang)
Subject: RE: RFC: detect and manage power cut on MLC NAND

Hi all,

I probably don't know enough about the silicon implementation of MLC paired pages, but my feeling is that there should be a way to recover one of the pages if the paired write fails, at least in some cases.

Something along the lines of using both bits to map to a single good one.

2 bit MLC stores 4 levels - 1.0, 0.7, 0.3, 0.0. Obviously, the actual voltage levels will be somewhat different, so take this as electrons on the floating gate: 1.0 = minimum, 0.0 = maximum.

I imagine that there are two ways to achieve that - small step for low page and large step for high page, or the other way 'round.

Assuming the first, the low page write would subtract 0.3 from the erased (1.0) cell if the bit is 0. That leaves the cell at either ~1.0 (1) or 0.7 (0).

Lvl    LH
===========
1.0 => 1u
0.7 => 0u

Then, the high page write would subtract either nothing (1) or 0.7 (0):

Lvl    LH
===========
1.0 => 11
0.7 => 01
0.3 => 10
0.0 => 00

So the MLC decoder logic gets 3 priority encoded bits from the sense amplifiers: 111, 011, 001, 000. The decoder turns this into 11, 01, 10, 00.

The process of writing a 0 to the high page transitions low page 0-bits through 1 and back to 0, as the level moves down.

Low page 1 bits transition from 1 through 0 and back to 1.
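
The decoder and the bit transitions described above can be sketched as a toy model. This is only an illustration of the simplified view in this mail, not of real silicon: the read thresholds are assumed to sit at the midpoints between the nominal levels, the levels are scaled by 100 so the comparisons stay exact, and the function names are made up for the example.

```python
# Toy model of the simplified 2-bit MLC scheme from the mail (small step for
# the low page, large step for the high page). Not a model of real silicon.
# Assumptions: levels are the mail's 1.0/0.7/0.3/0.0 scaled by 100, and the
# sense thresholds sit halfway between neighbouring levels.

THRESHOLDS = [85, 50, 15]  # assumed midpoints between 100, 70, 30, 0

# Sense-amplifier pattern -> "LH" pair (low page bit first, high page second)
DECODE = {"111": "11", "011": "01", "001": "10", "000": "00"}


def sense(level):
    """Return the 3 sense bits; a bit is 1 while the level is above its threshold."""
    return "".join("1" if level > t else "0" for t in THRESHOLDS)


def read_lh(level):
    """Decode a cell level into the LH pair the mail's decoder would report."""
    return DECODE[sense(level)]


def high_page_zero_write(low_bit):
    """List the distinct LH readbacks seen while a high page 0 is being written.

    The cell starts at the level left by the low page write and moves down
    by 0.7 (70 here) in small steps; a power cut could freeze it anywhere.
    """
    start = 100 if low_bit == 1 else 70
    target = start - 70
    seen = []
    for level in range(start, target - 1, -10):
        code = read_lh(level)
        if not seen or seen[-1] != code:
            seen.append(code)
    return seen


if __name__ == "__main__":
    for low_bit in (1, 0):
        steps = high_page_zero_write(low_bit)
        low_bits = [code[0] for code in steps]
        print(f"low bit {low_bit}: readbacks {steps}, low page reads {low_bits}")
```

With these assumed thresholds, a low page 1 is read as 1, 0, 1 and a low page 0 as 0, 1, 0 while the level moves down, which is the flip in both directions that the following paragraphs rely on.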
So a half-completed high page 0-write can flip a low page bit both ways.

We can detect an incorrect 0-1 transition in the low page, because it's marked by a 0 bit in the high page.

We can't detect an incorrect 1-0 transition in the low page.

So assuming a failed high page write, this is what we get:

LH
11 = nothing happens, reads back as 11.
     Correct level for both.
01 = Level stays at 0.7, reads back as 01.
     Correct level for low page.
10 = Level between 1.0 and 0.3, reads back as 11, 01 or 10.
     01 is wrong for low page, but can't be distinguished from 10.
00 = Level between 0.7 and 0.0, reads back as 01, 10, or 00.
     10 is wrong for low page, but can be distinguished from 01.

So, there are two bit combinations (50%) that have an undetectable failure, and this failure will happen about half the time, for a total of 25% unfixable failure rate.

Not acceptable in the general case, but might be good enough for things like UBI EC & VID headers, if we ensure that the high page contains 1s at the offsets at which the low page stores the header.

Now, on the other hand, if the low page write uses the larger step, there shouldn't be any paired page problem at all, since the high page write wouldn't cross the low page thresholds on the way:

Lvl    LH
===========
1.0 => 1u
0.3 => 0u

Lvl    LH
===========
1.0 => 11
0.7 => 10
0.3 => 01
0.0 => 00

Which makes me think I'm misunderstanding something. If not, why isn't this scheme used in the first place?

What would happen if we reverse the paired page writing order?

Not recommended; we want pages programmed in sequence to mitigate disturbs and obtain the highest reliability.

Jeff, Qi, is the mechanism I described here anywhere near reality?

It's a simplified view, but fairly accurate.

Best regards,

Iwo