From: Michael Cashwell
Date: Thu, 10 Feb 2011 10:43:40 -0500
Subject: Re: Numonyx NOR and chip->mutex bug?
To: Joakim Tjernlund
Cc: Holger brunck, stefan.bigler@keymile.com, linux-mtd@lists.infradead.org, Anders Grafström
List-Id: Linux MTD discussion mailing list

On Feb 10, 2011, at 9:04 AM, Joakim Tjernlund wrote:

> Anders Grafström wrote on 2011/02/10 14:21:36:
>
>> I've seen this issue with Intel 28F640J5 chips as well.
>>
>> There's an old thread on it.
>> http://lists.infradead.org/pipermail/linux-mtd/2008-April/021266.html
>>
>> A delay was suggested similar to the one you're experimenting with I think.
>> http://lists.infradead.org/pipermail/linux-mtd/2008-April/021436.html
>
> Oh, I had forgotten about this thread :)
>
> I agree with the latency theory, for Numonyx there is this:
>
> W602 is the typical time between an initial block erase or erase resume command and a subsequent erase suspend
> command. Violating the specification repeatedly during any particular block erase may cause erase failures.

Interesting. I saw the W602 value but it also talks about "erase failures" which are not precisely what I'm seeing. I see failed buffered writes.
But it's sounding like the issue anyway.

I've just instrumented a suspend count and the erase that was active when a failed write occurred was suspended 29 times. That's not really very high so it fits with the idea that the issue is not the number of suspend/resumes but their timing.

> W602 is defined to 500us
>
> Adding a delay after resume will do it but is a bit crude. Possibly one could add a timestamp at resume/initial erase and a check in suspend that enough time has passed before suspending again.
>
> How does that sound?

I like this idea but how do we deal with the lack of precision? 500µs "typical" and violating this timing "repeatedly"? Yuk. Without hard numbers coding it will be tricky. And what about this all being Numonyx-specific? Do we need to key off of the manufacturer ID or does it apply to all chips handled by this cmd set?

Lastly, what's the general kernel API for µs resolution time? I recall having issues in the past when high-resolution timers are not enabled in the kernel config.

-Mike
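
For reference, the timestamp-and-check idea quoted above could look roughly like the sketch below. It is a plain C illustration and not a patch against the MTD driver: the struct, the function names and the W602_MIN_GAP_NS constant are all made up for this example, and the 500 µs value is only the "typical" W602 figure from the quoted datasheet text, assumed here to act as a minimum. Timestamps are passed in as parameters, so the sketch takes no position on which clock to use; a driver would presumably read a monotonic kernel clock such as ktime_get().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed minimum gap between erase start/resume and the next suspend.
 * 500 us is the "typical" W602 value quoted above, not a hard bound. */
#define W602_MIN_GAP_NS UINT64_C(500000)

/* Hypothetical per-chip bookkeeping; not a real MTD structure. */
struct erase_timing {
    uint64_t last_start_ns; /* monotonic time of last erase start or resume */
    bool     erase_active;  /* true once an erase has been started */
};

/* Record the moment an erase is issued or resumed. */
static void erase_mark_started(struct erase_timing *t, uint64_t now_ns)
{
    t->last_start_ns = now_ns;
    t->erase_active = true;
}

/* Return how many nanoseconds the caller should still wait before an
 * erase suspend respects the assumed gap; 0 means it may suspend now. */
static uint64_t erase_suspend_wait_ns(const struct erase_timing *t,
                                      uint64_t now_ns)
{
    uint64_t elapsed;

    if (!t->erase_active)
        return 0;

    elapsed = now_ns - t->last_start_ns;
    return elapsed >= W602_MIN_GAP_NS ? 0 : W602_MIN_GAP_NS - elapsed;
}
```

The suspend path would call erase_suspend_wait_ns() and delay for the returned time before issuing the suspend command, which only costs latency when suspends arrive back to back. Whether the gap should apply to every chip behind this command set or only to some manufacturer IDs is left open here, as it is in the message.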