All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Cashwell <mboards@prograde.net>
To: linux-mtd@lists.infradead.org
Cc: Stefan Bigler <stefan.bigler@keymile.com>,
	Holger brunck <holger.brunck@keymile.com>,
	Joakim Tjernlund <joakim.tjernlund@transmode.se>
Subject: Re: Numonyx NOR and chip->mutex bug?
Date: Wed, 2 Feb 2011 11:20:46 -0500	[thread overview]
Message-ID: <25631ED7-C6A0-44B1-B33D-F48DC48C812E@prograde.net> (raw)
In-Reply-To: <OF669CD6B0.87299771-ONC1257823.007EE3A7-C1257823.007F2FD2@transmode.se>

[-- Attachment #1: Type: text/plain, Size: 3494 bytes --]

On Jan 25, 2011, at 6:09 PM, Joakim Tjernlund wrote:

> On Jan 25, 2011, at 1:14 PM, Michael Cashwell wrote:
> 
>> With this new part I'm seeing MTD errors that I think I've traced to cfi_cmdset_0001.c that I'd like to ask about.
>> 
>> The error manifests when I write hard to a UBIFS file system on this NOR flash. What I see is a "NOR Flash: buffer write error" and then either "(block locked)" or "(Bad VPP)"
> 
> Just check that you didn't get some old samples. We did.

All I know is that the chips are marked with an '08 copyright. (The previous ones are '03.) So we are dealing with old(er) parts in any event. But from what I can tell, this particular part is the current one Numonyx/Micron are selling.

>> The fact that the errors stop if I comment out the chip->mutex calls while waiting [for command completion] suggests to me that there's a reentrancy problem. It doesn't mean the locks are wrong or that doing that is a real fix.
> 
> Oh, I misread earlier. I figured you held the lock for all ops.

In the end I found a failure in the following scenario. A block erase is underway and a request is made to access the chip in order to write data elsewhere. The erase is suspended and the buffered write is performed. When the chip is released after the write operation the code notices the suspended erase and resumes it. But there seems to be a timing issue where the WSM ready bit SR.7 was checked "too soon" following issuing the resume command and it made the code think the erase was complete when it was not.

The normal code paths that *start* erase or program operations have an inherent delay of several µs between writing the command and the first read of the WSM status. This delay is a side effect of a kernel cache invalidate call. But the key issue is that when resuming an erase no such cache invalidation is done as it was already done when the erase originally began.

That difference means there's very little time between the resume command write and the status read. The apparent result is that the WSM is reported "not busy" when in fact the resumption is still being acted upon. The code misinterprets this to mean the resumed erase is complete when it is not and subsequent commands then go fully off the rails as a result.

I cannot find a corresponding timing constraint in the data sheet. By rights, the bus cycle time alone should be enough between the write and read. But in practice, for these parts, it is not. This may be an undocumented erratum for current parts or just an anomaly for this batch. I have no way to tell.

I found the addition of a 20µs delay immediately after the erase resume command avoids the failure. I also tested 10µs and found it to be insufficient. I did not bisect the time further. I have also not explored any similar issue for resumed write operations because it appears that only kernels doing XIP on MTD parts ever do that. I frankly expect the problem would occur then too but I'm not setup to do XIP and don't want to propose changes I cannot test.

I've included the patch that I am using. It also addresses a few other warts and errata I found while debugging this. If these changes are found to have merit after review I'd be happy for them to be included in mainline. Let me know if I can assist in any way.

Stephan, I hope this helps. Since yours is the only report at all similar to mine I'd be very interested in hearing about your progress.

Best regards,
-Mike Cashwell


[-- Attachment #2: linux-2.6.35.7-001-numonyx-errata.patch --]
[-- Type: application/octet-stream, Size: 2326 bytes --]

diff -Nurp linux-2.6.35.7-arm-gum/drivers/mtd/chips/cfi_cmdset_0001.c linux-2.6.35.7-arm-gum-mod/drivers/mtd/chips/cfi_cmdset_0001.c
--- linux-2.6.35.7-arm-gum/drivers/mtd/chips/cfi_cmdset_0001.c	2010-09-28 21:09:08.000000000 -0400
+++ linux-2.6.35.7-arm-gum-mod/drivers/mtd/chips/cfi_cmdset_0001.c	2011-01-27 16:11:27.000000000 -0500
@@ -1004,7 +1004,19 @@ static void put_chip(struct map_info *ma
 		   sending the 0x70 (Read Status) command to an erasing
 		   chip and expecting it to be ignored, that's what we
 		   do. */
+		/* numonyx data sheet clearly says to always reset the status bits
+		   before resuming a suspended erase. not doing so results in
+		   an ambiguous mixture of status bits once the command ends. */
+		map_write(map, CMD(0x50), adr);
 		map_write(map, CMD(0xd0), adr);
+		/* some numonyx P30 parts have an apparent delay after starting or
+		   resuming some commands. this is normally covered by the cache
+		   invalidation done between the command and the start of reading
+		   for the busy status bit to clear. but no such cache invalidation
+		   is done when resuming and this allows the status-reading thread
+		   awakened below to read the status too soon and think its operation
+		   has finished when it fact its resumption is still underway. */
+		udelay(20);
 		map_write(map, CMD(0x70), adr);
 		chip->oldstate = FL_READY;
 		chip->state = FL_ERASING;
# Numonyx Axcell P33/P30 Spec Update - Nov 2009 - document 509003-04
# Erratum #2 - Flexlock Write Timing for stepping A1
@@ -2045,6 +2057,7 @@ static int __xipram do_xxlock_oneblock(s
 {
 	struct cfi_private *cfi = map->fldrv_priv;
 	struct cfi_pri_intelext *extp = cfi->cmdset_priv;
+	int ofs_factor = cfi->interleave * cfi->device_type;
 	int udelay;
 	int ret;
 
@@ -2060,6 +2073,14 @@ static int __xipram do_xxlock_oneblock(s
 	ENABLE_VPP(map);
 	xip_disable(map, chip, adr);
 
+	/* address numonyx errata regarding Flexlock Write Timing.
+	   before doing a 0x60 lock/unlock sequence on a sector
+	   its current lock state needs to be read and the result
+	   discarded. */
+	map_write(map, CMD(0x90), adr+(2*ofs_factor));
+	chip->state = FL_JEDEC_QUERY;
+	(void) cfi_read_query(map, adr+(2*ofs_factor));
+
 	map_write(map, CMD(0x60), adr);
 	if (thunk == DO_XXLOCK_ONEBLOCK_LOCK) {
 		map_write(map, CMD(0x01), adr);

  reply	other threads:[~2011-02-02 16:20 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-25 18:14 Numonyx NOR and chip->mutex bug? Michael Cashwell
2011-01-25 18:56 ` Joakim Tjernlund
2011-01-25 22:03   ` Michael Cashwell
2011-01-25 23:09     ` Joakim Tjernlund
2011-02-02 16:20       ` Michael Cashwell [this message]
2011-02-02 17:37         ` Stefan Bigler
2011-02-02 20:12         ` Joakim Tjernlund
2011-02-02 21:19           ` Michael Cashwell
2011-02-03  8:11             ` Joakim Tjernlund
2011-02-03  9:50               ` Joakim Tjernlund
2011-02-03 15:24                 ` Michael Cashwell
2011-02-03 16:38                   ` Stefan Bigler
2011-02-03 23:18                     ` Stefan Bigler
2011-02-04 10:47                       ` Joakim Tjernlund
2011-02-04 11:04                         ` Stefan Bigler
2011-02-04 12:26                           ` Joakim Tjernlund
2011-02-04 12:35                             ` Joakim Tjernlund
2011-02-04 12:42                               ` Joakim Tjernlund
2011-02-04 13:05                           ` Joakim Tjernlund
2011-02-04 13:25                             ` Joakim Tjernlund
2011-02-04 16:45                             ` Stefan Bigler
2011-02-04 16:55                               ` Joakim Tjernlund
2011-02-04 17:09                             ` Michael Cashwell
     [not found]                               ` <OF42EF <F66AF016-8A2B-4116-BE49-CE05B91BE50F@prograde.net>
     [not found]                               ` <OF42EF<F66AF016-8A2B-4116-BE49-CE05B91BE50F@prograde.net>
2011-02-05 10:29                               ` Joakim Tjernlund
2011-02-05 19:19                                 ` Michael Cashwell
2011-02-05 21:47                                   ` Michael Cashwell
2011-02-06  9:46                                     ` Joakim Tjernlund
2011-02-06 15:49                                       ` Michael Cashwell
2011-02-06 17:29                                         ` Joakim Tjernlund
2011-02-06 21:13                                           ` Michael Cashwell
     [not found]                                             ` <OF2C1ABD39 <4D5005E4.1040506@keymile.com>
     [not found]                                               ` <OFFF2C6D34.91E5D6C3-ONC1257 <96BD3889-E8AD-408D-8275-ED1A5FD55F1B@prograde.net>
     [not found]                                             ` <OF2C1ABD39<4D5005E4.1040506@keymile.com>
     [not found]                                               ` <OFFF2C6D34.91E5D6C3-ONC1257<96BD3889-E8AD-408D-8275-ED1A5FD55F1B@prograde.net>
     [not found]                                                 ` <OF9738D658.F46<4C117A67-5057-4ACD-8EBE-04E9C782570C@prograde.net>
     [not found]                                                   ` <OF6D40AC82.1008D3DD-ONC1 <4D53E660.4020305@users.sourceforge.net>
2011-02-06 21:20                                             ` Joakim Tjernlund
2011-02-07 14:47                                               ` Stefan Bigler
2011-02-07 15:01                                                 ` Joakim Tjernlund
2011-02-07 15:46                                                   ` Michael Cashwell
2011-02-07 15:52                                                     ` Stefan Bigler
2011-02-07 16:22                                                     ` Joakim Tjernlund
2011-02-07 16:46                                                       ` Michael Cashwell
2011-02-07 17:08                                                         ` Stefan Bigler
2011-02-07 19:04                                                           ` Michael Cashwell
2011-02-09 19:52                                                             ` Michael Cashwell
2011-02-09 20:13                                                               ` Joakim Tjernlund
2011-02-09 21:59                                                                 ` Michael Cashwell
2011-02-10 13:21                                                                   ` Anders Grafström
2011-02-10 14:04                                                                     ` Joakim Tjernlund
2011-02-10 15:04                                                                       ` Joakim Tjernlund
2011-02-10 15:59                                                                         ` Michael Cashwell
2011-02-10 16:05                                                                           ` Joakim Tjernlund
2011-02-10 16:41                                                                             ` Michael Cashwell
2011-02-10 16:46                                                                               ` Joakim Tjernlund
2011-02-10 17:02                                                                                 ` Joakim Tjernlund
2011-02-10 17:10                                                                                   ` Michael Cashwell
2011-02-10 17:20                                                                                     ` Joakim Tjernlund
2011-02-10 17:47                                                                                       ` Joakim Tjernlund
2011-02-10 18:26                                                                                         ` Joakim Tjernlund
2011-02-11 18:03                                                                                           ` Michael Cashwell
2011-02-12 10:47                                                                                             ` Joakim Tjernlund
2011-02-14 15:59                                                                                         ` Michael Cashwell
2011-02-14 15:44                                                                                       ` Michael Cashwell
2011-02-10 16:43                                                                           ` Michael Cashwell
2011-02-10 17:54                                                                             ` Anders Grafström
2011-02-11 15:05                                                                               ` Michael Cashwell
2011-02-11 15:39                                                                                 ` Joakim Tjernlund
2011-02-14 16:15                                                                                   ` Michael Cashwell
2011-02-11 17:00                                                                               ` Joakim Tjernlund
2011-02-10 15:43                                                                       ` Michael Cashwell
2011-02-10 15:51                                                                         ` Joakim Tjernlund
2011-02-24 10:50                                                                     ` Joakim Tjernlund
2011-02-24 11:36                                                                       ` Joakim Tjernlund
2011-02-24 14:28                                                                       ` Michael Cashwell
2011-02-10 14:53                                                                   ` Joakim Tjernlund
2011-02-06  9:40                                   ` Joakim Tjernlund
2011-02-06 14:55                                     ` Michael Cashwell
2011-02-07 15:10                                   ` Michael Cashwell
2011-02-07 15:48                                     ` Joakim Tjernlund
2011-02-03 13:24               ` Michael Cashwell
2011-02-03 14:01                 ` Joakim Tjernlund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=25631ED7-C6A0-44B1-B33D-F48DC48C812E@prograde.net \
    --to=mboards@prograde.net \
    --cc=holger.brunck@keymile.com \
    --cc=joakim.tjernlund@transmode.se \
    --cc=linux-mtd@lists.infradead.org \
    --cc=stefan.bigler@keymile.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.