* [PATCH RFC 0/2] Micron (formerly Numonyx) M29EW NOR flash issues
From: Gerlando Falauto @ 2012-06-18 7:24 UTC
To: linux-mtd; +Cc: Holger Brunck, Leo, Stefan Bigler, Gerlando Falauto

Hi everyone,

we have been experiencing some problems with the above NOR flash.
Please find below our analysis and the patches we applied to 3.0.8.
The patches are of course not meant for mainlining "as is"; we would
appreciate your suggestions on how to make them suitable for inclusion.

It should be a fairly common flash part, but it seems that no one has
run into this issue so far except
http://lists.infradead.org/pipermail/linux-mtd/2011-April/034867.html

Thank you very much,
Gerlando Falauto

PROBLEM ANALYSIS:

This issue shows up when performing concurrent operations such as
simultaneous UBI volume creation/deletion, and only rarely under normal
conditions. It does however show up rather quickly when the unit is put
in a climate chamber at high temperatures (say 60°C).

In our experience the most probable root cause is the delay needed after
an erase resume, before a new erase suspend can be issued again
[PATCH 2/2]. This is documented on page 22 of the technical note
TN-13-07 from Micron:

http://www.micron.com/~/media/Documents/Products/Technical%20Note/NOR%20Flash/tn1307_patching_linux_kernel_for_m29.ashx

[NOTE: TN-13-07 explicitly refers to "some revisions of the M29EW (for
example, A1 and A2 step revisions)", even though our boards are equipped
with silicon revision 12 = 0xC]

Adding this delay with a value of 500 us seems to fix the problem even
at high temperatures. This is also incidentally the typical value for
the "Erase to suspend" parameter as specified in the datasheet:

  Erase to suspend is the typical time between an initial BLOCK ERASE or
  ERASE RESUME command and a subsequent ERASE SUSPEND command. Violating
  the specification repeatedly during any particular block erase may
  cause erase failures.

Also, [PATCH 1/2], described on page 20 ("Correcting Erase Suspend Hang
Ups"), was added first, although it does not appear to have any impact
on the issue.

SIDE NOTE:

The flash stress test used to reproduce this issue has in some cases
shown the unforeseen side effect of inexplicably damaging sector 0
(which is where the u-boot code resides). When this happened, sector 0
could no longer be erased, not even through JTAG. A couple of times,
further attempts at reprogramming the sector mysteriously led it to be
erasable again.

One particular board however (incidentally brought into that condition
after a test in the climate chamber) showed unstable values for some
bits of sector 0 across successive reads. All other sectors seemed to be
immune to this problem. For this board I could not find any way to erase
sector 0.

This is currently an open issue, with wrong software operations causing
hardware to break.

Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Stefan Bigler <stefan.bigler@keymile.com>
Cc: Holger Brunck <holger.brunck@keymile.com>
Cc: Leo <leo.costa77@gmail.com>

Gerlando Falauto (2):
  mtd: cfi_cmdset_0002: Micron M29EW bugfix "Correcting Erase Suspend Hang Ups"
  mtd: cfi_cmdset_0002: Micron M29EW bugfix "Resolving the Delay After Resume Issue"

 drivers/mtd/chips/cfi_cmdset_0002.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

* [PATCH RFC 1/2] mtd: cfi_cmdset_0002: Micron M29EW bugfix "Correcting Erase Suspend Hang Ups"
From: Gerlando Falauto @ 2012-06-18 7:24 UTC
To: linux-mtd; +Cc: Holger Brunck, Leo, Stefan Bigler, Gerlando Falauto

From TN-13-07: Patching the Linux Kernel and U-Boot for M29 Flash, page 20:

  Some revisions of the M29EW suffer from erase suspend hang ups. In
  particular, it can occur when the sequence
    Erase Confirm -> Suspend -> Program -> Resume
  causes a lockup due to internal timing issues. The consequence is that
  the erase cannot be resumed without inserting a dummy command after
  programming and prior to resuming. [...] The work-around is to issue a
  dummy write cycle that writes an F0 command code before the RESUME
  command.

Signed-off-by: Stefan Bigler <stefan.bigler@keymile.com>
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Holger Brunck <holger.brunck@keymile.com>
Cc: Leo <leo.costa77@gmail.com>
---
 drivers/mtd/chips/cfi_cmdset_0002.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 23175ed..72f6164 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -761,6 +761,11 @@ static void put_chip(struct map_info *map, struct flchip *chip, unsigned long ad
 
 	switch(chip->oldstate) {
 	case FL_ERASING:
+		/* before resume, insert a dummy 0xF0 cycle for Micron M29EW devices */
+		if ( (cfi->mfr == 0x0089) &&
+			(((cfi->device_type == CFI_DEVICETYPE_X8) && ((cfi->id & 0xff)== 0x7e))
+			|| ((cfi->device_type == CFI_DEVICETYPE_X16) && (cfi->id == 0x227e))) )
+			map_write(map, CMD(0xF0), chip->in_progress_block_addr);
 		map_write(map, cfi->sector_erase_cmd, chip->in_progress_block_addr);
 		chip->oldstate = FL_READY;
 		chip->state = FL_ERASING;
@@ -904,6 +909,12 @@ static void __xipram xip_udelay(struct map_info *map, struct flchip *chip,
 			local_irq_disable();
 
 			/* Resume the write or erase operation */
+			/* before resume, insert a dummy 0xF0 cycle for Micron M29EW devices */
+			if ( (cfi->mfr == 0x0089) &&
+				(((cfi->device_type == CFI_DEVICETYPE_X8) && ((cfi->id & 0xff)== 0x7e))
+				|| ((cfi->device_type == CFI_DEVICETYPE_X16) && (cfi->id == 0x227e))) )
+				map_write(map, CMD(0xF0), adr);
+
 			map_write(map, cfi->sector_erase_cmd, adr);
 			chip->state = oldstate;
 			start = xip_currtime();
-- 
1.7.1

* Re: [PATCH RFC 1/2] mtd: cfi_cmdset_0002: Micron M29EW bugfix "Correcting Erase Suspend Hang Ups"
From: Artem Bityutskiy @ 2012-06-27 10:27 UTC
To: Gerlando Falauto; +Cc: Holger Brunck, Leo, linux-mtd, Stefan Bigler

On Mon, 2012-06-18 at 09:24 +0200, Gerlando Falauto wrote:
> +		/* before resume, insert a dummy 0xF0 cycle for Micron M29EW devices */
> +		if ( (cfi->mfr == 0x0089) &&
> +			(((cfi->device_type == CFI_DEVICETYPE_X8) && ((cfi->id & 0xff)== 0x7e))
> +			|| ((cfi->device_type == CFI_DEVICETYPE_X16) && (cfi->id == 0x227e))) )
> +			map_write(map, CMD(0xF0), chip->in_progress_block_addr);

Please separate the M29-specific quirks out into functions; do not
inject them into the main code. Each quirk should be in a separate
function, e.g. 'micron_m29_erase_quirk()'. The functions should have a
descriptive comment similar to what you added to the commit message.
The function just returns if the chip ID is not M29. This will be
cleaner.

-- 
Best Regards,
Artem Bityutskiy

* [PATCH] mtd: cfi_cmdset_0002: Micron M29EW bugfixes as per TN-13-07
From: Gerlando Falauto @ 2012-07-03 7:09 UTC
To: linux-mtd
Cc: Holger Brunck, Stefan Bigler, Gerlando Falauto, Artem Bityutskiy

Fix the following issues with Micron's (formerly Numonyx) M29EW NOR
flash chips, as documented on TN-13-07:
- Correcting Erase Suspend Hang Ups (page 20)
- Resolving the Delay After Resume Issue (page 22)

Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Stefan Bigler <stefan.bigler@keymile.com>
Cc: Holger Brunck <holger.brunck@keymile.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
---
 drivers/mtd/chips/cfi_cmdset_0002.c |   69 +++++++++++++++++++++++++++++++++++
 1 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 23175ed..7f23248 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -417,6 +417,70 @@ static void cfi_fixup_major_minor(struct cfi_private *cfi,
 	}
 }
 
+static int is_m29ew(struct cfi_private *cfi)
+{
+	if (cfi->mfr == CFI_MFR_INTEL)
+		if ((cfi->device_type == CFI_DEVICETYPE_X8 &&
+		     (cfi->id & 0xff) == 0x7e) ||
+		    (cfi->device_type == CFI_DEVICETYPE_X16 &&
+		     cfi->id == 0x227e))
+			return 1;
+	return 0;
+}
+
+/*
+ * From TN-13-07: Patching the Linux Kernel and U-Boot for M29 Flash, page 20:
+ * Some revisions of the M29EW suffer from erase suspend hang ups. In
+ * particular, it can occur when the sequence
+ * Erase Confirm -> Suspend -> Program -> Resume
+ * causes a lockup due to internal timing issues. The consequence is that the
+ * erase cannot be resumed without inserting a dummy command after programming
+ * and prior to resuming. [...] The work-around is to issue a dummy write cycle
+ * that writes an F0 command code before the RESUME command.
+ */
+static void cfi_fixup_m29ew_erase_suspend(struct map_info *map,
+					  unsigned long adr)
+{
+	struct cfi_private *cfi = map->fldrv_priv;
+	/* before resume, insert a dummy 0xF0 cycle for Micron M29EW devices */
+	if (is_m29ew(cfi))
+		map_write(map, CMD(0xF0), adr);
+}
+
+/*
+ * From TN-13-07: Patching the Linux Kernel and U-Boot for M29 Flash, page 22:
+ *
+ * Some revisions of the M29EW (for example, A1 and A2 step revisions)
+ * are affected by a problem that could cause a hang up when an ERASE SUSPEND
+ * command is issued after an ERASE RESUME operation without waiting for a
+ * minimum delay. The result is that once the ERASE seems to be completed
+ * (no bits are toggling), the contents of the Flash memory block on which
+ * the erase was ongoing could be inconsistent with the expected values
+ * (typically, the array value is stuck to the 0xC0, 0xC4, 0x80, or 0x84
+ * values), causing a consequent failure of the ERASE operation.
+ * The occurrence of this issue could be high, especially when file system
+ * operations on the Flash are intensive. As a result, it is recommended
+ * that a patch be applied. Intensive file system operations can cause many
+ * calls to the garbage routine to free Flash space (also by erasing physical
+ * Flash blocks) and as a result, many consecutive SUSPEND and RESUME
+ * commands can occur. The problem disappears when a delay is inserted after
+ * the RESUME command by using the udelay() function available in Linux.
+ * The DELAY value must be tuned based on the customer's platform.
+ * The maximum value that fixes the problem in all cases is 500us.
+ * But, in our experience, a delay of 30 us to 50 us is sufficient
+ * in most cases.
+ * We have chosen 500us because this latency is acceptable.
+ */
+static void cfi_fixup_m29ew_delay_after_resume(struct cfi_private *cfi)
+{
+	/*
+	 * Resolving the Delay After Resume Issue see Micron TN-13-07
+	 * Worstcase delay must be 500us but 30-50us should be ok as well
+	 */
+	if (is_m29ew(cfi))
+		cfi_udelay(500);
+}
+
 struct mtd_info *cfi_cmdset_0002(struct map_info *map, int primary)
 {
 	struct cfi_private *cfi = map->fldrv_priv;
@@ -761,7 +825,10 @@ static void put_chip(struct map_info *map, struct flchip *chip, unsigned long ad
 
 	switch(chip->oldstate) {
 	case FL_ERASING:
+		cfi_fixup_m29ew_erase_suspend(map,
+			chip->in_progress_block_addr);
 		map_write(map, cfi->sector_erase_cmd, chip->in_progress_block_addr);
+		cfi_fixup_m29ew_delay_after_resume(cfi);
 		chip->oldstate = FL_READY;
 		chip->state = FL_ERASING;
 		break;
@@ -903,6 +970,8 @@ static void __xipram xip_udelay(struct map_info *map, struct flchip *chip,
 			/* Disallow XIP again */
 			local_irq_disable();
 
+			/* Correct Erase Suspend Hangups for M29EW */
+			cfi_fixup_m29ew_erase_suspend(map, adr);
 			/* Resume the write or erase operation */
 			map_write(map, cfi->sector_erase_cmd, adr);
 			chip->state = oldstate;
-- 
1.7.1

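The command ordering produced by the erase-suspend quirk in the patch above can be checked outside the kernel. The following is only a minimal userspace sketch, not code from the patch: `fake_cfi`, `fake_bus`, `fake_write()`, `fake_is_m29ew()` and `fake_erase_resume()` are hypothetical stand-ins for the kernel's `cfi_private`, `map_info` and `map_write()`, and the resume command value 0x30 is an assumption about what `cfi->sector_erase_cmd` holds on these parts. The manufacturer and device IDs are the ones compared against in the RFC patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's CFI definitions (illustration only). */
#define FAKE_MFR_INTEL 0x0089	/* manufacturer ID the RFC patch compares against */
enum fake_device_type { FAKE_X8 = 1, FAKE_X16 = 2 };

struct fake_cfi {
	uint16_t mfr;
	uint16_t id;
	enum fake_device_type device_type;
};

/* Hypothetical bus model: records command bytes instead of touching hardware. */
#define FAKE_LOG_MAX 8
struct fake_bus {
	uint8_t cmds[FAKE_LOG_MAX];
	int n;
};

static void fake_write(struct fake_bus *bus, uint8_t cmd)
{
	assert(bus->n < FAKE_LOG_MAX);
	bus->cmds[bus->n++] = cmd;
}

/* Same ID test as is_m29ew() in the patch, written with early returns. */
static int fake_is_m29ew(const struct fake_cfi *cfi)
{
	if (cfi->mfr != FAKE_MFR_INTEL)
		return 0;
	if (cfi->device_type == FAKE_X8)
		return (cfi->id & 0xff) == 0x7e;
	if (cfi->device_type == FAKE_X16)
		return cfi->id == 0x227e;
	return 0;
}

/* Erase resume: dummy 0xF0 cycle first on M29EW only, then the resume command. */
static void fake_erase_resume(struct fake_bus *bus, const struct fake_cfi *cfi)
{
	if (fake_is_m29ew(cfi))
		fake_write(bus, 0xF0);
	fake_write(bus, 0x30);	/* assumed value of cfi->sector_erase_cmd */
}
```

Calling `fake_erase_resume()` with an x16 device reporting manufacturer 0x0089 and ID 0x227e records 0xF0 followed by 0x30; with any other manufacturer ID only 0x30 is recorded, which is the "function just returns if the chip ID is not M29" behaviour requested in review.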
* Re: [PATCH] mtd: cfi_cmdset_0002: Micron M29EW bugfixes as per TN-13-07
From: Artem Bityutskiy @ 2012-07-16 14:29 UTC
To: Gerlando Falauto; +Cc: Stefan Bigler, Holger Brunck, linux-mtd

On Tue, 2012-07-03 at 09:09 +0200, Gerlando Falauto wrote:
> Fix the following issues with Micron's (formerly Numonyx)
> M29EW NOR flash chips, as documented on TN-13-07:
> - Correcting Erase Suspend Hang Ups (page 20)
> - Resolving the Delay After Resume Issue (page 22)

Pushed to l2-mtd.git, thanks!

-- 
Best Regards,
Artem Bityutskiy

* Re: [PATCH] mtd: cfi_cmdset_0002: Micron M29EW bugfixes as per TN-13-07
From: David Woodhouse @ 2013-02-12 14:50 UTC
To: Gerlando Falauto
Cc: Holger Brunck, Erwan Velu, linux-mtd, Stefan Bigler, Artem Bityutskiy

On Tue, 2012-07-03 at 09:09 +0200, Gerlando Falauto wrote:
> +/*
> + * From TN-13-07: Patching the Linux Kernel and U-Boot for M29 Flash, page 22:
> + *
> + * Some revisions of the M29EW (for example, A1 and A2 step revisions)
> + * are affected by a problem that could cause a hang up when an ERASE SUSPEND
> + * command is issued after an ERASE RESUME operation without waiting for a
> + * minimum delay. The result is that once the ERASE seems to be completed
> + * (no bits are toggling), the contents of the Flash memory block on which
> + * the erase was ongoing could be inconsistent with the expected values
> + * (typically, the array value is stuck to the 0xC0, 0xC4, 0x80, or 0x84
> + * values), causing a consequent failure of the ERASE operation.
> + * The occurrence of this issue could be high, especially when file system
> + * operations on the Flash are intensive. As a result, it is recommended
> + * that a patch be applied. Intensive file system operations can cause many
> + * calls to the garbage routine to free Flash space (also by erasing physical
> + * Flash blocks) and as a result, many consecutive SUSPEND and RESUME
> + * commands can occur. The problem disappears when a delay is inserted after
> + * the RESUME command by using the udelay() function available in Linux.
> + * The DELAY value must be tuned based on the customer's platform.
> + * The maximum value that fixes the problem in all cases is 500us.
> + * But, in our experience, a delay of 30 us to 50 us is sufficient
> + * in most cases.
> + * We have chosen 500us because this latency is acceptable.
> + */
> +static void cfi_fixup_m29ew_delay_after_resume(struct cfi_private *cfi)
> +{
> +	/*
> +	 * Resolving the Delay After Resume Issue see Micron TN-13-07
> +	 * Worstcase delay must be 500us but 30-50us should be ok as well
> +	 */
> +	if (is_m29ew(cfi))
> +		cfi_udelay(500);
> +}

Hm, this would be better off done without a hard delay right there, but
instead just note the timestamp. Then use your existing hook in erase
suspend to check that it's been long enough since the last resume, and
have a *conditional* delay if not.

This assumes you have a timer with enough precision, of course.

-- 
dwmw2

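The timestamp approach suggested above can be sketched as follows. This is a userspace illustration under stated assumptions, not a proposed kernel change: the clock is a simulated microsecond counter held in a hypothetical `fake_chip` structure, `fake_udelay()` merely advances that counter, and `note_erase_resume()` and `wait_before_erase_suspend()` are invented names. In the driver the timestamp would presumably live in the per-chip state and come from a real high-resolution clock, which is exactly the precision caveat raised in the message. The 500 us bound is the worst-case value quoted from TN-13-07.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case resume-to-suspend spacing quoted from TN-13-07. */
#define ERASE_RESUME_TO_SUSPEND_US 500u

/* Hypothetical chip state with a simulated monotonic clock (microseconds). */
struct fake_chip {
	uint64_t now_us;		/* simulated current time */
	uint64_t last_resume_us;	/* time of the last ERASE RESUME */
	int resumed_once;		/* no constraint before the first resume */
};

/* Stand-in for udelay(): advances the simulated clock instead of spinning. */
static void fake_udelay(struct fake_chip *chip, uint64_t us)
{
	chip->now_us += us;
}

/* Call right after issuing ERASE RESUME: record the time, do not wait. */
static void note_erase_resume(struct fake_chip *chip)
{
	chip->last_resume_us = chip->now_us;
	chip->resumed_once = 1;
}

/*
 * Call right before issuing ERASE SUSPEND: wait only for whatever part of
 * the minimum spacing has not yet elapsed. Returns the time waited in us.
 */
static uint64_t wait_before_erase_suspend(struct fake_chip *chip)
{
	uint64_t elapsed, remaining;

	if (!chip->resumed_once)
		return 0;

	assert(chip->now_us >= chip->last_resume_us);
	elapsed = chip->now_us - chip->last_resume_us;
	if (elapsed >= ERASE_RESUME_TO_SUSPEND_US)
		return 0;

	remaining = ERASE_RESUME_TO_SUSPEND_US - elapsed;
	fake_udelay(chip, remaining);
	return remaining;
}
```

With this shape, a suspend requested 120 us after a resume waits the remaining 380 us, while a suspend requested 700 us after a resume does not wait at all. The unconditional `cfi_udelay(500)` in the patch pays the full 500 us on every resume regardless of when the next suspend arrives.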
* [PATCH RFC 2/2] mtd: cfi_cmdset_0002: Micron M29EW bugfix "Resolving the Delay After Resume Issue"
From: Gerlando Falauto @ 2012-06-18 7:24 UTC
To: linux-mtd; +Cc: Holger Brunck, Leo, Stefan Bigler, Gerlando Falauto

From TN-13-07: Patching the Linux Kernel and U-Boot for M29 Flash, page 22:

  Some revisions of the M29EW (for example, A1 and A2 step revisions)
  are affected by a problem that could cause a hang up when an ERASE
  SUSPEND command is issued after an ERASE RESUME operation without
  waiting for a minimum delay. The result is that once the ERASE seems
  to be completed (no bits are toggling), the contents of the Flash
  memory block on which the erase was ongoing could be inconsistent with
  the expected values (typically, the array value is stuck to the 0xC0,
  0xC4, 0x80, or 0x84 values), causing a consequent failure of the ERASE
  operation.
  The occurrence of this issue could be high, especially when file
  system operations on the Flash are intensive. As a result, it is
  recommended that a patch be applied. Intensive file system operations
  can cause many calls to the garbage routine to free Flash space (also
  by erasing physical Flash blocks) and as a result, many consecutive
  SUSPEND and RESUME commands can occur. The problem disappears when a
  delay is inserted after the RESUME command by using the udelay()
  function available in Linux.
  The DELAY value must be tuned based on the customer’s platform.
  The maximum value that fixes the problem in all cases is 500us.
  But, in our experience, a delay of 30μs to 50μs is sufficient in most
  cases.

We have chosen 500us because this latency is acceptable.

Signed-off-by: Stefan Bigler <stefan.bigler@keymile.com>
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Holger Brunck <holger.brunck@keymile.com>
Cc: Leo <leo.costa77@gmail.com>
---
 drivers/mtd/chips/cfi_cmdset_0002.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 72f6164..7fb24dc 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -767,6 +767,10 @@ static void put_chip(struct map_info *map, struct flchip *chip, unsigned long ad
 			|| ((cfi->device_type == CFI_DEVICETYPE_X16) && (cfi->id == 0x227e))) )
 			map_write(map, CMD(0xF0), chip->in_progress_block_addr);
 		map_write(map, cfi->sector_erase_cmd, chip->in_progress_block_addr);
+		/* Resolving the Delay After Resume Issue see Micron TN-13-07 */
+		/* Worstcase delay must be 500us but 30-50us should be ok as well
+		   nbigls has choosen 500us because this latency is acceptable */
+		udelay(500);
 		chip->oldstate = FL_READY;
 		chip->state = FL_ERASING;
 		break;
-- 
1.7.1

end of thread, other threads:[~2013-02-12 14:50 UTC | newest]

Thread overview: 7+ messages
2012-06-18  7:24 [PATCH RFC 0/2] Micron (formerly Numonyx) M29EW NOR flash issues Gerlando Falauto
2012-06-18  7:24 ` [PATCH RFC 1/2] mtd: cfi_cmdset_0002: Micron M29EW bugfix "Correcting Erase Suspend Hang Ups" Gerlando Falauto
2012-06-27 10:27   ` Artem Bityutskiy
2012-07-03  7:09     ` [PATCH] mtd: cfi_cmdset_0002: Micron M29EW bugfixes as per TN-13-07 Gerlando Falauto
2012-07-16 14:29       ` Artem Bityutskiy
2013-02-12 14:50       ` David Woodhouse
2012-06-18  7:24 ` [PATCH RFC 2/2] mtd: cfi_cmdset_0002: Micron M29EW bugfix "Resolving the Delay After Resume Issue" Gerlando Falauto