[PATCH 0/2] use hrtimer in nand_wait

From: Johan Gunnarsson @ 2012-05-21  8:42 UTC
  To: linux-mtd; +Cc: jespern

Hello all,

I've been investigating a bug where blocks go bad when NAND writes are combined with long periods of disabled interrupts, for example lots of serial port writes (think printk) in interrupt context.

I've narrowed it down to the nand_wait routine and its dependency on a reliable jiffies counter. Sadly, jiffies is not reliable when the handling of timer interrupts is delayed or the interrupts are discarded completely. If interrupts are disabled for, say, 3 timer periods, jiffies stops counting during that time and then very quickly increments by 3 once interrupts are enabled again. Combined with unfortunate timing, this can cause the timeout loop to think a 20ms timeout has expired when less than 0.1ms of wall clock time has passed.

To illustrate the jiffies/interrupt-relationship:

Interrupts: |      |      |                    |      |      |      |
Jiffies:    |      |      |                    |||    |      |      |
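To put rough numbers on the picture above, here is a small userspace model of the effect. It is only a sketch and not kernel code: the HZ=100 tick period, the 10us polling step and the names model_jiffies and real_wait_us are all assumptions made for illustration.

```c
#include <assert.h>

/* Userspace model only; nothing here is kernel code. */
#define TICK_US 10000L  /* assumed HZ=100: one timer tick every 10 ms */
#define STEP_US 10L     /* assumed polling step of the modelled wait loop */

/*
 * Value of the modelled tick counter at real time t_us. Tick handling is
 * blocked during [off_start_us, off_end_us); once it is unblocked, all
 * missed ticks are accounted for at once.
 */
static long model_jiffies(long t_us, long off_start_us, long off_end_us)
{
    if (t_us >= off_start_us && t_us < off_end_us)
        return off_start_us / TICK_US;  /* counter is frozen */
    return t_us / TICK_US;              /* counter has caught up */
}

/*
 * Real time (in us) that passes before a jiffies-style deadline, set at
 * start_us for timeout_ticks ticks, is considered expired.
 */
static long real_wait_us(long start_us, long timeout_ticks,
                         long off_start_us, long off_end_us)
{
    long deadline;
    long t = start_us;

    assert(timeout_ticks > 0);
    deadline = model_jiffies(start_us, off_start_us, off_end_us) + timeout_ticks;

    /* Same shape as a "while (time_before(jiffies, timeo))" loop. */
    while (model_jiffies(t, off_start_us, off_end_us) < deadline)
        t += STEP_US;

    return t - start_us;
}
```

In this model, real_wait_us(54950, 2, 25000, 55000) starts a 2-tick (20ms) wait just before tick handling resumes after three missed ticks, and reports a timeout after only 50us of real time. With no blocked window, real_wait_us(54950, 2, 0, 0) gives 15050us, which is the usual jiffies granularity.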

This obviously only happens on multi-core CPUs, where the write and the interrupts are executed by different cores simultaneously. Switching to an hrtimer-based timeout solves this problem for me. I also found a second (less serious) issue, which is fixed in the first patch.
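The idea of measuring the timeout against a clock that does not depend on tick handling can be sketched in userspace as follows. This is an analogy only and not the code in the patches: it uses POSIX clock_gettime(CLOCK_MONOTONIC) where the kernel side would use hrtimer/ktime facilities, and wait_ready, now_ns, never_ready and ready_after_polls are hypothetical names.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Monotonic time in nanoseconds, read directly from the clock source. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Poll is_ready(ctx) until it returns true or timeout_ms of real time has
 * passed. Returns true if the device became ready, false on timeout.
 */
static bool wait_ready(bool (*is_ready)(void *), void *ctx, long timeout_ms)
{
    long long deadline;

    assert(is_ready != NULL);
    deadline = now_ns() + timeout_ms * 1000000LL;

    do {
        if (is_ready(ctx))
            return true;
    } while (now_ns() < deadline);

    return false;
}

/* Hypothetical stand-in for a status poll that never reports ready. */
static bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Hypothetical stand-in that reports ready once the counter in ctx hits 0. */
static bool ready_after_polls(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}
```

Because the deadline is compared against elapsed real time, wait_ready(never_ready, NULL, 5) cannot return before 5ms have passed, no matter how a tick counter behaves in the meantime.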

Johan

Johan Gunnarsson (2):
  mtd: nand: panic_nand_wait expects timeout in ms.
  mtd: nand: use hrtimer to measure timeout in nand_wait{_ready,}

 drivers/mtd/nand/nand_base.c |   42 ++++++++++++++++++++++++++++++++++--------
 1 files changed, 34 insertions(+), 8 deletions(-)

-- 
1.7.2.5
