All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] use hrtimer in nand_wait
@ 2012-05-21  8:42 Johan Gunnarsson
  2012-05-21  8:42 ` [PATCH 1/2] mtd: nand: panic_nand_wait expects timeout in ms Johan Gunnarsson
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-21  8:42 UTC (permalink / raw)
  To: linux-mtd; +Cc: jespern

Hello all,

I've been researching a bug where blocks have gone bad when combining NAND writes with long periods of disabled interrupts. Such as lots of serial port writes (think printk) in interrupt context.

I've narrowed it down to the nand_wait routine and its dependency on a reliable jiffies counter. Sadly, jiffies is not reliable when handling of timer interrupts are delayed or even completely discarded. If interrupts are disabled for, say, 3 timer periods, jiffies will stop counting during this time and have a very fast increment by 3 when interrupts are later enabled. This combined with unfortunate timing can cause the timeout loop think a 20ms timeout is happening when just <0.1ms has passed in wall clock time.

To illustrate the jiffies/interrupt-relationship:

Interrupts: |      |      |                    |      |      |      |
Jiffies:    |      |      |                    |||    |      |      |

This obviously only happen on multi-core CPUs, where the write and interrupts are executed by different cores simultaneously. Switching to hrtimer-based timeout solves this problem for me. I found a second (less serious) issue which included in the first patch.

Johan


Johan Gunnarsson (2):
  mtd: nand: panic_nand_wait expects timeout in ms.
  mtd: nand: use hrtimer to measure timeout in nand_wait{_ready,}

 drivers/mtd/nand/nand_base.c |   42 ++++++++++++++++++++++++++++++++++--------
 1 files changed, 34 insertions(+), 8 deletions(-)

-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] mtd: nand: panic_nand_wait expects timeout in ms.
  2012-05-21  8:42 [PATCH 0/2] use hrtimer in nand_wait Johan Gunnarsson
@ 2012-05-21  8:42 ` Johan Gunnarsson
  2012-05-21  8:42 ` [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, } Johan Gunnarsson
  2012-05-22  7:23 ` [PATCH 0/2] use hrtimer in nand_wait Artem Bityutskiy
  2 siblings, 0 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-21  8:42 UTC (permalink / raw)
  To: linux-mtd; +Cc: jespern

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@axis.com>
---
 drivers/mtd/nand/nand_base.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index eb88d8b..b927e64 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -867,14 +867,16 @@ static void panic_nand_wait(struct mtd_info *mtd, struct nand_chip *chip,
  */
 static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 {
-
+	unsigned long timeout_ms;
 	unsigned long timeo = jiffies;
 	int status, state = chip->state;
 
 	if (state == FL_ERASING)
-		timeo += (HZ * 400) / 1000;
+		timeout_ms = 400;
 	else
-		timeo += (HZ * 20) / 1000;
+		timeout_ms = 20;
+
+	timeo += (HZ * timeout_ms) / 1000;
 
 	led_trigger_event(nand_led_trigger, LED_FULL);
 
@@ -890,7 +892,7 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 		chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
 
 	if (in_interrupt() || oops_in_progress)
-		panic_nand_wait(mtd, chip, timeo);
+		panic_nand_wait(mtd, chip, timeout_ms);
 	else {
 		while (time_before(jiffies, timeo)) {
 			if (chip->dev_ready) {
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-21  8:42 [PATCH 0/2] use hrtimer in nand_wait Johan Gunnarsson
  2012-05-21  8:42 ` [PATCH 1/2] mtd: nand: panic_nand_wait expects timeout in ms Johan Gunnarsson
@ 2012-05-21  8:42 ` Johan Gunnarsson
  2012-05-22  7:53   ` Artem Bityutskiy
  2012-05-22  7:23 ` [PATCH 0/2] use hrtimer in nand_wait Artem Bityutskiy
  2 siblings, 1 reply; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-21  8:42 UTC (permalink / raw)
  To: linux-mtd; +Cc: jespern

Previous, jiffy-based, implementation had three issues:

1. For low HZ values (<100 and >=50) the timeout would be 1 jiffy,
   which can cause premature timeouts if a timer interrupt happens
   right after the "timeo" value is assigned.
2. For very low HZ values (<50) the timeout is 0 jiffies, which also
   cause premature timeouts.
3. The jiffies counter is not reliable on multicore systems when
   interrupts are disabled for long times (a couple of timer interrupt
   periods). For example with excessive printk and serial output in
   interrupt context. The jiffies counting stops during the period
   with disabled interrupts and recovers by counting up all lost jiffy
   increments very, very fast when interrupts are later enabled. This,
   together with bad luck, can cause a third type of premature
   timeouts.

All three issues can cause good blocks being marked bad.

Signed-off-by: Johan Gunnarsson <johan.gunnarsson@axis.com>
---
 drivers/mtd/nand/nand_base.c |   36 ++++++++++++++++++++++++++++++------
 1 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index b927e64..0db920b 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -47,6 +47,7 @@
 #include <linux/bitops.h>
 #include <linux/leds.h>
 #include <linux/io.h>
+#include <linux/hrtimer.h>
 #include <linux/mtd/partitions.h>
 
 /* Define default oob placement schemes for large and small page devices */
@@ -99,6 +100,13 @@ static int nand_get_device(struct nand_chip *chip, struct mtd_info *mtd,
 static int nand_do_write_oob(struct mtd_info *mtd, loff_t to,
 			     struct mtd_oob_ops *ops);
 
+/**
+ * NOP nand_wait timeout callback.
+ */
+static enum hrtimer_restart nand_wait_timeout_callback(struct hrtimer* timer) {
+	return HRTIMER_NORESTART;
+}
+
 /*
  * For devices which display every fart in the system on a separate LED. Is
  * compiled away when LED support is disabled.
@@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct mtd_info *mtd, unsigned long timeo)
 void nand_wait_ready(struct mtd_info *mtd)
 {
 	struct nand_chip *chip = mtd->priv;
-	unsigned long timeo = jiffies + 2;
+	struct hrtimer timer;
 
 	/* 400ms timeout */
 	if (in_interrupt() || oops_in_progress)
 		return panic_nand_wait_ready(mtd, 400);
 
 	led_trigger_event(nand_led_trigger, LED_FULL);
+
+	/* Arm timeout timer for 20ms timeout */
+	hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL);
+	timer.function = nand_wait_timeout_callback;
+	hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000),
+	              HRTIMER_MODE_REL);
+
 	/* Wait until command is processed or timeout occurs */
 	do {
 		if (chip->dev_ready(mtd))
 			break;
 		touch_softlockup_watchdog();
-	} while (time_before(jiffies, timeo));
+	} while (ktime_to_ns(hrtimer_get_expires(&timer)) > 0);
+
+	hrtimer_cancel(&timer);
+
 	led_trigger_event(nand_led_trigger, LED_OFF);
 }
 EXPORT_SYMBOL_GPL(nand_wait_ready);
@@ -868,7 +886,6 @@ static void panic_nand_wait(struct mtd_info *mtd, struct nand_chip *chip,
 static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 {
 	unsigned long timeout_ms;
-	unsigned long timeo = jiffies;
 	int status, state = chip->state;
 
 	if (state == FL_ERASING)
@@ -876,8 +893,6 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 	else
 		timeout_ms = 20;
 
-	timeo += (HZ * timeout_ms) / 1000;
-
 	led_trigger_event(nand_led_trigger, LED_FULL);
 
 	/*
@@ -894,7 +909,14 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 	if (in_interrupt() || oops_in_progress)
 		panic_nand_wait(mtd, chip, timeout_ms);
 	else {
-		while (time_before(jiffies, timeo)) {
+		struct hrtimer timer;
+
+		hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL);
+		timer.function = nand_wait_timeout_callback;
+		hrtimer_start(&timer, ns_to_ktime(timeout_ms * 1000 * 1000),
+		              HRTIMER_MODE_REL);
+
+		while(ktime_to_ns(hrtimer_get_expires(&timer)) > 0) {
 			if (chip->dev_ready) {
 				if (chip->dev_ready(mtd))
 					break;
@@ -904,6 +926,8 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 			}
 			cond_resched();
 		}
+
+		hrtimer_cancel(&timer);
 	}
 	led_trigger_event(nand_led_trigger, LED_OFF);
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] use hrtimer in nand_wait
  2012-05-21  8:42 [PATCH 0/2] use hrtimer in nand_wait Johan Gunnarsson
  2012-05-21  8:42 ` [PATCH 1/2] mtd: nand: panic_nand_wait expects timeout in ms Johan Gunnarsson
  2012-05-21  8:42 ` [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, } Johan Gunnarsson
@ 2012-05-22  7:23 ` Artem Bityutskiy
  2012-05-22  8:37   ` Johan Gunnarsson
  2 siblings, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2012-05-22  7:23 UTC (permalink / raw)
  To: Johan Gunnarsson; +Cc: linux-mtd, jespern

[-- Attachment #1: Type: text/plain, Size: 1502 bytes --]

On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote:
> I've narrowed it down to the nand_wait routine and its dependency on a
> reliable jiffies counter. Sadly, jiffies is not reliable when handling
> of timer interrupts are delayed or even completely discarded. If
> interrupts are disabled for, say, 3 timer periods, jiffies will stop
> counting during this time and have a very fast increment by 3 when
> interrupts are later enabled.

I can follow up to this point.

>  This combined with unfortunate timing can cause the timeout loop
> think a 20ms timeout is happening when just <0.1ms has passed in wall
> clock time.

What is you HZ? Let's say it is 100, then jiffie increments every 10ms,
right? How can it increment by 2 after 0.1 ms?
> 
> To illustrate the jiffies/interrupt-relationship:
> 
> Interrupts: |      |      |                    |      |      |      |
> Jiffies:    |      |      |                    |||    |      |      |
> 
> This obviously only happen on multi-core CPUs, where the write and
> interrupts are executed by different cores simultaneously. Switching
> to hrtimer-based timeout solves this problem for me. I found a second
> (less serious) issue which included in the first patch.

Sorry, I do not understand why this happens only on SMP. Could you
please explain some more?

I do not disagree that we should stop using jiffies, if we can, I just
want to understand what is happening.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-21  8:42 ` [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, } Johan Gunnarsson
@ 2012-05-22  7:53   ` Artem Bityutskiy
  2012-05-22  8:52     ` Johan Gunnarsson
  0 siblings, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2012-05-22  7:53 UTC (permalink / raw)
  To: Johan Gunnarsson, Brian Norris; +Cc: linux-mtd, jespern

[-- Attachment #1: Type: text/plain, Size: 1787 bytes --]

On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote:
> @@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct mtd_info *mtd, unsigned long timeo)
>  void nand_wait_ready(struct mtd_info *mtd)
>  {
>  	struct nand_chip *chip = mtd->priv;
> -	unsigned long timeo = jiffies + 2;
> +	struct hrtimer timer;
>  
>  	/* 400ms timeout */
>  	if (in_interrupt() || oops_in_progress)
>  		return panic_nand_wait_ready(mtd, 400);
>  
>  	led_trigger_event(nand_led_trigger, LED_FULL);
> +
> +	/* Arm timeout timer for 20ms timeout */
> +	hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL);
> +	timer.function = nand_wait_timeout_callback;
> +	hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000),
> +	              HRTIMER_MODE_REL);
> +
>  	/* Wait until command is processed or timeout occurs */
>  	do {
>  		if (chip->dev_ready(mtd))
>  			break;
>  		touch_softlockup_watchdog();
> -	} while (time_before(jiffies, timeo));

Si this function waits for the chip, but if we cannot access the "ready"
pin - it waits for 400msecs? Where this number 400 comes from? Why not
500? How many chips we have which do not provide "dev_ready()" function?
Can we simplify this function and assume everyone has "dev_ready()" and
add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can
we do like this:

if (!dev_ready()) {
	msleep(400);
	return
}

do {
} while

Why should we iterate if "dev_ready" is NULL?

My point is that this stuff ancient horror which needs love and cleaning
up. Your does not make it less scary. I'd prefer to first clean the
whole thing up, and then fix it.

I'm CCing Brian who spent a lot of time digging nand_base.c recently, he
has probably got some ideas.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 0/2] use hrtimer in nand_wait
  2012-05-22  7:23 ` [PATCH 0/2] use hrtimer in nand_wait Artem Bityutskiy
@ 2012-05-22  8:37   ` Johan Gunnarsson
  0 siblings, 0 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-22  8:37 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd, Jesper Nilsson



> -----Original Message-----
> From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
> Sent: den 22 maj 2012 09:23
> To: Johan Gunnarsson
> Cc: linux-mtd@lists.infradead.org; Jesper Nilsson
> Subject: Re: [PATCH 0/2] use hrtimer in nand_wait
> 
> On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote:
> > I've narrowed it down to the nand_wait routine and its dependency on
> a
> > reliable jiffies counter. Sadly, jiffies is not reliable when
> handling
> > of timer interrupts are delayed or even completely discarded. If
> > interrupts are disabled for, say, 3 timer periods, jiffies will stop
> > counting during this time and have a very fast increment by 3 when
> > interrupts are later enabled.
> 
> I can follow up to this point.
> 
> >  This combined with unfortunate timing can cause the timeout loop
> > think a 20ms timeout is happening when just <0.1ms has passed in wall
> > clock time.
> 
> What is you HZ? Let's say it is 100, then jiffie increments every 10ms,
> right? How can it increment by 2 after 0.1 ms?

It is supposed to increment by 1 each clock tick. But if interrupts are disabled for a number of clock ticks, all these increments will be lumped together and incremented in a loop when interrupts are enabled again. Sort of to catch up.

The relevant parts of the kernel where this happens is

kernel/time/tick_common.c:tick_handle_periodic()
kernel/time/tick_common.c:tick_periodic()
kernel/time/clockevents.c:clockevents_program_event()

Timer interrupt handler calls tick_handle_periodic through the clockevents framework. The loop in tick_handle_periodic will iterate one extra time for each missed tick. clockevents_program_event knows if the next timer to be programmed is in the past by reading back the current HW timer value through the clocksource framework. The actual jiffies increment happens in do_timer through tick_periodic.

Also, please note that we're using a one-shot time that is reprogrammed for each tick. Not a periodic timer.

> >
> > To illustrate the jiffies/interrupt-relationship:
> >
> > Interrupts: |      |      |                    |      |      |      |
> > Jiffies:    |      |      |                    |||    |      |      |
> >
> > This obviously only happen on multi-core CPUs, where the write and
> > interrupts are executed by different cores simultaneously. Switching
> > to hrtimer-based timeout solves this problem for me. I found a second
> > (less serious) issue which included in the first patch.
> 
> Sorry, I do not understand why this happens only on SMP. Could you
> please explain some more?

The only way this can happen is by executing a nand_write at the same time as the jiffy counter is "catching up" (ie. running the loop in tick_handle_periodic as described above.) On a single-core system I don't think that would be possible since the catch-up happens in interrupt context.

> 
> I do not disagree that we should stop using jiffies, if we can, I just
> want to understand what is happening.
> 
> --
> Best Regards,
> Artem Bityutskiy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22  7:53   ` Artem Bityutskiy
@ 2012-05-22  8:52     ` Johan Gunnarsson
  2012-05-22 10:25       ` Artem Bityutskiy
  2012-05-22 17:10       ` Ivan Djelic
  0 siblings, 2 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-22  8:52 UTC (permalink / raw)
  To: dedekind1, Brian Norris; +Cc: linux-mtd, Jesper Nilsson



> -----Original Message-----
> From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
> Sent: den 22 maj 2012 09:53
> To: Johan Gunnarsson; Brian Norris
> Cc: linux-mtd@lists.infradead.org; Jesper Nilsson
> Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in
> nand_wait{_ready, }
> 
> On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote:
> > @@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct
> mtd_info *mtd, unsigned long timeo)
> >  void nand_wait_ready(struct mtd_info *mtd)
> >  {
> >  	struct nand_chip *chip = mtd->priv;
> > -	unsigned long timeo = jiffies + 2;
> > +	struct hrtimer timer;
> >
> >  	/* 400ms timeout */
> >  	if (in_interrupt() || oops_in_progress)
> >  		return panic_nand_wait_ready(mtd, 400);
> >
> >  	led_trigger_event(nand_led_trigger, LED_FULL);
> > +
> > +	/* Arm timeout timer for 20ms timeout */
> > +	hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL);
> > +	timer.function = nand_wait_timeout_callback;
> > +	hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000),
> > +	              HRTIMER_MODE_REL);
> > +
> >  	/* Wait until command is processed or timeout occurs */
> >  	do {
> >  		if (chip->dev_ready(mtd))
> >  			break;
> >  		touch_softlockup_watchdog();
> > -	} while (time_before(jiffies, timeo));
> 
> Si this function waits for the chip, but if we cannot access the
> "ready"
> pin - it waits for 400msecs? Where this number 400 comes from? Why not
> 500? How many chips we have which do not provide "dev_ready()"
> function?

Not an expert on the MTD framework -- but I believe the 400 comes from the comment above nand_wait(): "Wait for command done. This applies to erase and program only. Erase can take up to 400ms and program up to 20ms according to general NAND and SmartMedia specs."

> Can we simplify this function and assume everyone has "dev_ready()" and
> add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can

The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations.

> we do like this:
> 
> if (!dev_ready()) {
> 	msleep(400);
> 	return
> }
> 
> do {
> } while
> 
> Why should we iterate if "dev_ready" is NULL?

Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right?

> 
> My point is that this stuff ancient horror which needs love and
> cleaning
> up. Your does not make it less scary. I'd prefer to first clean the
> whole thing up, and then fix it.
> 
> I'm CCing Brian who spent a lot of time digging nand_base.c recently,
> he
> has probably got some ideas.
> 
> --
> Best Regards,
> Artem Bityutskiy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22  8:52     ` Johan Gunnarsson
@ 2012-05-22 10:25       ` Artem Bityutskiy
  2012-05-22 14:24         ` Johan Gunnarsson
  2012-05-22 17:10       ` Ivan Djelic
  1 sibling, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2012-05-22 10:25 UTC (permalink / raw)
  To: Johan Gunnarsson; +Cc: Brian Norris, linux-mtd, Jesper Nilsson

[-- Attachment #1: Type: text/plain, Size: 709 bytes --]

On Tue, 2012-05-22 at 10:52 +0200, Johan Gunnarsson wrote:
> Not an expert on the MTD framework -- but I believe the 400 comes from
> the comment above nand_wait(): "Wait for command done. This applies to
> erase and program only. Erase can take up to 400ms and program up to
> 20ms according to general NAND and SmartMedia specs."

Most probably this legacy can be killed along with the IPAQ h1910 driver
you pointed. But could you please start with cleaning these functions up
and turning the "!chip->dev_ready()" case into a simple msleep() or
mdelay() for the panic case?

Then I may get some ideas how to get rid of the jiffies without using
hrtimer.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22 10:25       ` Artem Bityutskiy
@ 2012-05-22 14:24         ` Johan Gunnarsson
  0 siblings, 0 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-22 14:24 UTC (permalink / raw)
  To: dedekind1; +Cc: Brian Norris, linux-mtd, Jesper Nilsson



> -----Original Message-----
> From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
> Sent: den 22 maj 2012 12:25
> To: Johan Gunnarsson
> Cc: Brian Norris; linux-mtd@lists.infradead.org; Jesper Nilsson
> Subject: RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in
> nand_wait{_ready, }
> 
> On Tue, 2012-05-22 at 10:52 +0200, Johan Gunnarsson wrote:
> > Not an expert on the MTD framework -- but I believe the 400 comes
> from
> > the comment above nand_wait(): "Wait for command done. This applies
> to
> > erase and program only. Erase can take up to 400ms and program up to
> > 20ms according to general NAND and SmartMedia specs."
> 
> Most probably this legacy can be killed along with the IPAQ h1910
> driver you pointed. But could you please start with cleaning these
> functions up and turning the "!chip->dev_ready()" case into a simple
> msleep() or
> mdelay() for the panic case?
> 

I don't understand what clean-up is needed here. Can you please elaborate?

The case you mentioned earlier doesn't really make sense to me:

if (!dev_ready()) {
	msleep(400);
	return
}

Making a single call to dev_ready() and giving up if FALSE is wrong IMHO, since we very much want to poll dev_ready() and return as soon as it turns TRUE (or timeout.) Assuming you actually mean "if(!dev_ready)) {" also doesn't make sense since checks like that exists on all path leading to nand_wait_ready().

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22  8:52     ` Johan Gunnarsson
  2012-05-22 10:25       ` Artem Bityutskiy
@ 2012-05-22 17:10       ` Ivan Djelic
  2012-05-22 18:21         ` Artem Bityutskiy
                           ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Ivan Djelic @ 2012-05-22 17:10 UTC (permalink / raw)
  To: Johan Gunnarsson; +Cc: Brian Norris, linux-mtd, Jesper Nilsson, dedekind1

On Tue, May 22, 2012 at 09:52:22AM +0100, Johan Gunnarsson wrote:
> > -----Original Message-----
> > From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
> > Sent: den 22 maj 2012 09:53
> > To: Johan Gunnarsson; Brian Norris
> > Cc: linux-mtd@lists.infradead.org; Jesper Nilsson
> > Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in
> > nand_wait{_ready, }

(...)

> > Si this function waits for the chip, but if we cannot access the
> > "ready"
> > pin - it waits for 400msecs? Where this number 400 comes from? Why not
> > 500? How many chips we have which do not provide "dev_ready()"
> > function?
> 
> Not an expert on the MTD framework -- but I believe the 400 comes from the comment above nand_wait(): "Wait for command done. This applies to erase and program only. Erase can take up to 400ms and program up to 20ms according to general NAND and SmartMedia specs."
> 

Hi Johan and Artem,

Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation.
But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in
trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase).

Why not simplify and just use a single 1s (1000ms) timeout ?

Johan,
with such a timeout value, your bug would manifest itself only if interrupts are blocked for nearly a second: would that be acceptable ?

> > Can we simplify this function and assume everyone has "dev_ready()" and
> > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can
> 
> The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations.
> 
> > we do like this:
> > 
> > if (!dev_ready()) {
> > 	msleep(400);
> > 	return
> > }
> > 
> > do {
> > } while
> > 
> > Why should we iterate if "dev_ready" is NULL?

nand_wait() polls the device until it is ready or a timeout has elapsed.
More details here: http://lists.infradead.org/pipermail/linux-mtd/2011-June/036795.html


> Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right?
> 
> > 
> > My point is that this stuff ancient horror which needs love and
> > cleaning
> > up. Your does not make it less scary. I'd prefer to first clean the
> > whole thing up, and then fix it.

I agree it needs some cleaning; here are a few things I have noticed:

* panic_nand_wait() and panic_nand_wait_ready() are nearly identical

* Some patterns are repeated:
	/* Apply delay or wait for ready/busy pin */
	if (!chip->dev_ready)
		udelay(chip->chip_delay);
	else
		nand_wait_ready(mtd);

* the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ?

* nand_wait() and nand_wait_ready() use different strategies to wait:
  - nand_wait_ready = busy loop
  - nand_wait       = busy loop with cond_resched call
  Presumably, this is because nand_wait_ready() should not wait for more than ~25 us (typically),
  so it is more efficient to avoid the cond_resched() call ?

* We could separate dev_ready() and STATUS polling inside nand_wait(), or even better, directly call nand_wait_ready() from nand_wait() ?
  The only difference between the 2 functions (besides cond_resched) is that nand_wait_ready() does not send any STATUS command: as a consequence,
  it does not interfere with the device mode (polling the device with STATUS reads while waiting during a read operation requires
  re-sending a READ0 command at the end to return to read mode).

I would suggest the following approach:

1. Use a single timeout value (1s for instance).

2. Rewrite and simplify nand_wait() as follows:
   a) loop until dev_ready() returns true OR timeout
   b) send a STATUS command
   c) loop until status is ready OR timeout, and return NAND status

The idea is that in most cases, dev_ready() will be defined and functional, and the loop in c) will finish immediately.
If dev_ready() is not available or defective, step c) still ensures we properly wait.
Step a) could be replaced with a call to a modified version of nand_wait_ready() with optional cond_resched-uling.

What do you think ? Any suggestions ?
I'm willing to submit patches if we reach a consensus on this,

Best regards,
--
Ivan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22 17:10       ` Ivan Djelic
@ 2012-05-22 18:21         ` Artem Bityutskiy
  2012-05-23  6:39         ` Brian Norris
  2012-05-23  8:14         ` Johan Gunnarsson
  2 siblings, 0 replies; 14+ messages in thread
From: Artem Bityutskiy @ 2012-05-22 18:21 UTC (permalink / raw)
  To: Ivan Djelic; +Cc: Brian Norris, linux-mtd, Jesper Nilsson, Johan Gunnarsson

[-- Attachment #1: Type: text/plain, Size: 1075 bytes --]

On Tue, 2012-05-22 at 19:10 +0200, Ivan Djelic wrote:
> Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation.
> But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in
> trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase).
> 
> Why not simplify and just use a single 1s (1000ms) timeout ?


I have many things on my plate now so my answers are are sloppy - I
spent 1 minute or less looking at the code. But I thought that this
timeout is important for the the case when 'chip->device_ready' is NULL.
This is why I wanted to get rid of that case, and then the timeout would
only become about detecting the forever loops. And this is exactly what
I meant by saying that I have further idea on improving that without
hrtimer - just have one single 1 second timeout. We indeed do not have
to be precise for the errors detection.

Oops, SIGBABY.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22 17:10       ` Ivan Djelic
  2012-05-22 18:21         ` Artem Bityutskiy
@ 2012-05-23  6:39         ` Brian Norris
  2012-05-23  8:36           ` Ivan Djelic
  2012-05-23  8:14         ` Johan Gunnarsson
  2 siblings, 1 reply; 14+ messages in thread
From: Brian Norris @ 2012-05-23  6:39 UTC (permalink / raw)
  To: Ivan Djelic, Johan Gunnarsson, dedekind1; +Cc: linux-mtd, Jesper Nilsson

Hi,

On Tue, May 22, 2012 at 10:10 AM, Ivan Djelic <ivan.djelic@parrot.com> wrote:
> On Tue, May 22, 2012 at 09:52:22AM +0100, Johan Gunnarsson wrote:
>> > -----Original Message-----
>> > From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
>> > Sent: den 22 maj 2012 09:53
>> > To: Johan Gunnarsson; Brian Norris
>> > Cc: linux-mtd@lists.infradead.org; Jesper Nilsson
>> > Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in
>> > nand_wait{_ready, }
>
> (...)
>
>> > Si this function waits for the chip, but if we cannot access the
>> > "ready"
>> > pin - it waits for 400msecs? Where this number 400 comes from? Why not
>> > 500? How many chips we have which do not provide "dev_ready()"
>> > function?
>
> Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation.
> But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in
> trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase).
>
> Why not simplify and just use a single 1s (1000ms) timeout ?
>
>> > Can we simplify this function and assume everyone has "dev_ready()" and
>> > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can
>>
>> The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations.

Not true: there are several drivers which do not have dev_ready, but
this is also because they are controller-based, so they provide their
own cmdfunc as well, mostly avoiding the nand_wait_ready stuff.
(Drivers: denali.c, docg4.c, my out-of-tree driver, ...)

Note: another use of nand_wait_ready is in nand_do_read_ops and
nand_do_read_oob, under the following condition:
    if (!(chip->options & NAND_NO_READRDY)) {
Isn't NAND_NO_READRDY no longer needed, since I killed autoincrement?
(i.e., we *always* "do not support autoincrement", as its comment says
in nand.h) So if we kill this option, we can kill the only use of
nand_wait_ready outside of cmdfunc(); this brings us closer to being
able to using BUG_ON(!chip->dev_ready). I'll send a patch for this,
and you guys can comment.

>> Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right?

No, I don't think so. It looks like nand_wait_ready is never called
when dev_ready is NULL.

>> >
>> > My point is that this stuff ancient horror which needs love and
>> > cleaning
>> > up. Your does not make it less scary. I'd prefer to first clean the
>> > whole thing up, and then fix it.
>
> I agree it needs some cleaning; here are a few things I have noticed:

Even though I'm mostly not a user of this code, I agree :)

> * Some patterns are repeated:
>        /* Apply delay or wait for ready/busy pin */
>        if (!chip->dev_ready)
>                udelay(chip->chip_delay);
>        else
>                nand_wait_ready(mtd);

I'm going to kill this pattern, I think.

> * the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ?

Clarification: only one driver does leaves both chip->dev_ready and
chip->cmdfunc NULL, right?

> What do you think ? Any suggestions ?
> I'm willing to submit patches if we reach a consensus on this,

Feel free to send a patch; it's probably easier to talk more
concretely about a patch that implements your (seemingly reasonable)
suggestions, and then provide comments based on concrete ideas.

Brian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-22 17:10       ` Ivan Djelic
  2012-05-22 18:21         ` Artem Bityutskiy
  2012-05-23  6:39         ` Brian Norris
@ 2012-05-23  8:14         ` Johan Gunnarsson
  2 siblings, 0 replies; 14+ messages in thread
From: Johan Gunnarsson @ 2012-05-23  8:14 UTC (permalink / raw)
  To: Ivan Djelic; +Cc: Brian Norris, linux-mtd, Jesper Nilsson, dedekind1

> Why not simplify and just use a single 1s (1000ms) timeout ?
> 
> Johan,
> with such a timeout value, your bug would manifest itself only if
> interrupts are blocked for nearly a second: would that be acceptable ?

That indeed would be acceptable from my side.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, }
  2012-05-23  6:39         ` Brian Norris
@ 2012-05-23  8:36           ` Ivan Djelic
  0 siblings, 0 replies; 14+ messages in thread
From: Ivan Djelic @ 2012-05-23  8:36 UTC (permalink / raw)
  To: Brian Norris; +Cc: dedekind1, linux-mtd, Jesper Nilsson, Johan Gunnarsson

On Wed, May 23, 2012 at 07:39:51AM +0100, Brian Norris wrote:
(...)
> >> > Can we simplify this function and assume everyone has "dev_ready()" and
> >> > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can
> >>
> >> The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations.
> 
> Not true: there are several drivers which do not have dev_ready, but
> this is also because they are controller-based, so they provide their
> own cmdfunc as well, mostly avoiding the nand_wait_ready stuff.
> (Drivers: denali.c, docg4.c, my out-of-tree driver, ...)
> 
> Note: another use of nand_wait_ready is in nand_do_read_ops and
> nand_do_read_oob, under the following condition:
>     if (!(chip->options & NAND_NO_READRDY)) {
> Isn't NAND_NO_READRDY no longer needed, since I killed autoincrement?
> (i.e., we *always* "do not support autoincrement", as its comment says
> in nand.h) So if we kill this option, we can kill the only use of
> nand_wait_ready outside of cmdfunc(); this brings us closer to being
> able to using BUG_ON(!chip->dev_ready). I'll send a patch for this,
> and you guys can comment.

Agreed.
 
> >> Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right?
> 
> No, I don't think so. It looks like nand_wait_ready is never called
> when dev_ready is NULL.

True, but since nand_wait_ready() is a public function, it could be called from a driver that has (dev_ready == NULL), resulting in a crash.
So all drivers must make sure they define dev_ready if they are going to call nand_wait_ready() (like cafe_nand.c).

(...)
> 
> > * the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ?
> 
> Clarification: only one driver does leaves both chip->dev_ready and
> chip->cmdfunc NULL, right?

Yes, exactly. I meant that we could get rid of the udelay in the default chip->cmdfunc implementations (nand_command and nand_command_lp).
In other words, if chip->cmdfunc == NULL, we require chip->dev_ready != NULL. And move the existing udelay in an implementation of chip->dev_ready for the h1910 driver.

> Feel free to send a patch; it's probably easier to talk more
> concretely about a patch that implements your (seemingly reasonable)
> suggestions, and then provide comments based on concrete ideas.

Will do,
Thanks,

--
Ivan

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-05-23  8:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-21  8:42 [PATCH 0/2] use hrtimer in nand_wait Johan Gunnarsson
2012-05-21  8:42 ` [PATCH 1/2] mtd: nand: panic_nand_wait expects timeout in ms Johan Gunnarsson
2012-05-21  8:42 ` [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in nand_wait{_ready, } Johan Gunnarsson
2012-05-22  7:53   ` Artem Bityutskiy
2012-05-22  8:52     ` Johan Gunnarsson
2012-05-22 10:25       ` Artem Bityutskiy
2012-05-22 14:24         ` Johan Gunnarsson
2012-05-22 17:10       ` Ivan Djelic
2012-05-22 18:21         ` Artem Bityutskiy
2012-05-23  6:39         ` Brian Norris
2012-05-23  8:36           ` Ivan Djelic
2012-05-23  8:14         ` Johan Gunnarsson
2012-05-22  7:23 ` [PATCH 0/2] use hrtimer in nand_wait Artem Bityutskiy
2012-05-22  8:37   ` Johan Gunnarsson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.