All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Use architected timers for delay loop
@ 2012-06-29 17:33 Will Deacon
  2012-06-29 17:33 ` [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
  0 siblings, 2 replies; 38+ messages in thread
From: Will Deacon @ 2012-06-29 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

This is version 2 of the patches originally posted here:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/105719.html

Thanks to Stephen, Rob and Russell for their comments. The changes since
version 1 are:

	- Fixed compilation breakage when arch timers are not configured
	- Use common definitions for mult/shift between C and asm

Tested on A15 and A9 vexpress platforms (i.e. with and without the
architected timers).

All comments welcome,

Will


Will Deacon (2):
  ARM: arch timer: implement read_current_timer and get_cycles
  ARM: delay: allow timer-based delay implementation to be selected

 arch/arm/include/asm/arch_timer.h      |    3 +
 arch/arm/include/asm/delay.h           |   32 ++++++++++++---
 arch/arm/include/asm/timex.h           |   10 +++--
 arch/arm/kernel/arch_timer.c           |   11 +++++
 arch/arm/kernel/armksyms.c             |    3 +-
 arch/arm/lib/Makefile                  |    2 +-
 arch/arm/lib/{delay.S => delay-loop.S} |   20 ++++-----
 arch/arm/lib/delay.c                   |   66 ++++++++++++++++++++++++++++++++
 arch/arm/mach-sa1100/sleep.S           |    8 ++--
 9 files changed, 126 insertions(+), 29 deletions(-)
 rename arch/arm/lib/{delay.S => delay-loop.S} (81%)
 create mode 100644 arch/arm/lib/delay.c

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-29 17:33 [PATCH v2 0/2] Use architected timers for delay loop Will Deacon
@ 2012-06-29 17:33 ` Will Deacon
  2012-07-02 19:14   ` Stephen Boyd
  2012-07-05 12:35   ` Shinya Kuribayashi
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
  1 sibling, 2 replies; 38+ messages in thread
From: Will Deacon @ 2012-06-29 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

This patch implements read_current_timer using the architected timers
when they are selected via CONFIG_ARM_ARCH_TIMER. If they are detected
not to be usable at runtime, we return -ENXIO to the caller.

Furthermore, if read_current_timer is exported then we can implement
get_cycles in terms of it for use as both an entropy source and for
implementing __udelay and friends.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/arch_timer.h |    3 +++
 arch/arm/include/asm/timex.h      |   10 ++++++----
 arch/arm/kernel/arch_timer.c      |    8 ++++++++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h
index ed2e95d..62e7547 100644
--- a/arch/arm/include/asm/arch_timer.h
+++ b/arch/arm/include/asm/arch_timer.h
@@ -1,7 +1,10 @@
 #ifndef __ASMARM_ARCH_TIMER_H
 #define __ASMARM_ARCH_TIMER_H
 
+#include <asm/errno.h>
+
 #ifdef CONFIG_ARM_ARCH_TIMER
+#define ARCH_HAS_READ_CURRENT_TIMER
 int arch_timer_of_register(void);
 int arch_timer_sched_clock_init(void);
 #else
diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
index 3be8de3..ce11944 100644
--- a/arch/arm/include/asm/timex.h
+++ b/arch/arm/include/asm/timex.h
@@ -12,13 +12,15 @@
 #ifndef _ASMARM_TIMEX_H
 #define _ASMARM_TIMEX_H
 
+#include <asm/arch_timer.h>
 #include <mach/timex.h>
 
 typedef unsigned long cycles_t;
 
-static inline cycles_t get_cycles (void)
-{
-	return 0;
-}
+#ifdef ARCH_HAS_READ_CURRENT_TIMER
+#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
+#else
+#define get_cycles()	(0)
+#endif
 
 #endif
diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
index dd58035..dbbeec4 100644
--- a/arch/arm/kernel/arch_timer.c
+++ b/arch/arm/kernel/arch_timer.c
@@ -223,6 +223,14 @@ static cycle_t arch_counter_read(struct clocksource *cs)
 	return arch_counter_get_cntpct();
 }
 
+int read_current_timer(unsigned long *timer_val)
+{
+	if (!arch_timer_rate)
+		return -ENXIO;
+	*timer_val = arch_counter_get_cntpct();
+	return 0;
+}
+
 static struct clocksource clocksource_counter = {
 	.name	= "arch_sys_counter",
 	.rating	= 400,
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-29 17:33 [PATCH v2 0/2] Use architected timers for delay loop Will Deacon
  2012-06-29 17:33 ` [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
@ 2012-06-29 17:33 ` Will Deacon
  2012-07-02 19:14   ` Stephen Boyd
                     ` (3 more replies)
  1 sibling, 4 replies; 38+ messages in thread
From: Will Deacon @ 2012-06-29 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

This patch allows a timer-based delay implementation to be selected by
switching the delay routines over to use get_cycles, which is
implemented in terms of read_current_timer. This further allows us to
skip the loop calibration and have a consistent delay function in the
face of core frequency scaling.

To avoid the pain of dealing with memory-mapped counters, this
implementation uses the co-processor interface to the architected timers
when they are available. The previous loop-based implementation is
kept around for CPUs without the architected timers and we retain both
the maximum delay (2ms) and the corresponding conversion factors for
determining the number of loops required for a given interval. Since the
indirection of the timer routines will only work when called from C,
the sa1100 sleep routines are modified to branch to the loop-based delay
functions directly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/delay.h           |   32 ++++++++++++---
 arch/arm/kernel/arch_timer.c           |    3 +
 arch/arm/kernel/armksyms.c             |    3 +-
 arch/arm/lib/Makefile                  |    2 +-
 arch/arm/lib/{delay.S => delay-loop.S} |   20 ++++-----
 arch/arm/lib/delay.c                   |   66 ++++++++++++++++++++++++++++++++
 arch/arm/mach-sa1100/sleep.S           |    8 ++--
 7 files changed, 109 insertions(+), 25 deletions(-)
 rename arch/arm/lib/{delay.S => delay-loop.S} (81%)
 create mode 100644 arch/arm/lib/delay.c

diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index b2deda1..dc61451 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -6,9 +6,22 @@
 #ifndef __ASM_ARM_DELAY_H
 #define __ASM_ARM_DELAY_H
 
+#include <asm/memory.h>
 #include <asm/param.h>	/* HZ */
 
-extern void __delay(int loops);
+#define MAX_UDELAY_MS	2
+#define UDELAY_MULT	((UL(2199023) * HZ) >> 11)
+#define UDELAY_SHIFT	30
+
+#ifndef __ASSEMBLY__
+
+extern struct arm_delay_ops {
+	void (*delay)(unsigned long);
+	void (*const_udelay)(unsigned long);
+	void (*udelay)(unsigned long);
+} arm_delay_ops;
+
+#define __delay(n)		arm_delay_ops.delay(n)
 
 /*
  * This function intentionally does not exist; if you see references to
@@ -23,22 +36,27 @@ extern void __bad_udelay(void);
  * division by multiplication: you don't have to worry about
  * loss of precision.
  *
- * Use only for very small delays ( < 1 msec).  Should probably use a
+ * Use only for very small delays ( < 2 msec).  Should probably use a
  * lookup table, really, as the multiplications take much too long with
  * short delays.  This is a "reasonable" implementation, though (and the
  * first constant multiplications gets optimized away if the delay is
  * a constant)
  */
-extern void __udelay(unsigned long usecs);
-extern void __const_udelay(unsigned long);
-
-#define MAX_UDELAY_MS 2
+#define __udelay(n)		arm_delay_ops.udelay(n)
+#define __const_udelay(n)	arm_delay_ops.const_udelay(n)
 
 #define udelay(n)							\
 	(__builtin_constant_p(n) ?					\
 	  ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() :		\
-			__const_udelay((n) * ((2199023U*HZ)>>11))) :	\
+			__const_udelay((n) * UDELAY_MULT)) :		\
 	  __udelay(n))
 
+/* Loop-based definitions for assembly code. */
+extern void __loop_delay(unsigned long loops);
+extern void __loop_udelay(unsigned long usecs);
+extern void __loop_const_udelay(unsigned long);
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* defined(_ARM_DELAY_H) */
 
diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
index dbbeec4..675cee0 100644
--- a/arch/arm/kernel/arch_timer.c
+++ b/arch/arm/kernel/arch_timer.c
@@ -32,6 +32,8 @@ static int arch_timer_ppi2;
 
 static struct clock_event_device __percpu **arch_timer_evt;
 
+extern void init_current_timer_delay(unsigned long freq);
+
 /*
  * Architected system timer support.
  */
@@ -304,6 +306,7 @@ static int __init arch_timer_register(void)
 	if (err)
 		goto out_free_irq;
 
+	init_current_timer_delay(arch_timer_rate);
 	return 0;
 
 out_free_irq:
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index b57c75e..7196228 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -49,8 +49,7 @@ extern void __aeabi_ulcmp(void);
 extern void fpundefinstr(void);
 
 	/* platform dependent support */
-EXPORT_SYMBOL(__udelay);
-EXPORT_SYMBOL(__const_udelay);
+EXPORT_SYMBOL(arm_delay_ops);
 
 	/* networking */
 EXPORT_SYMBOL(csum_partial);
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 992769a..b621114 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -6,7 +6,7 @@
 
 lib-y		:= backtrace.o changebit.o csumipv6.o csumpartial.o   \
 		   csumpartialcopy.o csumpartialcopyuser.o clearbit.o \
-		   delay.o findbit.o memchr.o memcpy.o		      \
+		   delay.o delay-loop.o findbit.o memchr.o memcpy.o   \
 		   memmove.o memset.o memzero.o setbit.o              \
 		   strncpy_from_user.o strnlen_user.o                 \
 		   strchr.o strrchr.o                                 \
diff --git a/arch/arm/lib/delay.S b/arch/arm/lib/delay-loop.S
similarity index 81%
rename from arch/arm/lib/delay.S
rename to arch/arm/lib/delay-loop.S
index 3c9a05c..36b668d 100644
--- a/arch/arm/lib/delay.S
+++ b/arch/arm/lib/delay-loop.S
@@ -9,11 +9,11 @@
  */
 #include <linux/linkage.h>
 #include <asm/assembler.h>
-#include <asm/param.h>
+#include <asm/delay.h>
 		.text
 
 .LC0:		.word	loops_per_jiffy
-.LC1:		.word	(2199023*HZ)>>11
+.LC1:		.word	UDELAY_MULT
 
 /*
  * r0  <= 2000
@@ -21,10 +21,10 @@
  * HZ  <= 1000
  */
 
-ENTRY(__udelay)
+ENTRY(__loop_udelay)
 		ldr	r2, .LC1
 		mul	r0, r2, r0
-ENTRY(__const_udelay)				@ 0 <= r0 <= 0x7fffff06
+ENTRY(__loop_const_udelay)			@ 0 <= r0 <= 0x7fffff06
 		mov	r1, #-1
 		ldr	r2, .LC0
 		ldr	r2, [r2]		@ max = 0x01ffffff
@@ -39,12 +39,10 @@ ENTRY(__const_udelay)				@ 0 <= r0 <= 0x7fffff06
 
 /*
  * loops = r0 * HZ * loops_per_jiffy / 1000000
- *
- * Oh, if only we had a cycle counter...
  */
 
 @ Delay routine
-ENTRY(__delay)
+ENTRY(__loop_delay)
 		subs	r0, r0, #1
 #if 0
 		movls	pc, lr
@@ -62,8 +60,8 @@ ENTRY(__delay)
 		movls	pc, lr
 		subs	r0, r0, #1
 #endif
-		bhi	__delay
+		bhi	__loop_delay
 		mov	pc, lr
-ENDPROC(__udelay)
-ENDPROC(__const_udelay)
-ENDPROC(__delay)
+ENDPROC(__loop_udelay)
+ENDPROC(__loop_const_udelay)
+ENDPROC(__loop_delay)
diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
new file mode 100644
index 0000000..e1030e1
--- /dev/null
+++ b/arch/arm/lib/delay.c
@@ -0,0 +1,66 @@
+/*
+ * Delay loops based on the OpenRISC implementation.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/timex.h>
+
+/*
+ * Default to the loop-based delay implementation.
+ */
+struct arm_delay_ops arm_delay_ops = {
+	.delay		= __loop_delay,
+	.const_udelay	= __loop_const_udelay,
+	.udelay		= __loop_udelay,
+};
+
+#ifdef ARCH_HAS_READ_CURRENT_TIMER
+static void __timer_delay(unsigned long cycles)
+{
+	cycles_t start = get_cycles();
+
+	while ((get_cycles() - start) < cycles)
+		cpu_relax();
+}
+
+static void __timer_const_udelay(unsigned long xloops)
+{
+	unsigned long long loops = xloops;
+	loops *= loops_per_jiffy;
+	__timer_delay(loops >> UDELAY_SHIFT);
+}
+
+static void __timer_udelay(unsigned long usecs)
+{
+	__timer_const_udelay(usecs * UDELAY_MULT);
+}
+
+void __init init_current_timer_delay(unsigned long freq)
+{
+	pr_info("Switching to timer-based delay loop\n");
+	lpj_fine			= freq / HZ;
+	arm_delay_ops.delay		= __timer_delay;
+	arm_delay_ops.const_udelay	= __timer_const_udelay;
+	arm_delay_ops.udelay		= __timer_udelay;
+}
+#endif
diff --git a/arch/arm/mach-sa1100/sleep.S b/arch/arm/mach-sa1100/sleep.S
index 30cc672..8586374 100644
--- a/arch/arm/mach-sa1100/sleep.S
+++ b/arch/arm/mach-sa1100/sleep.S
@@ -38,9 +38,9 @@ ENTRY(sa1100_finish_suspend)
 	orr     r4, r4, #MDREFR_K1DB2
 	ldr	r5, =PPCR
 
-	@ Pre-load __udelay into the I-cache
+	@ Pre-load __loop_udelay into the I-cache
 	mov	r0, #1
-	bl	__udelay
+	bl	__loop_udelay
 	mov	r0, r0
 
 	@ The following must all exist in a single cache line to
@@ -53,11 +53,11 @@ ENTRY(sa1100_finish_suspend)
 	@ delay 90us and set CPU PLL to lowest speed
 	@ fixes resume problem on high speed SA1110
 	mov	r0, #90
-	bl	__udelay
+	bl	__loop_udelay
 	mov	r1, #0
 	str	r1, [r5]
 	mov	r0, #90
-	bl	__udelay
+	bl	__loop_udelay
 
 	/*
 	 * SA1110 SDRAM controller workaround.  register values:
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-29 17:33 ` [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
@ 2012-07-02 19:14   ` Stephen Boyd
  2012-07-05 12:35   ` Shinya Kuribayashi
  1 sibling, 0 replies; 38+ messages in thread
From: Stephen Boyd @ 2012-07-02 19:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/29/12 10:33, Will Deacon wrote:
> This patch implements read_current_timer using the architected timers
> when they are selected via CONFIG_ARM_ARCH_TIMER. If they are detected
> not to be usable at runtime, we return -ENXIO to the caller.
>
> Furthermore, if read_current_timer is exported then we can implement
> get_cycles in terms of it for use as both an entropy source and for
> implementing __udelay and friends.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
@ 2012-07-02 19:14   ` Stephen Boyd
  2012-07-02 21:53     ` Will Deacon
  2012-07-03 12:09   ` Shinya Kuribayashi
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Stephen Boyd @ 2012-07-02 19:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/29/12 10:33, Will Deacon wrote:
> This patch allows a timer-based delay implementation to be selected by
> switching the delay routines over to use get_cycles, which is
> implemented in terms of read_current_timer. This further allows us to
> skip the loop calibration and have a consistent delay function in the
> face of core frequency scaling.
>
> To avoid the pain of dealing with memory-mapped counters, this
> implementation uses the co-processor interface to the architected timers
> when they are available. The previous loop-based implementation is
> kept around for CPUs without the architected timers and we retain both
> the maximum delay (2ms) and the corresponding conversion factors for
> determining the number of loops required for a given interval. Since the
> indirection of the timer routines will only work when called from C,
> the sa1100 sleep routines are modified to branch to the loop-based delay
> functions directly.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-02 19:14   ` Stephen Boyd
@ 2012-07-02 21:53     ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-07-02 21:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jul 02, 2012 at 08:14:57PM +0100, Stephen Boyd wrote:
> On 06/29/12 10:33, Will Deacon wrote:
> > This patch allows a timer-based delay implementation to be selected by
> > switching the delay routines over to use get_cycles, which is
> > implemented in terms of read_current_timer. This further allows us to
> > skip the loop calibration and have a consistent delay function in the
> > face of core frequency scaling.
> >
> > To avoid the pain of dealing with memory-mapped counters, this
> > implementation uses the co-processor interface to the architected timers
> > when they are available. The previous loop-based implementation is
> > kept around for CPUs without the architected timers and we retain both
> > the maximum delay (2ms) and the corresponding conversion factors for
> > determining the number of loops required for a given interval. Since the
> > indirection of the timer routines will only work when called from C,
> > the sa1100 sleep routines are modified to branch to the loop-based delay
> > functions directly.
> >
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

Thanks for the reviews Stephen. I'll stick this in the patch system this
week.

Cheers,

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
  2012-07-02 19:14   ` Stephen Boyd
@ 2012-07-03 12:09   ` Shinya Kuribayashi
  2012-07-04 15:36     ` Will Deacon
  2012-07-05 13:06   ` Shinya Kuribayashi
  2012-07-12  7:33   ` Shinya Kuribayashi
  3 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-03 12:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On 6/30/2012 2:33 AM, Will Deacon wrote:
> implemented in terms of read_current_timer. This further allows us to
> skip the loop calibration and have a consistent delay function in the
> face of core frequency scaling.
[...]
> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
> new file mode 100644
> index 0000000..e1030e1
> --- /dev/null
> +++ b/arch/arm/lib/delay.c
> @@ -0,0 +1,66 @@
[...]
> +void __init init_current_timer_delay(unsigned long freq)
> +{
> +	pr_info("Switching to timer-based delay loop\n");
> +	lpj_fine			= freq / HZ;

Looking at init/calibrate.c, 'lpj_fine' can save the first loop
calibration, but does not save the second and subsequent calls to
calibrate_delay().

Magnus first noticed this, and changed to use 'preset_lpj' instead.
See the commit 173bf69a7a (ARM / mach-shmobile: Use preset_lpj with
calibrate_delay(), 2012-05-10).

If we use 'lpj_fine' for this, we need to skip secondary CPU calibration
explicitly in another way, something like this:

http://lists.infradead.org/pipermail/linux-arm-kernel/2011-January/039506.html
[PATCH 5/5] ARM: smp: Skip secondary cpu calibration to speed-up boot

I've been using Stephen's read_current_timer implemented + lpj_fine +
Santosh's skip secondary calibration combo for over a year, because I
wanted to save 'preset_lpj' for the original purpose, FWIW.


  Shinya

> +	arm_delay_ops.delay		= __timer_delay;
> +	arm_delay_ops.const_udelay	= __timer_const_udelay;
> +	arm_delay_ops.udelay		= __timer_udelay;
> +}
> +#endif

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-03 12:09   ` Shinya Kuribayashi
@ 2012-07-04 15:36     ` Will Deacon
  2012-07-05 12:12       ` Shinya Kuribayashi
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-04 15:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 03, 2012 at 01:09:31PM +0100, Shinya Kuribayashi wrote:
> Hello,

Hi Shinya,

> Looking at init/calibrate.c, 'lpj_fine' can save the first loop
> calibration, but does not save the second and subsequent calls to
> calibrate_delay().
> 
> Magnus first noticed this, and changed to use 'preset_lpj' instead.
> See the commit 173bf69a7a (ARM / mach-shmobile: Use preset_lpj with
> calibrate_delay(), 2012-05-10).
> 
> If we use 'lpj_fine' for this, we need to skip secondary CPU calibration
> explicitly in another way, something like this:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2011-January/039506.html
> [PATCH 5/5] ARM: smp: Skip secondary cpu calibration to speed-up boot

How about keeping it simple like this:?


diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index e1030e1..84bb5da 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -63,4 +63,9 @@ void __init init_current_timer_delay(unsigned long freq)
        arm_delay_ops.const_udelay      = __timer_const_udelay;
        arm_delay_ops.udelay            = __timer_udelay;
 }
+
+unsigned long __cpuinit calibrate_delay_is_known(void)
+{
+       return lpj_fine ?: 0;
+}
 #endif


Will

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-04 15:36     ` Will Deacon
@ 2012-07-05 12:12       ` Shinya Kuribayashi
  2012-07-05 12:56         ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-05 12:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 7/5/2012 12:36 AM, Will Deacon wrote:
>> If we use 'lpj_fine' for this, we need to skip secondary CPU calibration
>> explicitly in another way, something like this:
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2011-January/039506.html
>> [PATCH 5/5] ARM: smp: Skip secondary cpu calibration to speed-up boot
> 
> How about keeping it simple like this:?
> 
> 
> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
> index e1030e1..84bb5da 100644
> --- a/arch/arm/lib/delay.c
> +++ b/arch/arm/lib/delay.c
> @@ -63,4 +63,9 @@ void __init init_current_timer_delay(unsigned long freq)
>         arm_delay_ops.const_udelay      = __timer_const_udelay;
>         arm_delay_ops.udelay            = __timer_udelay;
>  }
> +
> +unsigned long __cpuinit calibrate_delay_is_known(void)
> +{
> +       return lpj_fine ?: 0;
> +}
>  #endif

Thanks for the patch, looks lika a missing piece of CPU calibration
optimization for SMP platforms in the face of core frequency scaling.

Ok, I gave your patch a try (including above), and confirmed that:

* It works fine with non-arch_timer counter.  I'm using SH/R-Mobile
  devices, with a memory mapped I/O, 32-bit free-run up-counter
  running at 13MHz.

* Secondary CPU calibration gets skipped as expected.

* Your new timer-based delay works as before (loop-based one).  I've
  verified 10..1999-microsecond busy-wait with a reasonable accuracy
  (and confirmed that 2000+ usec gets rejected as intended).

By the way,

> +       return lpj_fine ?: 0;

Is there any difference with just

        return lpj_fine;

?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-29 17:33 ` [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
  2012-07-02 19:14   ` Stephen Boyd
@ 2012-07-05 12:35   ` Shinya Kuribayashi
  2012-07-05 12:59     ` Will Deacon
  1 sibling, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-05 12:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 6/30/2012 2:33 AM, Will Deacon wrote:
>  arch/arm/include/asm/arch_timer.h |    3 +++
>  arch/arm/include/asm/timex.h      |   10 ++++++----
>  arch/arm/kernel/arch_timer.c      |    8 ++++++++
>  3 files changed, 17 insertions(+), 4 deletions(-)
> 
[...]
> diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
> index 3be8de3..ce11944 100644
> --- a/arch/arm/include/asm/timex.h
> +++ b/arch/arm/include/asm/timex.h
> @@ -12,13 +12,15 @@
>  #ifndef _ASMARM_TIMEX_H
>  #define _ASMARM_TIMEX_H
>  
> +#include <asm/arch_timer.h>
>  #include <mach/timex.h>
>  
>  typedef unsigned long cycles_t;
>  
> -static inline cycles_t get_cycles (void)
> -{
> -	return 0;
> -}
> +#ifdef ARCH_HAS_READ_CURRENT_TIMER
> +#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
> +#else
> +#define get_cycles()	(0)
> +#endif
>  
>  #endif

Except <asm/arch_timer.h> inclusion, this change is not arch_timer-
dependent, but general parts for ARM, connecting read_current_timer()
and get_cycles().  Should be folded into patch 2/2.


  Shinya

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-05 12:12       ` Shinya Kuribayashi
@ 2012-07-05 12:56         ` Will Deacon
  2012-07-05 16:51           ` Stephen Boyd
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-05 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 05, 2012 at 01:12:14PM +0100, Shinya Kuribayashi wrote:
> Ok, I gave your patch a try (including above), and confirmed that:
> 
> * It works fine with non-arch_timer counter.  I'm using SH/R-Mobile
>   devices, with a memory mapped I/O, 32-bit free-run up-counter
>   running at 13MHz.
> 
> * Secondary CPU calibration gets skipped as expected.
> 
> * Your new timer-based delay works as before (loop-based one).  I've
>   verified 10..1999-microsecond busy-wait with a reasonable accuracy
>   (and confirmed that 2000+ usec gets rejected as intended).

Great, thanks! Can I add your tested-by please?

> By the way,
> 
> > +       return lpj_fine ?: 0;
> 
> Is there any difference with just
> 
>         return lpj_fine;

Of course, I'll make that change, cheers.

Stephen -- can I keep your reviewed-by with the additional function please?

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-07-05 12:35   ` Shinya Kuribayashi
@ 2012-07-05 12:59     ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-07-05 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 05, 2012 at 01:35:51PM +0100, Shinya Kuribayashi wrote:
> On 6/30/2012 2:33 AM, Will Deacon wrote:
> > +#ifdef ARCH_HAS_READ_CURRENT_TIMER
> > +#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
> > +#else
> > +#define get_cycles()	(0)
> > +#endif
> >  
> >  #endif
> 
> Except <asm/arch_timer.h> inclusion, this change is not arch_timer-
> dependent, but general parts for ARM, connecting read_current_timer()
> and get_cycles().  Should be folded into patch 2/2.

I'm not so sure. Before this patch, nobody implements read_current_timer for
ARM as far as I can tell, so with the first user it makes sense to add this
definition. It's also used on its own outside of the new delay implementation
as an entropy source.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
  2012-07-02 19:14   ` Stephen Boyd
  2012-07-03 12:09   ` Shinya Kuribayashi
@ 2012-07-05 13:06   ` Shinya Kuribayashi
  2012-07-05 14:15     ` Will Deacon
  2012-07-12  7:33   ` Shinya Kuribayashi
  3 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-05 13:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 6/30/2012 2:33 AM, Will Deacon wrote:
> diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
> index dbbeec4..675cee0 100644
> --- a/arch/arm/kernel/arch_timer.c
> +++ b/arch/arm/kernel/arch_timer.c
> @@ -32,6 +32,8 @@ static int arch_timer_ppi2;
>  
>  static struct clock_event_device __percpu **arch_timer_evt;
>  
> +extern void init_current_timer_delay(unsigned long freq);
> +

This is needed for all users of the new timer-baed delay routine.
How about moving this extern declaration to more generic place, in
<asm/timex.h>?

>  /*
>   * Architected system timer support.
>   */
> @@ -304,6 +306,7 @@ static int __init arch_timer_register(void)
>  	if (err)
>  		goto out_free_irq;
>  
> +	init_current_timer_delay(arch_timer_rate);
>  	return 0;
>  
>  out_free_irq:

This is arch_timer-land change.  I'd prefer to see general part changes
first, followed by arch_timer ones making use of them.

> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
> new file mode 100644
> index 0000000..e1030e1
> --- /dev/null
> +++ b/arch/arm/lib/delay.c
> @@ -0,0 +1,66 @@
[...]
> +void __init init_current_timer_delay(unsigned long freq)
> +{
> +	pr_info("Switching to timer-based delay loop\n");
> +	lpj_fine			= freq / HZ;

I've been preparing for lpj_fine with rounding it up by HZ like this:

        lpj_fine = (freq + (HZ/2)) / HZ;

It's not necessarily required, but just in case.

> +	arm_delay_ops.delay		= __timer_delay;
> +	arm_delay_ops.const_udelay	= __timer_const_udelay;
> +	arm_delay_ops.udelay		= __timer_udelay;
> +}
> +#endif

I modified pr_info() at the end of function to print (lpj=%lu),

+	pr_info("Switching to timer-based delay loop (lpj=%lu)\n", lpj_fine);

but turned out to be printed twice, no-so-great:

sched_clock: 32 bits at 13MHz, resolution 76ns, wraps every 330382ms
cmt: CMCLKE e6131000 mapped to e6131000
sh_cmt.11 at 0xe6130100 probed with CMCLKE
sh_cmt.12 at 0xe6130200 probed with CMCLKE
sh_cmt.11 used for periodic clock events
sh_cmt.11 used for clock events
Switching to timer-based delay loop (lpj=101562) <<<
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency.. 26.04 BogoMIPS (lpj=101562) <<<

Anyway,

Tested-off-by: Shinya Kuribayashi <shinya.kuribayashi.px@renesas.com>

Thank for the work, hope this to be merged soon!


  Shinya

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-05 13:06   ` Shinya Kuribayashi
@ 2012-07-05 14:15     ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-07-05 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 05, 2012 at 02:06:13PM +0100, Shinya Kuribayashi wrote:
> On 6/30/2012 2:33 AM, Will Deacon wrote:
> > +extern void init_current_timer_delay(unsigned long freq);
> > +
> 
> This is needed for all users of the new timer-baed delay routine.
> How about moving this extern declaration to more generic place, in
> <asm/timex.h>?

I mentioned this before in the original posting. Currently, only the
architected timers are using this interface and I have some reservations
about using memory-mapped timers instead (although there may be specific
instances where they're suitable).

In this case, I'd rather see a registration mechanism for the current_timer
where we take a pointer to read_current_timer and register the delay at the
same time. That can be added when somebody else decides they want to make
use of the new delay routines.

> Tested-off-by: Shinya Kuribayashi <shinya.kuribayashi.px@renesas.com>

Thanks!

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-05 12:56         ` Will Deacon
@ 2012-07-05 16:51           ` Stephen Boyd
  0 siblings, 0 replies; 38+ messages in thread
From: Stephen Boyd @ 2012-07-05 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/05/12 05:56, Will Deacon wrote:
>
> Stephen -- can I keep your reviewed-by with the additional function please?
>
>

Yes, that's fine.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
                     ` (2 preceding siblings ...)
  2012-07-05 13:06   ` Shinya Kuribayashi
@ 2012-07-12  7:33   ` Shinya Kuribayashi
  2012-07-12  8:44     ` Will Deacon
  3 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-12  7:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will, Stephen,

On 6/30/2012 2:33 AM, Will Deacon wrote:
> +void __init init_current_timer_delay(unsigned long freq)
> +{
> +	pr_info("Switching to timer-based delay loop\n");
> +	lpj_fine			= freq / HZ;
> +	arm_delay_ops.delay		= __timer_delay;
> +	arm_delay_ops.const_udelay	= __timer_const_udelay;
> +	arm_delay_ops.udelay		= __timer_udelay;
> +}

Once this function is processed, the udelay() behavior changes
_immediately_ from loop-based delay to timer-based one, without waiting
for 'loops_per_jiffy' itself being corrected in calibrate_delay().

As a result, actual udelay()s may be toooo long than expected, in
particular udelay()s used between init_current_timer_delay() and
calibrate_delay().  It's unlikely be short, as the frequency of a
counter for read_current_timer is typically slower than CPU frequency.

I'm not sure udelay()s used in this period are legitimate, but if it's
there we'll in trouble somehow.  This is not so great.

If this assumption is correct, it might be better to adjust / correct
loops_per_jiffy itself along with lpj_fine, prior to calibrate_delay().

diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index cb881c3..9065d30 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -57,7 +57,7 @@ static void __timer_udelay(unsigned long usecs)
 
 void __init init_current_timer_delay(unsigned long freq)
 {
-	lpj_fine			= (freq + HZ/2) / HZ;
+	loops_per_jiffy = lpj_fine	= (freq + HZ/2) / HZ;
 	arm_delay_ops.delay		= __timer_delay;
 	arm_delay_ops.const_udelay	= __timer_const_udelay;
 	arm_delay_ops.udelay		= __timer_udelay;


What do you think?  Am I missing something?

  Shinya

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-12  7:33   ` Shinya Kuribayashi
@ 2012-07-12  8:44     ` Will Deacon
  2012-07-12  9:35       ` Shinya Kuribayashi
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-12  8:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 12, 2012 at 08:33:06AM +0100, Shinya Kuribayashi wrote:
> Hi Will, Stephen,

Hi Shinya,

> On 6/30/2012 2:33 AM, Will Deacon wrote:
> > +void __init init_current_timer_delay(unsigned long freq)
> > +{
> > +	pr_info("Switching to timer-based delay loop\n");
> > +	lpj_fine			= freq / HZ;
> > +	arm_delay_ops.delay		= __timer_delay;
> > +	arm_delay_ops.const_udelay	= __timer_const_udelay;
> > +	arm_delay_ops.udelay		= __timer_udelay;
> > +}
> 
> Once this function is processed, the udelay() behavior changes
> _immediately_ from loop-based delay to timer-based one, without waiting
> for 'loops_per_jiffy' itself being corrected in calibrate_delay().
> 
> As a result, actual udelay()s may be toooo long than expected, in
> particular udelay()s used between init_current_timer_delay() and
> calibrate_delay().  It's unlikely be short, as the frequency of a
> counter for read_current_timer is typically slower than CPU frequency.

Surely using udelay before calibrate_delay_loop has been called is a
fundamental error?

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-12  8:44     ` Will Deacon
@ 2012-07-12  9:35       ` Shinya Kuribayashi
  2012-07-12 16:40         ` Stephen Boyd
  0 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-12  9:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 7/12/2012 5:44 PM, Will Deacon wrote:
>> On 6/30/2012 2:33 AM, Will Deacon wrote:
>>> +void __init init_current_timer_delay(unsigned long freq)
>>> +{
>>> +	pr_info("Switching to timer-based delay loop\n");
>>> +	lpj_fine			= freq / HZ;
>>> +	arm_delay_ops.delay		= __timer_delay;
>>> +	arm_delay_ops.const_udelay	= __timer_const_udelay;
>>> +	arm_delay_ops.udelay		= __timer_udelay;
>>> +}
>>
>> Once this function is processed, the udelay() behavior changes
>> _immediately_ from loop-based delay to timer-based one, without waiting
>> for 'loops_per_jiffy' itself being corrected in calibrate_delay().
>>
>> As a result, actual udelay()s may be toooo long than expected, in
>> particular udelay()s used between init_current_timer_delay() and
>> calibrate_delay().  It's unlikely be short, as the frequency of a
>> counter for read_current_timer is typically slower than CPU frequency.
> 
> Surely using udelay before calibrate_delay_loop has been called is a
> fundamental error?

Got it.  I'm just not confident about disallowing early use of udelay().

  Shinya

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-12  9:35       ` Shinya Kuribayashi
@ 2012-07-12 16:40         ` Stephen Boyd
  2012-07-13  2:16           ` Shinya Kuribayashi
  0 siblings, 1 reply; 38+ messages in thread
From: Stephen Boyd @ 2012-07-12 16:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/12/12 02:35, Shinya Kuribayashi wrote:
> On 7/12/2012 5:44 PM, Will Deacon wrote:
>>> On 6/30/2012 2:33 AM, Will Deacon wrote:
>>>> +void __init init_current_timer_delay(unsigned long freq)
>>>> +{
>>>> +	pr_info("Switching to timer-based delay loop\n");
>>>> +	lpj_fine			= freq / HZ;
>>>> +	arm_delay_ops.delay		= __timer_delay;
>>>> +	arm_delay_ops.const_udelay	= __timer_const_udelay;
>>>> +	arm_delay_ops.udelay		= __timer_udelay;
>>>> +}
>>> Once this function is processed, the udelay() behavior changes
>>> _immediately_ from loop-based delay to timer-based one, without waiting
>>> for 'loops_per_jiffy' itself being corrected in calibrate_delay().
>>>
>>> As a result, actual udelay()s may be toooo long than expected, in
>>> particular udelay()s used between init_current_timer_delay() and
>>> calibrate_delay().  It's unlikely be short, as the frequency of a
>>> counter for read_current_timer is typically slower than CPU frequency.
>> Surely using udelay before calibrate_delay_loop has been called is a
>> fundamental error?
> Got it.  I'm just not confident about disallowing early use of udelay().
>
>

I don't think it's an error. Instead you get a very large delay, similar
to what would happen if you called udelay() before calibrate_delay()
anyway (see the comment in init/main.c above loops_per_jiffy).

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-12 16:40         ` Stephen Boyd
@ 2012-07-13  2:16           ` Shinya Kuribayashi
  2012-07-13  8:57             ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-13  2:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 7/13/2012 1:40 AM, Stephen Boyd wrote:
>>>> As a result, actual udelay()s may be toooo long than expected, in
>>>> particular udelay()s used between init_current_timer_delay() and
>>>> calibrate_delay().  It's unlikely be short, as the frequency of a
>>>> counter for read_current_timer is typically slower than CPU frequency.
>>> Surely using udelay before calibrate_delay_loop has been called is a
>>> fundamental error?
>> Got it.  I'm just not confident about disallowing early use of udelay().
>>
> 
> I don't think it's an error. Instead you get a very large delay, similar
> to what would happen if you called udelay() before calibrate_delay()
> anyway (see the comment in init/main.c above loops_per_jiffy).

Thanks, so I'd set up loops_per_jiffy early, along with lpj_fine in
init_current_timer_delay().

As a matter of fact, I do have early use of udelay()s in my tree :-)
Will give it a try for some time.
-- 
Shinya Kuribayashi
Renesas Electronics

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13  2:16           ` Shinya Kuribayashi
@ 2012-07-13  8:57             ` Will Deacon
  2012-07-13 10:48               ` Shilimkar, Santosh
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-13  8:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 03:16:41AM +0100, Shinya Kuribayashi wrote:
> On 7/13/2012 1:40 AM, Stephen Boyd wrote:
> >>>> As a result, actual udelay()s may be toooo long than expected, in
> >>>> particular udelay()s used between init_current_timer_delay() and
> >>>> calibrate_delay().  It's unlikely be short, as the frequency of a
> >>>> counter for read_current_timer is typically slower than CPU frequency.
> >>> Surely using udelay before calibrate_delay_loop has been called is a
> >>> fundamental error?
> >> Got it.  I'm just not confident about disallowing early use of udelay().
> >>
> > 
> > I don't think it's an error. Instead you get a very large delay, similar
> > to what would happen if you called udelay() before calibrate_delay()
> > anyway (see the comment in init/main.c above loops_per_jiffy).

Interesting, I didn't notice it was initialised to 4k, so yes I suppose you
could make use of some sort of delay. I don't think it's necessarily `very
large' though -- anything ticking at over ~400KHz with HZ=100 would give you
a smaller delay.

> Thanks, so I'd set up loops_per_jiffy early, along with lpj_fine in
> init_current_timer_delay().

That should work, providing you can get a sensible initial estimate for
loops_per_jiffy.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13  8:57             ` Will Deacon
@ 2012-07-13 10:48               ` Shilimkar, Santosh
  2012-07-13 11:13                 ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Shilimkar, Santosh @ 2012-07-13 10:48 UTC (permalink / raw)
  To: linux-arm-kernel

Wiil,

On Fri, Jul 13, 2012 at 2:27 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jul 13, 2012 at 03:16:41AM +0100, Shinya Kuribayashi wrote:
>> On 7/13/2012 1:40 AM, Stephen Boyd wrote:
>> >>>> As a result, actual udelay()s may be toooo long than expected, in
>> >>>> particular udelay()s used between init_current_timer_delay() and
>> >>>> calibrate_delay().  It's unlikely be short, as the frequency of a
>> >>>> counter for read_current_timer is typically slower than CPU frequency.
>> >>> Surely using udelay before calibrate_delay_loop has been called is a
>> >>> fundamental error?
>> >> Got it.  I'm just not confident about disallowing early use of udelay().
>> >>
>> >
>> > I don't think it's an error. Instead you get a very large delay, similar
>> > to what would happen if you called udelay() before calibrate_delay()
>> > anyway (see the comment in init/main.c above loops_per_jiffy).
>
> Interesting, I didn't notice it was initialised to 4k, so yes I suppose you
> could make use of some sort of delay. I don't think it's necessarily `very
> large' though -- anything ticking at over ~400KHz with HZ=100 would give you
> a smaller delay.
>
>> Thanks, so I'd set up loops_per_jiffy early, along with lpj_fine in
>> init_current_timer_delay().
>
> That should work, providing you can get a sensible initial estimate for
> loops_per_jiffy.
>
Do you have an updated version of the patch ?

Regards,
Santosh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 10:48               ` Shilimkar, Santosh
@ 2012-07-13 11:13                 ` Will Deacon
  2012-07-13 12:04                   ` Shilimkar, Santosh
  2012-07-17  3:10                   ` Shinya Kuribayashi
  0 siblings, 2 replies; 38+ messages in thread
From: Will Deacon @ 2012-07-13 11:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 11:48:25AM +0100, Shilimkar, Santosh wrote:
> Wiil,

Hi Saantosh :),

> On Fri, Jul 13, 2012 at 2:27 PM, Will Deacon <will.deacon@arm.com> wrote:
> > That should work, providing you can get a sensible initial estimate for
> > loops_per_jiffy.
> >
> Do you have an updated version of the patch ?

I was anticipating that the platform would set the initial loops_per_jiffy
value if it requires udelays before loop calibration and the default of 4k
is wildly off.

If people need loops_per_jiffy to be updated at the same time as lpj_fine,
I can post that as a separate patch (below) as Russell has merged v2 of these
patches into his delay branch. That said, I'd certainly like to know if this
is actually a real problem (and can't be solved by choosing a compromise value
as the initial loops_per_jiffy). I think Shinya was doing some tests so
I'll wait to see how those went.

Will

---8<---

diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index d6dacc6..395d5fb 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -59,6 +59,7 @@ void __init init_current_timer_delay(unsigned long freq)
 {
        pr_info("Switching to timer-based delay loop\n");
        lpj_fine                        = freq / HZ;
+       loops_per_jiffy                 = lpj_fine;
        arm_delay_ops.delay             = __timer_delay;
        arm_delay_ops.const_udelay      = __timer_const_udelay;
        arm_delay_ops.udelay            = __timer_udelay;

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 11:13                 ` Will Deacon
@ 2012-07-13 12:04                   ` Shilimkar, Santosh
  2012-07-13 12:08                     ` Will Deacon
  2012-07-17  3:10                   ` Shinya Kuribayashi
  1 sibling, 1 reply; 38+ messages in thread
From: Shilimkar, Santosh @ 2012-07-13 12:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 4:43 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jul 13, 2012 at 11:48:25AM +0100, Shilimkar, Santosh wrote:
>> Wiil,
>
> Hi Saantosh :),
>
>> On Fri, Jul 13, 2012 at 2:27 PM, Will Deacon <will.deacon@arm.com> wrote:
>> > That should work, providing you can get a sensible initial estimate for
>> > loops_per_jiffy.
>> >
>> Do you have an updated version of the patch ?
>
> I was anticipating that the platform would set the initial loops_per_jiffy
> value if it requires udelays before loop calibration and the default of 4k
> is wildly off.
>
> If people need loops_per_jiffy to be updated at the same time as lpj_fine,
> I can post that as a separate patch (below) as Russell has merged v2 of these
> patches into his delay branch. That said, I'd certainly like to know if this
> is actually a real problem (and can't be solved by choosing a compromise value
> as the initial loops_per_jiffy). I think Shinya was doing some tests so
> I'll wait to see how those went.
>
I just tried the suggested change + two patches.
I haven't gone through the complete discussion but quick test of your patches
on OMAP5, shows me, the BOGOMIPS is completely wrong. Looks like
you are not updating the  'loops_per_jiffy' in the calibration loop.

What Am I missing ?

Regards
santosh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 12:04                   ` Shilimkar, Santosh
@ 2012-07-13 12:08                     ` Will Deacon
  2012-07-13 12:14                       ` Shilimkar, Santosh
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-13 12:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 01:04:27PM +0100, Shilimkar, Santosh wrote:
> On Fri, Jul 13, 2012 at 4:43 PM, Will Deacon <will.deacon@arm.com> wrote:
> > I was anticipating that the platform would set the initial loops_per_jiffy
> > value if it requires udelays before loop calibration and the default of 4k
> > is wildly off.
> >
> > If people need loops_per_jiffy to be updated at the same time as lpj_fine,
> > I can post that as a separate patch (below) as Russell has merged v2 of these
> > patches into his delay branch. That said, I'd certainly like to know if this
> > is actually a real problem (and can't be solved by choosing a compromise value
> > as the initial loops_per_jiffy). I think Shinya was doing some tests so
> > I'll wait to see how those went.
> >
> I just tried the suggested change + two patches.
> I haven't gone through the complete discussion but quick test of your patches
> on OMAP5, shows me, the BOGOMIPS is completely wrong. Looks like
> you are not updating the  'loops_per_jiffy' in the calibration loop.

What is your bogomips reading and what frequency is your architected timer
ticking at? loops_per_jiffy is updated by calibrate_delay in
init/calibrate.c so we shouldn't need to worry about that (the diff I posted
in my last email is just for udelay calls between switching to the timer and
doing the calibration).

> What Am I missing ?

Well, your bogomips number will probably be *much* lower than before, so
don't be surprised by that.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 12:08                     ` Will Deacon
@ 2012-07-13 12:14                       ` Shilimkar, Santosh
  2012-07-13 12:23                         ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Shilimkar, Santosh @ 2012-07-13 12:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 5:38 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jul 13, 2012 at 01:04:27PM +0100, Shilimkar, Santosh wrote:
>> On Fri, Jul 13, 2012 at 4:43 PM, Will Deacon <will.deacon@arm.com> wrote:
>> > I was anticipating that the platform would set the initial loops_per_jiffy
>> > value if it requires udelays before loop calibration and the default of 4k
>> > is wildly off.
>> >
>> > If people need loops_per_jiffy to be updated at the same time as lpj_fine,
>> > I can post that as a separate patch (below) as Russell has merged v2 of these
>> > patches into his delay branch. That said, I'd certainly like to know if this
>> > is actually a real problem (and can't be solved by choosing a compromise value
>> > as the initial loops_per_jiffy). I think Shinya was doing some tests so
>> > I'll wait to see how those went.
>> >
>> I just tried the suggested change + two patches.
>> I haven't gone through the complete discussion but quick test of your patches
>> on OMAP5, shows me, the BOGOMIPS is completely wrong. Looks like
>> you are not updating the  'loops_per_jiffy' in the calibration loop.
>
> What is your bogomips reading and what frequency is your architected timer
> ticking at? loops_per_jiffy is updated by calibrate_delay in
> init/calibrate.c so we shouldn't need to worry about that (the diff I posted
> in my last email is just for udelay calls between switching to the timer and
> doing the calibration).
>
>> What Am I missing ?
>
> Well, your bogomips number will probably be *much* lower than before, so
> don't be surprised by that.
>
Indeed !!
This is what I observed. Can we not keep that behavior old way. BogoMIPS and
CPU clock is often very easy to co-related and helpful for CPUFREQ stats.

Regards
Santosh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 12:14                       ` Shilimkar, Santosh
@ 2012-07-13 12:23                         ` Will Deacon
  2012-07-13 12:28                           ` Shilimkar, Santosh
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-13 12:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 01:14:59PM +0100, Shilimkar, Santosh wrote:
> On Fri, Jul 13, 2012 at 5:38 PM, Will Deacon <will.deacon@arm.com> wrote:
> > Well, your bogomips number will probably be *much* lower than before, so
> > don't be surprised by that.
> >
> Indeed !!
> This is what I observed. Can we not keep that behavior old way. BogoMIPS and
> CPU clock is often very easy to co-related and helpful for CPUFREQ stats.

I defer to David Miller, who had a similar reaction when he did this for
sparc:

http://www.spinics.net/lists/sparclinux/msg08550.html

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 12:23                         ` Will Deacon
@ 2012-07-13 12:28                           ` Shilimkar, Santosh
  0 siblings, 0 replies; 38+ messages in thread
From: Shilimkar, Santosh @ 2012-07-13 12:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jul 13, 2012 at 5:53 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jul 13, 2012 at 01:14:59PM +0100, Shilimkar, Santosh wrote:
>> On Fri, Jul 13, 2012 at 5:38 PM, Will Deacon <will.deacon@arm.com> wrote:
>> > Well, your bogomips number will probably be *much* lower than before, so
>> > don't be surprised by that.
>> >
>> Indeed !!
>> This is what I observed. Can we not keep that behavior old way. BogoMIPS and
>> CPU clock is often very easy to co-related and helpful for CPUFREQ stats.
>
> I defer to David Miller, who had a similar reaction when he did this for
> sparc:
>
> http://www.spinics.net/lists/sparclinux/msg08550.html
>
Fair enough. Anybody who sees this BOGOMIPS change will
get similar reaction. I am fine with it now after knowing the
reason.
Thought was just to avoid similar question in future which you
might have to answer :-)

Regards
Santosh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-13 11:13                 ` Will Deacon
  2012-07-13 12:04                   ` Shilimkar, Santosh
@ 2012-07-17  3:10                   ` Shinya Kuribayashi
  2012-07-17  6:11                     ` Shilimkar, Santosh
  1 sibling, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-17  3:10 UTC (permalink / raw)
  To: linux-arm-kernel

Will, Stephen and Santosh,

On 7/13/2012 8:13 PM, Will Deacon wrote:
> I was anticipating that the platform would set the initial loops_per_jiffy
> value if it requires udelays before loop calibration and the default of 4k
> is wildly off.

I overlooked two different lpj setups were involved at hand.

First one was, the initial loops_per_jiffy value of 4k was too small for
almost all processors running Linux today, so I set up loops_per_jiffy
_early_, calculated from the CPU clock speed.  I didn't mentioned this
before, sorry for confusion.

So my initial loops_per_jiffy is not 4k at this point.  It's optimized
for loop-based delay with the CPU running at 1.2GHz (much bigger than
default 4k).

And later, init_current_timer_delay() got processed.  Actual udelay()
behavior switched from loop-based delay to timer-based one immediately,
while my loops_per_jiffy was not changed/updated to appropriate value.

This is why my udelay()s, used after init_current_timer_delay(), were
taking considerable long time to expire.   Note that my initial tests
for Will's patchset was done using a loadable module dedicated for
udelay tests, that was prepared for 2.6.35/3.0 kernels beforehand.

And this time, I confirmed that updating loops_per_jiffy at the same
time as lpj_fine, works perfectly as expected for me.

> If people need loops_per_jiffy to be updated at the same time as lpj_fine,
> I can post that as a separate patch (below) as Russell has merged v2 of these
> patches into his delay branch. That said, I'd certainly like to know if this
> is actually a real problem (and can't be solved by choosing a compromise value
> as the initial loops_per_jiffy). I think Shinya was doing some tests so
> I'll wait to see how those went.

>From my observations:

(1) loops_per_jiffy can easily be calculated from the CPU clock speed.
If your platform is capable of detecting CPU frequency at run-time,
settingi up loops_per_jiffy _early_ can allow you early use of udelay()s.

Or even if you don't need udelay() early, setting up lpj_fine (or having
calibrate_delay_is_known()) allows you to skip calibrate_delay() later.
This is useful and can be applied to both UP and SMP systems.

(2) For SMP platforms, if you need ealy use of udelay(), you have to
update loops_per_jiffy at the same time as init_current_timer_delay().
It could be done in init_current_timer_delay(), or platforms can take
care of that, that need udelay() available early.  Either one should be
fine with me.
--
Shinya Kuribayashi
Renesas Electronics

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-17  3:10                   ` Shinya Kuribayashi
@ 2012-07-17  6:11                     ` Shilimkar, Santosh
  2012-07-17  7:42                       ` Shinya Kuribayashi
  0 siblings, 1 reply; 38+ messages in thread
From: Shilimkar, Santosh @ 2012-07-17  6:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 17, 2012 at 8:40 AM, Shinya Kuribayashi
<shinya.kuribayashi.px@renesas.com> wrote:
> Will, Stephen and Santosh,
>
> On 7/13/2012 8:13 PM, Will Deacon wrote:
>> I was anticipating that the platform would set the initial loops_per_jiffy
>> value if it requires udelays before loop calibration and the default of 4k
>> is wildly off.
>
> I overlooked two different lpj setups were involved at hand.
>
> First one was, the initial loops_per_jiffy value of 4k was too small for
> almost all processors running Linux today, so I set up loops_per_jiffy
> _early_, calculated from the CPU clock speed.  I didn't mentioned this
> before, sorry for confusion.
>
> So my initial loops_per_jiffy is not 4k at this point.  It's optimized
> for loop-based delay with the CPU running at 1.2GHz (much bigger than
> default 4k).
>
> And later, init_current_timer_delay() got processed.  Actual udelay()
> behavior switched from loop-based delay to timer-based one immediately,
> while my loops_per_jiffy was not changed/updated to appropriate value.
>
> This is why my udelay()s, used after init_current_timer_delay(), were
> taking considerable long time to expire.   Note that my initial tests
> for Will's patchset was done using a loadable module dedicated for
> udelay tests, that was prepared for 2.6.35/3.0 kernels beforehand.
>
> And this time, I confirmed that updating loops_per_jiffy at the same
> time as lpj_fine, works perfectly as expected for me.
>
>> If people need loops_per_jiffy to be updated at the same time as lpj_fine,
>> I can post that as a separate patch (below) as Russell has merged v2 of these
>> patches into his delay branch. That said, I'd certainly like to know if this
>> is actually a real problem (and can't be solved by choosing a compromise value
>> as the initial loops_per_jiffy). I think Shinya was doing some tests so
>> I'll wait to see how those went.
>
> From my observations:
>
> (1) loops_per_jiffy can easily be calculated from the CPU clock speed.
> If your platform is capable of detecting CPU frequency at run-time,
> settingi up loops_per_jiffy _early_ can allow you early use of udelay()s.
>
> Or even if you don't need udelay() early, setting up lpj_fine (or having
> calibrate_delay_is_known()) allows you to skip calibrate_delay() later.
> This is useful and can be applied to both UP and SMP systems.
>
> (2) For SMP platforms, if you need ealy use of udelay(), you have to
> update loops_per_jiffy at the same time as init_current_timer_delay().
> It could be done in init_current_timer_delay(), or platforms can take
> care of that, that need udelay() available early.  Either one should be
> fine with me.

Thanks for the detailed explanation. CPU clock detection is indeed the
nit way to skip the calibration overhead and this was one of the comment
when I tried to push the skipping of calibration for secondary CPUs.

Looks like you have a working patch for the clock detection. Will
you able to post that patch so that this long pending calibration
for secondary CPUs gets optimized.

Regards
Santosh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-17  6:11                     ` Shilimkar, Santosh
@ 2012-07-17  7:42                       ` Shinya Kuribayashi
  2012-07-17  9:05                         ` Will Deacon
  2012-07-18 17:52                         ` Will Deacon
  0 siblings, 2 replies; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-17  7:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 7/17/2012 3:11 PM, Shilimkar, Santosh wrote:
> Thanks for the detailed explanation. CPU clock detection is indeed the
> nit way to skip the calibration overhead and this was one of the comment
> when I tried to push the skipping of calibration for secondary CPUs.
> 
> Looks like you have a working patch for the clock detection. Will
> you able to post that patch so that this long pending calibration
> for secondary CPUs gets optimized.

Something like this should work (not even build tested, can be applied
on top of Will's v2 patchset):

diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index e1030e1..736dcea 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -25,6 +25,8 @@
 #include <linux/module.h>
 #include <linux/timex.h>
 
+static unsigned long lpj_early;
+
 /*
  * Default to the loop-based delay implementation.
  */
@@ -59,8 +61,22 @@ void __init init_current_timer_delay(unsigned long freq)
 {
 	pr_info("Switching to timer-based delay loop\n");
 	lpj_fine			= freq / HZ;
+	lpj_early			= lpj_fine;
+	loops_per_jiffy			= lpj_fine;
 	arm_delay_ops.delay		= __timer_delay;
 	arm_delay_ops.const_udelay	= __timer_const_udelay;
 	arm_delay_ops.udelay		= __timer_udelay;
 }
+
+void __cpuinit calibrate_delay_early(unsigned long cpu_freq)
+{
+	lpj_early = (cpu_freq + HZ/2) / HZ;
+	loops_per_jiffy = lpj_early;
+	pr_info("Calibrating delay using CPU frequency.. %lu Hz\n", cpu_freq);
+}
+
+unsigned long __cpuinit calibrate_delay_is_known(void)
+{
+	return lpj_early; /* this function works for both UP/SMP cases */
+}
 #endif


And change your ->timer() func (called via time_init) to make use of it:

        unsigned long freq;

        /* For UP/SMP systems */
        freq = get_CPU_frequency();
        calibrate_delay_early(freq);

#ifdef CONFIG_SMP
        /* For SMP systems */
        freq = get_Timer_frequency();
        init_current_timer_delay(freq);
#endif

The way to detect CPU clock speed can vary depending on your systems,
so hard to be generalized.  In my case, I have full-blown clock tree
described for my SoCs, and the CPU clock speed can be easily obtained
by clk_get_rate("xxx").

Hope this helps!
-- 
Shinya Kuribayashi
Renesas Electronics

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-17  7:42                       ` Shinya Kuribayashi
@ 2012-07-17  9:05                         ` Will Deacon
  2012-07-19 12:43                           ` Shinya Kuribayashi
  2012-07-18 17:52                         ` Will Deacon
  1 sibling, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-17  9:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Shinya,

On Tue, Jul 17, 2012 at 08:42:36AM +0100, Shinya Kuribayashi wrote:
> Something like this should work (not even build tested, can be applied
> on top of Will's v2 patchset):

Ok, please see the HEAD of my delay branch as I already pushed an extra patch
there:

http://git.kernel.org/?p=linux/kernel/git/will/linux.git;a=commitdiff;h=aef33b43ac8664d5a3c4990e845946e7eb5aceb0

> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
> index e1030e1..736dcea 100644
> --- a/arch/arm/lib/delay.c
> +++ b/arch/arm/lib/delay.c
> @@ -25,6 +25,8 @@
>  #include <linux/module.h>
>  #include <linux/timex.h>
>  
> +static unsigned long lpj_early;
> +
>  /*
>   * Default to the loop-based delay implementation.
>   */
> @@ -59,8 +61,22 @@ void __init init_current_timer_delay(unsigned long freq)
>  {
>  	pr_info("Switching to timer-based delay loop\n");
>  	lpj_fine			= freq / HZ;
> +	lpj_early			= lpj_fine;
> +	loops_per_jiffy			= lpj_fine;
>  	arm_delay_ops.delay		= __timer_delay;
>  	arm_delay_ops.const_udelay	= __timer_const_udelay;
>  	arm_delay_ops.udelay		= __timer_udelay;
>  }
> +
> +void __cpuinit calibrate_delay_early(unsigned long cpu_freq)
> +{
> +	lpj_early = (cpu_freq + HZ/2) / HZ;
> +	loops_per_jiffy = lpj_early;
> +	pr_info("Calibrating delay using CPU frequency.. %lu Hz\n", cpu_freq);
> +}
> +
> +unsigned long __cpuinit calibrate_delay_is_known(void)
> +{
> +	return lpj_early; /* this function works for both UP/SMP cases */
> +}
>  #endif

I assume the reason you've done it like this is because your timer isn't up
and running until after the delay loop has been calibrated? In which case,
I'd really rather not duplicate the calibration here -- is there no way you
can ensure the timer is available earlier? If not, then I'm not sure we
should be using it as the delay source. I also think you need to consider:

  1. Can your timer change frequency at runtime? [due to PM etc]
  2.
     i. Is its clock domain guaranteed to tick as long as the CPU is up?
    ii. If it's in a separate power domain to the CPU, is that ever shut off
        while the CPU is up?

If the answer to any of those is `yes', then I also think it's questionable
whether it's worthwhile using it for delay.

Updating loops_per_jiffy from init_current_timer_delay is reasonable if
people rely on these delays prior to calibration and there isn't a compromise
value for lpj, but all this _early stuff is really horrible. Just make the
thing tick before calibration occurs (we don't care about interrupts).

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-17  7:42                       ` Shinya Kuribayashi
  2012-07-17  9:05                         ` Will Deacon
@ 2012-07-18 17:52                         ` Will Deacon
  2012-07-19 15:19                           ` Jonathan Austin
  1 sibling, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-18 17:52 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tue, Jul 17, 2012 at 08:42:36AM +0100, Shinya Kuribayashi wrote:
> On 7/17/2012 3:11 PM, Shilimkar, Santosh wrote:
> > Looks like you have a working patch for the clock detection. Will
> > you able to post that patch so that this long pending calibration
> > for secondary CPUs gets optimized.
> 
> Something like this should work (not even build tested, can be applied
> on top of Will's v2 patchset):

[...]

> And change your ->timer() func (called via time_init) to make use of it:
> 
>         unsigned long freq;
> 
>         /* For UP/SMP systems */
>         freq = get_CPU_frequency();
>         calibrate_delay_early(freq);
> 
> #ifdef CONFIG_SMP
>         /* For SMP systems */
>         freq = get_Timer_frequency();
>         init_current_timer_delay(freq);
> #endif

Since this seems to be gaining some traction on platforms without the
architected timers, Jonny and I have put together a simple registration
mechanism for the delay timer to avoid people calling init_current_timer_delay
(and defining the global read_current_timer symbol).

The patch is sitting in my delay branch, but I've also included it below.
Please give it a go and let us know what you think.

Cheers,

Will

---8<---

>From 8ea28b6f7acad17b9952f8eb533f2887504b462d Mon Sep 17 00:00:00 2001
From: Jonathan Austin <Jonathan.Austin@arm.com>
Date: Wed, 18 Jul 2012 15:24:53 +0100
Subject: [PATCH] ARM: delay: add registration mechanism for delay timer sources

The current timer-based delay loop relies on the architected timer to
initiate the switch away from the polling-based implementation. This is
unfortunate for platforms without the architected timers but with a
suitable delay source (that is, constant frequency, always powered-up
and ticking as long as the CPUs are online).

This patch introduces a registration mechanism for the delay timer
(which provides an unconditional read_current_timer implementation) and
updates the architected timer code to use the new interface.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/arch_timer.h |    1 -
 arch/arm/include/asm/delay.h      |    9 +++++++++
 arch/arm/include/asm/timex.h      |    6 ------
 arch/arm/kernel/arch_timer.c      |   12 +++++++-----
 arch/arm/lib/delay.c              |   30 +++++++++++++++++++++---------
 5 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h
index 62e7547..88401c2 100644
--- a/arch/arm/include/asm/arch_timer.h
+++ b/arch/arm/include/asm/arch_timer.h
@@ -4,7 +4,6 @@
 #include <asm/errno.h>
 
 #ifdef CONFIG_ARM_ARCH_TIMER
-#define ARCH_HAS_READ_CURRENT_TIMER
 int arch_timer_of_register(void);
 int arch_timer_sched_clock_init(void);
 #else
diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index dc61451..17deac7 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -15,6 +15,11 @@
 
 #ifndef __ASSEMBLY__
 
+struct delay_timer {
+	int (*read_current_timer)(unsigned long *timer_val);
+	unsigned long freq;
+};
+
 extern struct arm_delay_ops {
 	void (*delay)(unsigned long);
 	void (*const_udelay)(unsigned long);
@@ -56,6 +61,10 @@ extern void __loop_delay(unsigned long loops);
 extern void __loop_udelay(unsigned long usecs);
 extern void __loop_const_udelay(unsigned long);
 
+/* Delay-loop timer registration. */
+#define ARCH_HAS_READ_CURRENT_TIMER
+extern void register_current_timer_delay(struct delay_timer *timer);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* defined(_ARM_DELAY_H) */
diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
index ce11944..9acc135 100644
--- a/arch/arm/include/asm/timex.h
+++ b/arch/arm/include/asm/timex.h
@@ -12,15 +12,9 @@
 #ifndef _ASMARM_TIMEX_H
 #define _ASMARM_TIMEX_H
 
-#include <asm/arch_timer.h>
 #include <mach/timex.h>
 
 typedef unsigned long cycles_t;
-
-#ifdef ARCH_HAS_READ_CURRENT_TIMER
 #define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
-#else
-#define get_cycles()	(0)
-#endif
 
 #endif
diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
index 675cee0..b624523 100644
--- a/arch/arm/kernel/arch_timer.c
+++ b/arch/arm/kernel/arch_timer.c
@@ -21,6 +21,7 @@
 #include <linux/io.h>
 
 #include <asm/cputype.h>
+#include <asm/delay.h>
 #include <asm/localtimer.h>
 #include <asm/arch_timer.h>
 #include <asm/system_info.h>
@@ -31,8 +32,7 @@ static int arch_timer_ppi;
 static int arch_timer_ppi2;
 
 static struct clock_event_device __percpu **arch_timer_evt;
-
-extern void init_current_timer_delay(unsigned long freq);
+static struct delay_timer arch_delay_timer;
 
 /*
  * Architected system timer support.
@@ -225,7 +225,7 @@ static cycle_t arch_counter_read(struct clocksource *cs)
 	return arch_counter_get_cntpct();
 }
 
-int read_current_timer(unsigned long *timer_val)
+static int arch_timer_read_current_timer(unsigned long *timer_val)
 {
 	if (!arch_timer_rate)
 		return -ENXIO;
@@ -302,11 +302,13 @@ static int __init arch_timer_register(void)
 		arch_timer_global_evt.cpumask = cpumask_of(0);
 		err = arch_timer_setup(&arch_timer_global_evt);
 	}
-
 	if (err)
 		goto out_free_irq;
 
-	init_current_timer_delay(arch_timer_rate);
+	/* Use the architected timer for the delay loop. */
+	arch_delay_timer.read_current_timer = &arch_timer_read_current_timer;
+	arch_delay_timer.freq = arch_timer_rate;
+	register_current_timer_delay(&arch_delay_timer);
 	return 0;
 
 out_free_irq:
diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index 395d5fb..1b51570 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -34,7 +34,14 @@ struct arm_delay_ops arm_delay_ops = {
 	.udelay		= __loop_udelay,
 };
 
-#ifdef ARCH_HAS_READ_CURRENT_TIMER
+static struct delay_timer *delay_timer;
+static bool delay_calibrated;
+
+int read_current_timer(unsigned long *timer_val)
+{
+	return delay_timer ? delay_timer->read_current_timer(timer_val) : -ENXIO;
+}
+
 static void __timer_delay(unsigned long cycles)
 {
 	cycles_t start = get_cycles();
@@ -55,18 +62,23 @@ static void __timer_udelay(unsigned long usecs)
 	__timer_const_udelay(usecs * UDELAY_MULT);
 }
 
-void __init init_current_timer_delay(unsigned long freq)
+void __init register_current_timer_delay(struct delay_timer *timer)
 {
-	pr_info("Switching to timer-based delay loop\n");
-	lpj_fine			= freq / HZ;
-	loops_per_jiffy			= lpj_fine;
-	arm_delay_ops.delay		= __timer_delay;
-	arm_delay_ops.const_udelay	= __timer_const_udelay;
-	arm_delay_ops.udelay		= __timer_udelay;
+	if (!delay_calibrated) {
+		pr_info("Switching to timer-based delay loop\n");
+		delay_timer			= timer;
+		lpj_fine			= timer->freq / HZ;
+		loops_per_jiffy			= lpj_fine;
+		arm_delay_ops.delay		= __timer_delay;
+		arm_delay_ops.const_udelay	= __timer_const_udelay;
+		arm_delay_ops.udelay		= __timer_udelay;
+	} else {
+		pr_info("Ignoring late registration of read_current_timer delay\n");
+	}
 }
 
 unsigned long __cpuinit calibrate_delay_is_known(void)
 {
+	delay_calibrated = true;
 	return lpj_fine;
 }
-#endif
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-17  9:05                         ` Will Deacon
@ 2012-07-19 12:43                           ` Shinya Kuribayashi
  0 siblings, 0 replies; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-19 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

Have not gone through your new mail yet, sorry..

On 7/17/2012 6:05 PM, Will Deacon wrote:
> I assume the reason you've done it like this is because your timer isn't up
> and running until after the delay loop has been calibrated? In which case,
> I'd really rather not duplicate the calibration here -- is there no way you
> can ensure the timer is available earlier? If not, then I'm not sure we
> should be using it as the delay source.

Of course not :-)  My timers get started and running from time_init()
point, before calibrate_delay().

And don't get me wrong, I just tried to provide a working example to
skip calibrate_delay() in response to Santosh request:

> Thanks for the detailed explanation. CPU clock detection is indeed the
> nit way to skip the calibration overhead and this was one of the comment
> when I tried to push the skipping of calibration for secondary CPUs.
> 
> Looks like you have a working patch for the clock detection. Will
> you able to post that patch so that this long pending calibration
> for secondary CPUs gets optimized.

... with keeping the following points in mind:

1. skip the calibration for secondary CPUs (that is, for SMP use)

2. can work without init_current_timer_delay() help; calculating lpj
   from the CPU clock speed, not from current timer

If I understand Santosh request correctly, he requested a patch that
could be tested on non-A15 SMP systems (may be A9).  We know that A15
comes with the architected timer and does not require any additional
patches apart from already provided ones.  Also he said he gave a try
to "the suggested change + two patches" and confirmed it worked.

As for lpj_early variable, I just wanted to reserve lpj_fine for real
"timer" use, and did not want to mix with CPU frequency thing.  That
being said, I have to admit that rewriting using lpj_fine looks much
simpler, and should have done so from the beginning:

diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index 395d5fb..b9874a3 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -65,6 +65,13 @@ void __init init_current_timer_delay(unsigned long freq)
 	arm_delay_ops.udelay		= __timer_udelay;
 }
 
+void __init calibrate_delay_early(unsigned long rate)
+{
+	pr_info("Calibrating delay using CPU frequency.. %lu Hz\n", rate);
+	lpj_fine = (rate + (HZ/2)) / HZ;
+	loops_per_jiffy = lpj_fine;
+}
+
 unsigned long __cpuinit calibrate_delay_is_known(void)
 {
 	return lpj_fine;
_

Anyway, my goal was to make calibrate_delay_is_known() work to skip
the calibration for secondary CPUs, whether init_current_timer_delay()
was involved or not.

> I also think you need to consider:
> 
>   1. Can your timer change frequency at runtime? [due to PM etc]
>   2.
>      i. Is its clock domain guaranteed to tick as long as the CPU is up?
>     ii. If it's in a separate power domain to the CPU, is that ever shut off
>         while the CPU is up?
> 
> If the answer to any of those is `yes', then I also think it's questionable
> whether it's worthwhile using it for delay.

Good point.  They all strongly suggest a need for current timer help.

At the same time, however, I wonder is it possible for runtime PM to
change frequency, or stop clock supply to the CPU core, or shut-down
its power domain _unnoticed_ by the CPU core? I do not think so.  It
may be easy from the hardware perspective, but it should not from the
software standpoint.

One thing we have to be careful when make use of the CPU clock speed
to skip the calibration on secondary CPUs is, the clock speed of each
CPU core when it enters / leaves from suspend.  And as long as it is
_predictable_, current calibrate_delay() & calibrate_delay_is_known()
infrastructure we have is going to work great.  On secondary CPUs,
loops_per_jiffy gets set up using calibrate_delay_is_known() at the
first calibration time, then stored to per_cpu(cpu_loops_per_jiffy).
Afterward, per_cpu(cpu_loops_per_jiffy) will be used.  If this works
for Santosh / OMAPs, it woudl be best, the way to go.

On the other hand, if it's unpredictable, we need yet another hacks
to 1) skip the calibration for secondary CPUs, and 2) the way to load
an appropriate value to global loops_per_jiffy, every time each CPU
gets started.

[ And at the point udelay() relies on the global loops_per_jiffy,
  this scenario doesn't work.  Primary reason to use current timer! ]

> Updating loops_per_jiffy from init_current_timer_delay is reasonable if
> people rely on these delays prior to calibration and there isn't a compromise
> value for lpj, but all this _early stuff is really horrible. Just make the
> thing tick before calibration occurs (we don't care about interrupts).

Lastly, would like to update examples of use cases.

/* For UP systems, or SMP systems without dynamic CPU freq scaling */
your_timer_init(void)
{
        unsigned long rate;

        rate = get_CPU_frequency();
        calibrate_delay_early(rate);
}

After calibrate_delay() is processed, loops_per_jiffy is supposed be
under the control of cpufreq, if required.

/* For SMP systems with dynamic CPU freq scaling */
your_timer_init(void)
{
        unsigned long rate;

        rate = get_Timer_frequency();
        init_current_timer_delay(rate);
}

You don't have to use calibrate_delay_early() in this case, of course.
My privious example was not clear on this point.

--
Shinya Kuribayashi
Renesas Electronics

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-18 17:52                         ` Will Deacon
@ 2012-07-19 15:19                           ` Jonathan Austin
  2012-07-20 10:17                             ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Jonathan Austin @ 2012-07-19 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

On 18/07/12 18:52, Will Deacon wrote:

> Hello,
> On Tue, Jul 17, 2012 at 08:42:36AM +0100, Shinya Kuribayashi wrote:
>> On 7/17/2012 3:11 PM, Shilimkar, Santosh wrote:
>>> Looks like you have a working patch for the clock detection. Will
>>> you able to post that patch so that this long pending calibration
>>> for secondary CPUs gets optimized.
>>
>> Something like this should work (not even build tested, can be applied
>> on top of Will's v2 patchset):
> 
> [...]
> 
>> And change your ->timer() func (called via time_init) to make use of it:
>>
>>         unsigned long freq;
>>
>>         /* For UP/SMP systems */
>>         freq = get_CPU_frequency();
>>         calibrate_delay_early(freq);
>>
>> #ifdef CONFIG_SMP
>>         /* For SMP systems */
>>         freq = get_Timer_frequency();
>>         init_current_timer_delay(freq);
>> #endif
> 
> Since this seems to be gaining some traction on platforms without the
> architected timers, Jonny and I have put together a simple registration
> mechanism for the delay timer to avoid people calling init_current_timer_delay
> (and defining the global read_current_timer symbol).


We should probably also ignore and additional registration calls, not just
those made after the delay loop has been calibrated...

Something like the patch below should do the trick.

Jonny

-----8<------

diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index 1b51570..fff305b 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -72,8 +72,9 @@ void __init register_current_timer_delay(struct delay_timer *timer)
                arm_delay_ops.delay             = __timer_delay;
                arm_delay_ops.const_udelay      = __timer_const_udelay;
                arm_delay_ops.udelay            = __timer_udelay;
+               delay_calibrated                = true;
        } else {
-               pr_info("Ignoring late registration of read_current_timer delay\n");
+               pr_info("Ignoring duplicate/late registration of read_current_timer delay\n");
        }
 }
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-19 15:19                           ` Jonathan Austin
@ 2012-07-20 10:17                             ` Will Deacon
  2012-07-24  9:06                               ` Shinya Kuribayashi
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-07-20 10:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 19, 2012 at 04:19:52PM +0100, Jonathan Austin wrote:
> We should probably also ignore and additional registration calls, not just
> those made after the delay loop has been calibrated...
> 
> Something like the patch below should do the trick.

[...]

Cheers, pushed into my delay branch. Note that I don't plan to queue these
patches until somebody decides they want a delay source other than the
architected timers in mainline.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-20 10:17                             ` Will Deacon
@ 2012-07-24  9:06                               ` Shinya Kuribayashi
  2012-07-24  9:15                                 ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Shinya Kuribayashi @ 2012-07-24  9:06 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will, Jonathan,

On 7/20/2012 7:17 PM, Will Deacon wrote:
> On Thu, Jul 19, 2012 at 04:19:52PM +0100, Jonathan Austin wrote:
>> We should probably also ignore and additional registration calls, not just
>> those made after the delay loop has been calibrated...
>>
>> Something like the patch below should do the trick.
> 
> [...]
> 
> Cheers, pushed into my delay branch.

Thanks for the registration machanism.  It looks nice, and works for
me on a Cortex-A9 SMP platform with a non-arch_timer counter.

I've picked up a revised version of the patch from will/delay and
merged it into our local development branch.  We'll give it a try
in different conditions, UP and SMP, with/without DVFS and so on,
and let you know if I find anything.

-- 
Shinya Kuribayashi
Renesas Electronics

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-07-24  9:06                               ` Shinya Kuribayashi
@ 2012-07-24  9:15                                 ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-07-24  9:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jul 24, 2012 at 10:06:06AM +0100, Shinya Kuribayashi wrote:
> Hi Will, Jonathan,
> 
> On 7/20/2012 7:17 PM, Will Deacon wrote:
> > On Thu, Jul 19, 2012 at 04:19:52PM +0100, Jonathan Austin wrote:
> >> We should probably also ignore and additional registration calls, not just
> >> those made after the delay loop has been calibrated...
> >>
> >> Something like the patch below should do the trick.
> > 
> > [...]
> > 
> > Cheers, pushed into my delay branch.
> 
> Thanks for the registration machanism.  It looks nice, and works for
> me on a Cortex-A9 SMP platform with a non-arch_timer counter.

That's good to hear, thanks.

> I've picked up a revised version of the patch from will/delay and
> merged it into our local development branch.  We'll give it a try
> in different conditions, UP and SMP, with/without DVFS and so on,
> and let you know if I find anything.

Great! If everything looks good, and you're interested in using this in
mainline, I could push the registration for 3.7. I just don't want to see it
there if nobody's using it.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2012-07-24  9:15 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-29 17:33 [PATCH v2 0/2] Use architected timers for delay loop Will Deacon
2012-06-29 17:33 ` [PATCH v2 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
2012-07-02 19:14   ` Stephen Boyd
2012-07-05 12:35   ` Shinya Kuribayashi
2012-07-05 12:59     ` Will Deacon
2012-06-29 17:33 ` [PATCH v2 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
2012-07-02 19:14   ` Stephen Boyd
2012-07-02 21:53     ` Will Deacon
2012-07-03 12:09   ` Shinya Kuribayashi
2012-07-04 15:36     ` Will Deacon
2012-07-05 12:12       ` Shinya Kuribayashi
2012-07-05 12:56         ` Will Deacon
2012-07-05 16:51           ` Stephen Boyd
2012-07-05 13:06   ` Shinya Kuribayashi
2012-07-05 14:15     ` Will Deacon
2012-07-12  7:33   ` Shinya Kuribayashi
2012-07-12  8:44     ` Will Deacon
2012-07-12  9:35       ` Shinya Kuribayashi
2012-07-12 16:40         ` Stephen Boyd
2012-07-13  2:16           ` Shinya Kuribayashi
2012-07-13  8:57             ` Will Deacon
2012-07-13 10:48               ` Shilimkar, Santosh
2012-07-13 11:13                 ` Will Deacon
2012-07-13 12:04                   ` Shilimkar, Santosh
2012-07-13 12:08                     ` Will Deacon
2012-07-13 12:14                       ` Shilimkar, Santosh
2012-07-13 12:23                         ` Will Deacon
2012-07-13 12:28                           ` Shilimkar, Santosh
2012-07-17  3:10                   ` Shinya Kuribayashi
2012-07-17  6:11                     ` Shilimkar, Santosh
2012-07-17  7:42                       ` Shinya Kuribayashi
2012-07-17  9:05                         ` Will Deacon
2012-07-19 12:43                           ` Shinya Kuribayashi
2012-07-18 17:52                         ` Will Deacon
2012-07-19 15:19                           ` Jonathan Austin
2012-07-20 10:17                             ` Will Deacon
2012-07-24  9:06                               ` Shinya Kuribayashi
2012-07-24  9:15                                 ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.