All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Use architected timers for delay loop
@ 2012-06-22 15:09 Will Deacon
  2012-06-22 15:09 ` [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Will Deacon @ 2012-06-22 15:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

This patch series reworks our delay loop implementation to allow the
busy-loop to be replaced with the physical counter included as part of
the architected timer on recent CPUs. The polling implementation is
still maintained to support cores without the timers and we use the same
maths and upper delay bound of 2ms.

Russell - I think my maths is correct, but why can't we increase the
          precision by using a magic constant of 4398046U and a right
	  shift of 31?

As usual, all feedback is welcome,

Will


Will Deacon (2):
  ARM: arch timer: implement read_current_timer and get_cycles
  ARM: delay: allow timer-based delay implementation to be selected

 arch/arm/include/asm/arch_timer.h      |    1 +
 arch/arm/include/asm/delay.h           |   19 +++++++--
 arch/arm/include/asm/timex.h           |   10 +++--
 arch/arm/kernel/arch_timer.c           |   11 +++++
 arch/arm/kernel/armksyms.c             |    3 +-
 arch/arm/lib/Makefile                  |    2 +-
 arch/arm/lib/{delay.S => delay-loop.S} |   16 +++----
 arch/arm/lib/delay.c                   |   66 ++++++++++++++++++++++++++++++++
 arch/arm/mach-sa1100/sleep.S           |    8 ++--
 9 files changed, 112 insertions(+), 24 deletions(-)
 rename arch/arm/lib/{delay.S => delay-loop.S} (85%)
 create mode 100644 arch/arm/lib/delay.c

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-22 15:09 [PATCH 0/2] Use architected timers for delay loop Will Deacon
@ 2012-06-22 15:09 ` Will Deacon
  2012-06-25 21:39   ` Stephen Boyd
  2012-06-22 15:09 ` [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2012-06-22 15:09 UTC (permalink / raw)
  To: linux-arm-kernel

This patch implements read_current_timer using the architected timers
when they are selected via CONFIG_ARM_ARCH_TIMER. If they are detected
not to be usable at runtime, we return -ENXIO to the caller.

Furthermore, if read_current_timer is exported then we can implement
get_cycles in terms of it for use as both an entropy source and for
implementing __udelay and friends.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/arch_timer.h |    1 +
 arch/arm/include/asm/timex.h      |   10 ++++++----
 arch/arm/kernel/arch_timer.c      |    8 ++++++++
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h
index ed2e95d..7b0fe68 100644
--- a/arch/arm/include/asm/arch_timer.h
+++ b/arch/arm/include/asm/arch_timer.h
@@ -2,6 +2,7 @@
 #define __ASMARM_ARCH_TIMER_H
 
 #ifdef CONFIG_ARM_ARCH_TIMER
+#define ARCH_HAS_READ_CURRENT_TIMER
 int arch_timer_of_register(void);
 int arch_timer_sched_clock_init(void);
 #else
diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
index 3be8de3..ce11944 100644
--- a/arch/arm/include/asm/timex.h
+++ b/arch/arm/include/asm/timex.h
@@ -12,13 +12,15 @@
 #ifndef _ASMARM_TIMEX_H
 #define _ASMARM_TIMEX_H
 
+#include <asm/arch_timer.h>
 #include <mach/timex.h>
 
 typedef unsigned long cycles_t;
 
-static inline cycles_t get_cycles (void)
-{
-	return 0;
-}
+#ifdef ARCH_HAS_READ_CURRENT_TIMER
+#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
+#else
+#define get_cycles()	(0)
+#endif
 
 #endif
diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
index dd58035..dbbeec4 100644
--- a/arch/arm/kernel/arch_timer.c
+++ b/arch/arm/kernel/arch_timer.c
@@ -223,6 +223,14 @@ static cycle_t arch_counter_read(struct clocksource *cs)
 	return arch_counter_get_cntpct();
 }
 
+int read_current_timer(unsigned long *timer_val)
+{
+	if (!arch_timer_rate)
+		return -ENXIO;
+	*timer_val = arch_counter_get_cntpct();
+	return 0;
+}
+
 static struct clocksource clocksource_counter = {
 	.name	= "arch_sys_counter",
 	.rating	= 400,
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-22 15:09 [PATCH 0/2] Use architected timers for delay loop Will Deacon
  2012-06-22 15:09 ` [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
@ 2012-06-22 15:09 ` Will Deacon
  2012-06-25 21:39   ` Stephen Boyd
  2012-06-22 22:26 ` [PATCH 0/2] Use architected timers for delay loop Russell King - ARM Linux
  2012-06-25 21:38 ` Stephen Boyd
  3 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2012-06-22 15:09 UTC (permalink / raw)
  To: linux-arm-kernel

This patch allows a timer-based delay implementation to be selected by
switching the delay routines over to use get_cycles, which is
implemented in terms of read_current_timer. This further allows us to
skip the loop calibration and have a consistent delay function in the
face of core frequency scaling.

To avoid the pain of dealing with memory-mapped counters, this
implementation uses the co-processor interface to the architected timers
when they are available. The previous loop-based implementation is
kept around for CPUs without the architected timers and we retain both
the maximum delay (2ms) and the corresponding conversion factors for
determining the number of loops required for a given interval. Since the
indirection of the timer routines will only work when called from C,
the sa1100 sleep routines are modified to branch to the loop-based delay
functions directly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/delay.h           |   19 +++++++--
 arch/arm/kernel/arch_timer.c           |    3 +
 arch/arm/kernel/armksyms.c             |    3 +-
 arch/arm/lib/Makefile                  |    2 +-
 arch/arm/lib/{delay.S => delay-loop.S} |   16 +++----
 arch/arm/lib/delay.c                   |   66 ++++++++++++++++++++++++++++++++
 arch/arm/mach-sa1100/sleep.S           |    8 ++--
 7 files changed, 97 insertions(+), 20 deletions(-)
 rename arch/arm/lib/{delay.S => delay-loop.S} (85%)
 create mode 100644 arch/arm/lib/delay.c

diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index b2deda1..c6ab91a 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -8,7 +8,13 @@
 
 #include <asm/param.h>	/* HZ */
 
-extern void __delay(int loops);
+extern struct arm_delay_ops {
+	void (*delay)(unsigned long);
+	void (*const_udelay)(unsigned long);
+	void (*udelay)(unsigned long);
+} arm_delay_ops;
+
+#define __delay(n)		arm_delay_ops.delay(n)
 
 /*
  * This function intentionally does not exist; if you see references to
@@ -23,14 +29,14 @@ extern void __bad_udelay(void);
  * division by multiplication: you don't have to worry about
  * loss of precision.
  *
- * Use only for very small delays ( < 1 msec).  Should probably use a
+ * Use only for very small delays ( < 2 msec).  Should probably use a
  * lookup table, really, as the multiplications take much too long with
  * short delays.  This is a "reasonable" implementation, though (and the
  * first constant multiplications gets optimized away if the delay is
  * a constant)
  */
-extern void __udelay(unsigned long usecs);
-extern void __const_udelay(unsigned long);
+#define __udelay(n)		arm_delay_ops.udelay(n)
+#define __const_udelay(n)	arm_delay_ops.const_udelay(n)
 
 #define MAX_UDELAY_MS 2
 
@@ -40,5 +46,10 @@ extern void __const_udelay(unsigned long);
 			__const_udelay((n) * ((2199023U*HZ)>>11))) :	\
 	  __udelay(n))
 
+/* Loop-based definitions for assembly code. */
+extern void __loop_delay(unsigned long loops);
+extern void __loop_udelay(unsigned long usecs);
+extern void __loop_const_udelay(unsigned long);
+
 #endif /* defined(_ARM_DELAY_H) */
 
diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
index dbbeec4..675cee0 100644
--- a/arch/arm/kernel/arch_timer.c
+++ b/arch/arm/kernel/arch_timer.c
@@ -32,6 +32,8 @@ static int arch_timer_ppi2;
 
 static struct clock_event_device __percpu **arch_timer_evt;
 
+extern void init_current_timer_delay(unsigned long freq);
+
 /*
  * Architected system timer support.
  */
@@ -304,6 +306,7 @@ static int __init arch_timer_register(void)
 	if (err)
 		goto out_free_irq;
 
+	init_current_timer_delay(arch_timer_rate);
 	return 0;
 
 out_free_irq:
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index b57c75e..7196228 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -49,8 +49,7 @@ extern void __aeabi_ulcmp(void);
 extern void fpundefinstr(void);
 
 	/* platform dependent support */
-EXPORT_SYMBOL(__udelay);
-EXPORT_SYMBOL(__const_udelay);
+EXPORT_SYMBOL(arm_delay_ops);
 
 	/* networking */
 EXPORT_SYMBOL(csum_partial);
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 992769a..b621114 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -6,7 +6,7 @@
 
 lib-y		:= backtrace.o changebit.o csumipv6.o csumpartial.o   \
 		   csumpartialcopy.o csumpartialcopyuser.o clearbit.o \
-		   delay.o findbit.o memchr.o memcpy.o		      \
+		   delay.o delay-loop.o findbit.o memchr.o memcpy.o   \
 		   memmove.o memset.o memzero.o setbit.o              \
 		   strncpy_from_user.o strnlen_user.o                 \
 		   strchr.o strrchr.o                                 \
diff --git a/arch/arm/lib/delay.S b/arch/arm/lib/delay-loop.S
similarity index 85%
rename from arch/arm/lib/delay.S
rename to arch/arm/lib/delay-loop.S
index 3c9a05c..1428a8f 100644
--- a/arch/arm/lib/delay.S
+++ b/arch/arm/lib/delay-loop.S
@@ -21,10 +21,10 @@
  * HZ  <= 1000
  */
 
-ENTRY(__udelay)
+ENTRY(__loop_udelay)
 		ldr	r2, .LC1
 		mul	r0, r2, r0
-ENTRY(__const_udelay)				@ 0 <= r0 <= 0x7fffff06
+ENTRY(__loop_const_udelay)			@ 0 <= r0 <= 0x7fffff06
 		mov	r1, #-1
 		ldr	r2, .LC0
 		ldr	r2, [r2]		@ max = 0x01ffffff
@@ -39,12 +39,10 @@ ENTRY(__const_udelay)				@ 0 <= r0 <= 0x7fffff06
 
 /*
  * loops = r0 * HZ * loops_per_jiffy / 1000000
- *
- * Oh, if only we had a cycle counter...
  */
 
 @ Delay routine
-ENTRY(__delay)
+ENTRY(__loop_delay)
 		subs	r0, r0, #1
 #if 0
 		movls	pc, lr
@@ -62,8 +60,8 @@ ENTRY(__delay)
 		movls	pc, lr
 		subs	r0, r0, #1
 #endif
-		bhi	__delay
+		bhi	__loop_delay
 		mov	pc, lr
-ENDPROC(__udelay)
-ENDPROC(__const_udelay)
-ENDPROC(__delay)
+ENDPROC(__loop_udelay)
+ENDPROC(__loop_const_udelay)
+ENDPROC(__loop_delay)
diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
new file mode 100644
index 0000000..2b590c3
--- /dev/null
+++ b/arch/arm/lib/delay.c
@@ -0,0 +1,66 @@
+/*
+ * Delay loops based on the OpenRISC implementation.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/timex.h>
+
+/*
+ * Default to the loop-based delay implementation.
+ */
+struct arm_delay_ops arm_delay_ops = {
+	.delay		= __loop_delay,
+	.const_udelay	= __loop_const_udelay,
+	.udelay		= __loop_udelay,
+};
+
+#ifdef ARCH_HAS_READ_CURRENT_TIMER
+static void __timer_delay(unsigned long cycles)
+{
+	cycles_t start = get_cycles();
+
+	while ((get_cycles() - start) < cycles)
+		cpu_relax();
+}
+
+static void __timer_const_udelay(unsigned long xloops)
+{
+	unsigned long long loops = xloops;
+	loops *= loops_per_jiffy;
+	__timer_delay(loops >> 30);
+}
+
+static void __timer_udelay(unsigned long usecs)
+{
+	__timer_const_udelay(usecs * ((2199023U * HZ) >> 11));
+}
+
+void __init init_current_timer_delay(unsigned long freq)
+{
+	pr_info("Switching to timer-based delay loop\n");
+	lpj_fine			= freq / HZ;
+	arm_delay_ops.delay		= __timer_delay;
+	arm_delay_ops.const_udelay	= __timer_const_udelay;
+	arm_delay_ops.udelay		= __timer_udelay;
+}
+#endif
diff --git a/arch/arm/mach-sa1100/sleep.S b/arch/arm/mach-sa1100/sleep.S
index 30cc672..8586374 100644
--- a/arch/arm/mach-sa1100/sleep.S
+++ b/arch/arm/mach-sa1100/sleep.S
@@ -38,9 +38,9 @@ ENTRY(sa1100_finish_suspend)
 	orr     r4, r4, #MDREFR_K1DB2
 	ldr	r5, =PPCR
 
-	@ Pre-load __udelay into the I-cache
+	@ Pre-load __loop_udelay into the I-cache
 	mov	r0, #1
-	bl	__udelay
+	bl	__loop_udelay
 	mov	r0, r0
 
 	@ The following must all exist in a single cache line to
@@ -53,11 +53,11 @@ ENTRY(sa1100_finish_suspend)
 	@ delay 90us and set CPU PLL to lowest speed
 	@ fixes resume problem on high speed SA1110
 	mov	r0, #90
-	bl	__udelay
+	bl	__loop_udelay
 	mov	r1, #0
 	str	r1, [r5]
 	mov	r0, #90
-	bl	__udelay
+	bl	__loop_udelay
 
 	/*
 	 * SA1110 SDRAM controller workaround.  register values:
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 0/2] Use architected timers for delay loop
  2012-06-22 15:09 [PATCH 0/2] Use architected timers for delay loop Will Deacon
  2012-06-22 15:09 ` [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
  2012-06-22 15:09 ` [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
@ 2012-06-22 22:26 ` Russell King - ARM Linux
  2012-06-25 10:03   ` Will Deacon
  2012-06-25 21:38 ` Stephen Boyd
  3 siblings, 1 reply; 18+ messages in thread
From: Russell King - ARM Linux @ 2012-06-22 22:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 22, 2012 at 04:09:32PM +0100, Will Deacon wrote:
> Hello,
> 
> This patch series reworks our delay loop implementation to allow the
> busy-loop to be replaced with the physical counter included as part of
> the architected timer on recent CPUs. The polling implementation is
> still maintained to support cores without the timers and we use the same
> maths and upper delay bound of 2ms.
> 
> Russell - I think my maths is correct, but why can't we increase the
>           precision by using a magic constant of 4398046U and a right
> 	  shift of 31?

It's been soo long since I looked at that code that I've forgotten the
maths behind it.

What I can say is that the maths in the assembly code version is
carefully chosen to avoid overflows during the various stages of
computation without loosing too much precision along the way.

The assembly is also designed to round up any lost precision, so we
try to produce a longer delay by way of the lost precision in the
maths.

I forget what kind of precision we do get, but precision is not really
a major concern here.  We are only after a "rough" estimate of the
delay (hell, we don't even guarantee according to Linus that you will
get a minimum of the delay you asked for - you might get a shorter
delay!)  You _will_ get shorter delays because of the way the loop is
calibrated (think: the jiffy code tries to find how many loops fit in
a single jiffy, but a jiffy _also_ includes the overhead of running
the timer interrupt, which artificially reduces loops_per_jiffy.)

Provided we get a delay "in the ballpark" of what was requested, that's
good enough.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/2] Use architected timers for delay loop
  2012-06-22 22:26 ` [PATCH 0/2] Use architected timers for delay loop Russell King - ARM Linux
@ 2012-06-25 10:03   ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2012-06-25 10:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

Thanks for the clarification.

On Fri, Jun 22, 2012 at 11:26:47PM +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 22, 2012 at 04:09:32PM +0100, Will Deacon wrote:
> > Russell - I think my maths is correct, but why can't we increase the
> >           precision by using a magic constant of 4398046U and a right
> > 	  shift of 31?
> 
> It's been soo long since I looked at that code that I've forgotten the
> maths behind it.

No problem. I managed to convince myself of the maths but I don't want to
touch the assembly code in delay.S if I don't have to!

> What I can say is that the maths in the assembly code version is
> carefully chosen to avoid overflows during the various stages of
> computation without loosing too much precision along the way.
> 
> The assembly is also designed to round up any lost precision, so we
> try to produce a longer delay by way of the lost precision in the
> maths.

Right, that probably explains why I was having difficulty following it.

> I forget what kind of precision we do get, but precision is not really
> a major concern here.  We are only after a "rough" estimate of the
> delay (hell, we don't even guarantee according to Linus that you will
> get a minimum of the delay you asked for - you might get a shorter
> delay!)  You _will_ get shorter delays because of the way the loop is
> calibrated (think: the jiffy code tries to find how many loops fit in
> a single jiffy, but a jiffy _also_ includes the overhead of running
> the timer interrupt, which artificially reduces loops_per_jiffy.)

We avoid the calibration step for the timer-based loop and use the counter
frequency directly so I think we probably always delay for slightly longer
in that case. But yes, I see that precision isn't the be-all and end-all
here!

> Provided we get a delay "in the ballpark" of what was requested, that's
> good enough.

In which case, let's leave the maths like it is. The current values work
fine with both the polling loop and the timer loop up to 2ms.

Cheers,

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/2] Use architected timers for delay loop
  2012-06-22 15:09 [PATCH 0/2] Use architected timers for delay loop Will Deacon
                   ` (2 preceding siblings ...)
  2012-06-22 22:26 ` [PATCH 0/2] Use architected timers for delay loop Russell King - ARM Linux
@ 2012-06-25 21:38 ` Stephen Boyd
  2012-06-26 10:35   ` Will Deacon
  3 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2012-06-25 21:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/22/12 08:09, Will Deacon wrote:
> Hello,
>
> This patch series reworks our delay loop implementation to allow the
> busy-loop to be replaced with the physical counter included as part of
> the architected timer on recent CPUs. The polling implementation is
> still maintained to support cores without the timers and we use the same
> maths and upper delay bound of 2ms.
>
> Russell - I think my maths is correct, but why can't we increase the
>           precision by using a magic constant of 4398046U and a right
> 	  shift of 31?
>
> As usual, all feedback is welcome,

I have posted a read_current_timer implementation to the list a couple
times but had no success in getting it merged. The patches are still in
the patch tracker but I haven't really pushed them to get merged.

    http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6874/1
    http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6873/1
    http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6875/1

So far we've just been carrying them internally. I suppose ARM needs
this now because of big little?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-22 15:09 ` [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
@ 2012-06-25 21:39   ` Stephen Boyd
  2012-06-26 10:37     ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2012-06-25 21:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/22/12 08:09, Will Deacon wrote:
> This patch implements read_current_timer using the architected timers
> when they are selected via CONFIG_ARM_ARCH_TIMER. If they are detected
> not to be usable at runtime, we return -ENXIO to the caller.
>
> Furthermore, if read_current_timer is exported then we can implement
> get_cycles in terms of it for use as both an entropy source and for
> implementing __udelay and friends.

Neat.

> diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
> index 3be8de3..ce11944 100644
> --- a/arch/arm/include/asm/timex.h
> +++ b/arch/arm/include/asm/timex.h
> @@ -12,13 +12,15 @@
>  #ifndef _ASMARM_TIMEX_H
>  #define _ASMARM_TIMEX_H
>  
> +#include <asm/arch_timer.h>

This seems to cause compilation failures if you aren't building with
architected timers enabled.

In file included from arch/arm/include/asm/timex.h:15,
                 from include/linux/timex.h:174,
                 from include/linux/sched.h:57,
                 from arch/arm/kernel/asm-offsets.c:13:
arch/arm/include/asm/arch_timer.h: In function 'arch_timer_of_register':
arch/arm/include/asm/arch_timer.h:11: error: 'ENXIO' undeclared (first use in this function)


>  #include <mach/timex.h>
>  
>  typedef unsigned long cycles_t;
>  
> -static inline cycles_t get_cycles (void)
> -{
> -	return 0;
> -}
> +#ifdef ARCH_HAS_READ_CURRENT_TIMER
> +#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
> +#else
> +#define get_cycles()	(0)
> +#endif
>  
>  #endif
> diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
> index dd58035..dbbeec4 100644
> --- a/arch/arm/kernel/arch_timer.c
> +++ b/arch/arm/kernel/arch_timer.c
> @@ -223,6 +223,14 @@ static cycle_t arch_counter_read(struct clocksource *cs)
>  	return arch_counter_get_cntpct();
>  }
>  
> +int read_current_timer(unsigned long *timer_val)
> +{
> +	if (!arch_timer_rate)
> +		return -ENXIO;

Shouldn't this be returning 0? Otherwise get_cycles() up there will
evaluate to -ENXIO?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-22 15:09 ` [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
@ 2012-06-25 21:39   ` Stephen Boyd
  2012-06-26 10:49     ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2012-06-25 21:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/22/12 08:09, Will Deacon wrote:
> diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
> index dbbeec4..675cee0 100644
> --- a/arch/arm/kernel/arch_timer.c
> +++ b/arch/arm/kernel/arch_timer.c
> @@ -32,6 +32,8 @@ static int arch_timer_ppi2;
>  
>  static struct clock_event_device __percpu **arch_timer_evt;
>  
> +extern void init_current_timer_delay(unsigned long freq);

Can we find a home for this in some header file?

> +static void __timer_const_udelay(unsigned long xloops)
> +{
> +	unsigned long long loops = xloops;
> +	loops *= loops_per_jiffy;
> +	__timer_delay(loops >> 30);
> +}

Is it ok to have a 64 bit multiply here? It seems the assembly version
tries to keep it all 32 bit math.

> +
> +static void __timer_udelay(unsigned long usecs)
> +{
> +	__timer_const_udelay(usecs * ((2199023U * HZ) >> 11));
> +}

It's unfortunate that we have to duplicate the same code and constants
in both C and assembly. With my approach we convert delay.S into C and
avoid the duplication.

> +
> +void __init init_current_timer_delay(unsigned long freq)
> +{
> +	pr_info("Switching to timer-based delay loop\n");

Might be worth printing the frequency here too.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/2] Use architected timers for delay loop
  2012-06-25 21:38 ` Stephen Boyd
@ 2012-06-26 10:35   ` Will Deacon
  2012-06-26 17:42     ` Stephen Boyd
  0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2012-06-26 10:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Stephen,

Thanks for taking a look at the series!

On Mon, Jun 25, 2012 at 10:38:45PM +0100, Stephen Boyd wrote:
> On 06/22/12 08:09, Will Deacon wrote:
> > This patch series reworks our delay loop implementation to allow the
> > busy-loop to be replaced with the physical counter included as part of
> > the architected timer on recent CPUs. The polling implementation is
> > still maintained to support cores without the timers and we use the same
> > maths and upper delay bound of 2ms.
> 
> I have posted a read_current_timer implementation to the list a couple
> times but had no success in getting it merged. The patches are still in
> the patch tracker but I haven't really pushed them to get merged.
> 
>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6874/1
>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6873/1
>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6875/1

I took a look at these but I really shy away from using memory-mapped
clocksources for delay -- it feels like we're asking for problems if we go
down that route. Maybe it could work, but switching from the polling loop to
the clocksource would surely require some recalibration?

> So far we've just been carrying them internally. I suppose ARM needs
> this now because of big little?

To be honest, I just implemented this because it allows us to skip the
calibration and I thought the solution fell out quite neatly. It does also
remove the need for re-calibration when scaling the core clock, so yes it
would be useful whenever that occurs.

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-25 21:39   ` Stephen Boyd
@ 2012-06-26 10:37     ` Will Deacon
  2012-06-26 17:44       ` Stephen Boyd
  0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2012-06-26 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 25, 2012 at 10:39:04PM +0100, Stephen Boyd wrote:
> On 06/22/12 08:09, Will Deacon wrote:
> > diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
> > index 3be8de3..ce11944 100644
> > --- a/arch/arm/include/asm/timex.h
> > +++ b/arch/arm/include/asm/timex.h
> > @@ -12,13 +12,15 @@
> >  #ifndef _ASMARM_TIMEX_H
> >  #define _ASMARM_TIMEX_H
> >  
> > +#include <asm/arch_timer.h>
> 
> This seems to cause compilation failures if you aren't building with
> architected timers enabled.
> 
> In file included from arch/arm/include/asm/timex.h:15,
>                  from include/linux/timex.h:174,
>                  from include/linux/sched.h:57,
>                  from arch/arm/kernel/asm-offsets.c:13:
> arch/arm/include/asm/arch_timer.h: In function 'arch_timer_of_register':
> arch/arm/include/asm/arch_timer.h:11: error: 'ENXIO' undeclared (first use in this function)

Thanks, will fix for v2. I tested on a platform without the timers but I
must've left them compiled in.

> 
> >  #include <mach/timex.h>
> >  
> >  typedef unsigned long cycles_t;
> >  
> > -static inline cycles_t get_cycles (void)
> > -{
> > -	return 0;
> > -}
> > +#ifdef ARCH_HAS_READ_CURRENT_TIMER
> > +#define get_cycles()	({ cycles_t c; read_current_timer(&c) ? 0 : c; })
> > +#else
> > +#define get_cycles()	(0)
> > +#endif
> >  
> >  #endif
> > diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
> > index dd58035..dbbeec4 100644
> > --- a/arch/arm/kernel/arch_timer.c
> > +++ b/arch/arm/kernel/arch_timer.c
> > @@ -223,6 +223,14 @@ static cycle_t arch_counter_read(struct clocksource *cs)
> >  	return arch_counter_get_cntpct();
> >  }
> >  
> > +int read_current_timer(unsigned long *timer_val)
> > +{
> > +	if (!arch_timer_rate)
> > +		return -ENXIO;
> 
> Shouldn't this be returning 0? Otherwise get_cycles() up there will
> evaluate to -ENXIO?

I don't think so. The cycle count is returned via the timer_val parameter
and the get_cycles code checks the return value, so will give 0 if
read_current_timer returns anything other than 0. The core calibration code
(not that we call it) also expects a return value < 0 to indicate failure.

Will

> -- 
> Sent by an employee of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-25 21:39   ` Stephen Boyd
@ 2012-06-26 10:49     ` Will Deacon
  2012-06-26 15:54       ` Will Deacon
  2012-06-27  2:07       ` Stephen Boyd
  0 siblings, 2 replies; 18+ messages in thread
From: Will Deacon @ 2012-06-26 10:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 25, 2012 at 10:39:10PM +0100, Stephen Boyd wrote:
> On 06/22/12 08:09, Will Deacon wrote:
> > diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
> > index dbbeec4..675cee0 100644
> > --- a/arch/arm/kernel/arch_timer.c
> > +++ b/arch/arm/kernel/arch_timer.c
> > @@ -32,6 +32,8 @@ static int arch_timer_ppi2;
> >  
> >  static struct clock_event_device __percpu **arch_timer_evt;
> >  
> > +extern void init_current_timer_delay(unsigned long freq);
> 
> Can we find a home for this in some header file?

I wondered about that...

The reason I didn't add it to a header file is that we really don't want it
to be called willy-nilly across the kernel. In fact, it must be called
exactly once by the clocksource backing read_current_timer when it knows
that the timer is live and ticking.

I suppose I could allow the function to fail if it's called after we've
calibrated. What do you reckon?

> > +static void __timer_const_udelay(unsigned long xloops)
> > +{
> > +	unsigned long long loops = xloops;
> > +	loops *= loops_per_jiffy;
> > +	__timer_delay(loops >> 30);
> > +}
> 
> Is it ok to have a 64 bit multiply here? It seems the assembly version
> tries to keep it all 32 bit math.

It's actually a 32-bit multiply with a 64-bit result, so it's just a umull:

00000050 <__timer_const_udelay>:
  50:	e3003000	movw	r3, #0
  54:	e3403000	movt	r3, #0
  58:	e5932000	ldr	r2, [r3]
  5c:	e0832290	umull	r2, r3, r0, r2
  60:	e1a00f22	lsr	r0, r2, #30
  64:	e1800103	orr	r0, r0, r3, lsl #2
  68:	eaffffe4	b	0 <__timer_delay>

> > +
> > +static void __timer_udelay(unsigned long usecs)
> > +{
> > +	__timer_const_udelay(usecs * ((2199023U * HZ) >> 11));
> > +}
> 
> It's unfortunate that we have to duplicate the same code and constants
> in both C and assembly. With my approach we convert delay.S into C and
> avoid the duplication.

It's probably easy enough to have a #define for the multiplier, I can do
that for v2.

> > +
> > +void __init init_current_timer_delay(unsigned long freq)
> > +{
> > +	pr_info("Switching to timer-based delay loop\n");
> 
> Might be worth printing the frequency here too.

Could do, but the timer itself probably prints that information already (at
least it does the arch timer).

Cheers,

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-26 10:49     ` Will Deacon
@ 2012-06-26 15:54       ` Will Deacon
  2012-06-26 16:00         ` Rob Herring
  2012-06-27  2:07       ` Stephen Boyd
  1 sibling, 1 reply; 18+ messages in thread
From: Will Deacon @ 2012-06-26 15:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 26, 2012 at 11:49:44AM +0100, Will Deacon wrote:
> On Mon, Jun 25, 2012 at 10:39:10PM +0100, Stephen Boyd wrote:
> > 
> > It's unfortunate that we have to duplicate the same code and constants
> > in both C and assembly. With my approach we convert delay.S into C and
> > avoid the duplication.
> 
> It's probably easy enough to have a #define for the multiplier, I can do
> that for v2.

Looks like I spoke too soon :)

The reason this is slightly problematic is due to gas's inability to
represent unsigned constants -- namely that the 'U' suffix causes it to
barf. I can fix it with the diff below, but please let me know what you
think.

Cheers,

Will


diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index c6ab91a..ce9f564 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -8,6 +8,12 @@
 
 #include <asm/param.h>	/* HZ */
 
+#define MAX_UDELAY_MS	2
+#define UDELAY_MULT	((UDELAY_MULT_HZ * HZ) >> 11)
+#define UDELAY_SHIFT	30
+
+#ifndef __ASSEMBLY__
+
 extern struct arm_delay_ops {
 	void (*delay)(unsigned long);
 	void (*const_udelay)(unsigned long);
@@ -38,12 +44,10 @@ extern void __bad_udelay(void);
 #define __udelay(n)		arm_delay_ops.udelay(n)
 #define __const_udelay(n)	arm_delay_ops.const_udelay(n)
 
-#define MAX_UDELAY_MS 2
-
 #define udelay(n)							\
 	(__builtin_constant_p(n) ?					\
 	  ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() :		\
-			__const_udelay((n) * ((2199023U*HZ)>>11))) :	\
+			__const_udelay((n) * UDELAY_MULT)) :		\
 	  __udelay(n))
 
 /* Loop-based definitions for assembly code. */
@@ -51,5 +55,10 @@ extern void __loop_delay(unsigned long loops);
 extern void __loop_udelay(unsigned long usecs);
 extern void __loop_const_udelay(unsigned long);
 
+#define UDELAY_MULT_HZ	2199023U
+#else
+#define UDELAY_MULT_HZ	2199023
+#endif /* __ASSEMBLY__ */
+
 #endif /* defined(_ARM_DELAY_H) */
 
diff --git a/arch/arm/lib/delay-loop.S b/arch/arm/lib/delay-loop.S
index 1428a8f..36b668d 100644
--- a/arch/arm/lib/delay-loop.S
+++ b/arch/arm/lib/delay-loop.S
@@ -9,11 +9,11 @@
  */
 #include <linux/linkage.h>
 #include <asm/assembler.h>
-#include <asm/param.h>
+#include <asm/delay.h>
 		.text
 
 .LC0:		.word	loops_per_jiffy
-.LC1:		.word	(2199023*HZ)>>11
+.LC1:		.word	UDELAY_MULT
 
 /*
  * r0  <= 2000
diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index 2b590c3..e1030e1 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -47,12 +47,12 @@ static void __timer_const_udelay(unsigned long xloops)
 {
 	unsigned long long loops = xloops;
 	loops *= loops_per_jiffy;
-	__timer_delay(loops >> 30);
+	__timer_delay(loops >> UDELAY_SHIFT);
 }
 
 static void __timer_udelay(unsigned long usecs)
 {
-	__timer_const_udelay(usecs * ((2199023U * HZ) >> 11));
+	__timer_const_udelay(usecs * UDELAY_MULT);
 }
 
 void __init init_current_timer_delay(unsigned long freq)

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-26 15:54       ` Will Deacon
@ 2012-06-26 16:00         ` Rob Herring
  2012-06-26 16:28           ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Rob Herring @ 2012-06-26 16:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/26/2012 10:54 AM, Will Deacon wrote:
> On Tue, Jun 26, 2012 at 11:49:44AM +0100, Will Deacon wrote:
>> On Mon, Jun 25, 2012 at 10:39:10PM +0100, Stephen Boyd wrote:
>>>
>>> It's unfortunate that we have to duplicate the same code and constants
>>> in both C and assembly. With my approach we convert delay.S into C and
>>> avoid the duplication.
>>
>> It's probably easy enough to have a #define for the multiplier, I can do
>> that for v2.
> 
> Looks like I spoke too soon :)
> 
> The reason this is slightly problematic is due to gas's inability to
> represent unsigned constants -- namely that the 'U' suffix causes it to
> barf. I can fix it with the diff below, but please let me know what you
> think.

[snip]

> +#ifndef __ASSEMBLY__
> +


> +#define UDELAY_MULT_HZ	2199023U
> +#else
> +#define UDELAY_MULT_HZ	2199023

Doesn't UL() macro do what you need?

Rob

> +#endif /* __ASSEMBLY__ */
> +
>  #endif /* defined(_ARM_DELAY_H) */
>  
> diff --git a/arch/arm/lib/delay-loop.S b/arch/arm/lib/delay-loop.S
> index 1428a8f..36b668d 100644
> --- a/arch/arm/lib/delay-loop.S
> +++ b/arch/arm/lib/delay-loop.S
> @@ -9,11 +9,11 @@
>   */
>  #include <linux/linkage.h>
>  #include <asm/assembler.h>
> -#include <asm/param.h>
> +#include <asm/delay.h>
>  		.text
>  
>  .LC0:		.word	loops_per_jiffy
> -.LC1:		.word	(2199023*HZ)>>11
> +.LC1:		.word	UDELAY_MULT
>  
>  /*
>   * r0  <= 2000
> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
> index 2b590c3..e1030e1 100644
> --- a/arch/arm/lib/delay.c
> +++ b/arch/arm/lib/delay.c
> @@ -47,12 +47,12 @@ static void __timer_const_udelay(unsigned long xloops)
>  {
>  	unsigned long long loops = xloops;
>  	loops *= loops_per_jiffy;
> -	__timer_delay(loops >> 30);
> +	__timer_delay(loops >> UDELAY_SHIFT);
>  }
>  
>  static void __timer_udelay(unsigned long usecs)
>  {
> -	__timer_const_udelay(usecs * ((2199023U * HZ) >> 11));
> +	__timer_const_udelay(usecs * UDELAY_MULT);
>  }
>  
>  void __init init_current_timer_delay(unsigned long freq)
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-26 16:00         ` Rob Herring
@ 2012-06-26 16:28           ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2012-06-26 16:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 26, 2012 at 05:00:21PM +0100, Rob Herring wrote:
> On 06/26/2012 10:54 AM, Will Deacon wrote:
> > The reason this is slightly problematic is due to gas's inability to
> > represent unsigned constants -- namely that the 'U' suffix causes it to
> > barf. I can fix it with the diff below, but please let me know what you
> > think.
> 
> [snip]
> 
> > +#ifndef __ASSEMBLY__
> > +
> 
> 
> > +#define UDELAY_MULT_HZ	2199023U
> > +#else
> > +#define UDELAY_MULT_HZ	2199023
> 
> Doesn't UL() macro do what you need?

Indeed, thanks for pointing that out as I'd failed to spot it.
I'll include this change for v2.

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/2] Use architected timers for delay loop
  2012-06-26 10:35   ` Will Deacon
@ 2012-06-26 17:42     ` Stephen Boyd
  0 siblings, 0 replies; 18+ messages in thread
From: Stephen Boyd @ 2012-06-26 17:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/26/12 03:35, Will Deacon wrote:
> On Mon, Jun 25, 2012 at 10:38:45PM +0100, Stephen Boyd wrote:
>> I have posted a read_current_timer implementation to the list a couple
>> times but had no success in getting it merged. The patches are still in
>> the patch tracker but I haven't really pushed them to get merged.
>>
>>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6874/1
>>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6873/1
>>     http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=6875/1
> I took a look at these but I really shy away from using memory-mapped
> clocksources for delay -- it feels like we're asking for problems if we go
> down that route. Maybe it could work, but switching from the polling loop to
> the clocksource would surely require some recalibration?

We never switch from polling to clocksource (or vice versa) after the
calibration is done in calibrate_delay(). The lpj calculation is always
based on the clocksource using calibrate_delay_direct(). This looks to
be pretty much like what your patches are doing but you skip the
calibration step by setting lpj_fine, which is even better.

Even then, I don't understand why the series scares you that much. You
could just define read_current_timer() for architected timers like you
already have and then the series would work just the same with
coprocessor accesses. The benefit is no code duplication. I like how
you've implemented get_cycle(), but that's a minor difference.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles
  2012-06-26 10:37     ` Will Deacon
@ 2012-06-26 17:44       ` Stephen Boyd
  0 siblings, 0 replies; 18+ messages in thread
From: Stephen Boyd @ 2012-06-26 17:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/26/12 03:37, Will Deacon wrote:
> On Mon, Jun 25, 2012 at 10:39:04PM +0100, Stephen Boyd wrote:
>>> +int read_current_timer(unsigned long *timer_val)
>>> +{
>>> +	if (!arch_timer_rate)
>>> +		return -ENXIO;
>> Shouldn't this be returning 0? Otherwise get_cycles() up there will
>> evaluate to -ENXIO?
> I don't think so. The cycle count is returned via the timer_val parameter
> and the get_cycles code checks the return value, so will give 0 if
> read_current_timer returns anything other than 0. The core calibration code
> (not that we call it) also expects a return value < 0 to indicate failure.

Ah you're right. More coffee in the morning would do me some good.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-26 10:49     ` Will Deacon
  2012-06-26 15:54       ` Will Deacon
@ 2012-06-27  2:07       ` Stephen Boyd
  2012-06-27  9:41         ` Will Deacon
  1 sibling, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2012-06-27  2:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/26/12 03:49, Will Deacon wrote:
> On Mon, Jun 25, 2012 at 10:39:10PM +0100, Stephen Boyd wrote:
>> On 06/22/12 08:09, Will Deacon wrote:
>>> diff --git a/arch/arm/kernel/arch_timer.c b/arch/arm/kernel/arch_timer.c
>>> index dbbeec4..675cee0 100644
>>> --- a/arch/arm/kernel/arch_timer.c
>>> +++ b/arch/arm/kernel/arch_timer.c
>>> @@ -32,6 +32,8 @@ static int arch_timer_ppi2;
>>>  
>>>  static struct clock_event_device __percpu **arch_timer_evt;
>>>  
>>> +extern void init_current_timer_delay(unsigned long freq);
>> Can we find a home for this in some header file?
> I wondered about that...
>
> The reason I didn't add it to a header file is that we really don't want it
> to be called willy-nilly across the kernel. In fact, it must be called
> exactly once by the clocksource backing read_current_timer when it knows
> that the timer is live and ticking.
>
> I suppose I could allow the function to fail if it's called after we've
> calibrated. What do you reckon?
>

Fair enough. Would anything actually go wrong if you called it twice? I
would think everything would be assigned to what it already is but I
haven't thought deeply about it. I don't really care to make the
function more complicated for a case that should never happen.

> It's actually a 32-bit multiply with a 64-bit result, so it's just a umull:
>
> 00000050 <__timer_const_udelay>:
>   50:	e3003000	movw	r3, #0
>   54:	e3403000	movt	r3, #0
>   58:	e5932000	ldr	r2, [r3]
>   5c:	e0832290	umull	r2, r3, r0, r2
>   60:	e1a00f22	lsr	r0, r2, #30
>   64:	e1800103	orr	r0, r0, r3, lsl #2
>   68:	eaffffe4	b	0 <__timer_delay>

Ok. Maybe Russell can comment further. Or maybe it doesn't matter to
save some cycles after Linus said that udelay() doesn't need to be that
accurate.

>> It's unfortunate that we have to duplicate the same code and constants
>> in both C and assembly. With my approach we convert delay.S into C and
>> avoid the duplication.
> It's probably easy enough to have a #define for the multiplier, I can do
> that for v2.

I look forward to seeing how v2 works out.

>
>>> +
>>> +void __init init_current_timer_delay(unsigned long freq)
>>> +{
>>> +	pr_info("Switching to timer-based delay loop\n");
>> Might be worth printing the frequency here too.
> Could do, but the timer itself probably prints that information already (at
> least it does the arch timer).

Sure. Thinking more about it I don't like my suggestion.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected
  2012-06-27  2:07       ` Stephen Boyd
@ 2012-06-27  9:41         ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2012-06-27  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 27, 2012 at 03:07:35AM +0100, Stephen Boyd wrote:
> On 06/26/12 03:49, Will Deacon wrote:
> > I suppose I could allow the function to fail if it's called after we've
> > calibrated. What do you reckon?
> >
> 
> Fair enough. Would anything actually go wrong if you called it twice? I
> would think everything would be assigned to what it already is but I
> haven't thought deeply about it. I don't really care to make the
> function more complicated for a case that should never happen.

If it's called once before calibration and again after calibration then that
could cause problems if a different frequency is passed. However, I think we
can probably leave the function as-is until we have the capacity to support
multiple read_current_timer implementations in the same kernel. At that
point, we'd probably have a way to register the current_timer which could
also setup the delay loop.

Will

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-06-27  9:41 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-22 15:09 [PATCH 0/2] Use architected timers for delay loop Will Deacon
2012-06-22 15:09 ` [PATCH 1/2] ARM: arch timer: implement read_current_timer and get_cycles Will Deacon
2012-06-25 21:39   ` Stephen Boyd
2012-06-26 10:37     ` Will Deacon
2012-06-26 17:44       ` Stephen Boyd
2012-06-22 15:09 ` [PATCH 2/2] ARM: delay: allow timer-based delay implementation to be selected Will Deacon
2012-06-25 21:39   ` Stephen Boyd
2012-06-26 10:49     ` Will Deacon
2012-06-26 15:54       ` Will Deacon
2012-06-26 16:00         ` Rob Herring
2012-06-26 16:28           ` Will Deacon
2012-06-27  2:07       ` Stephen Boyd
2012-06-27  9:41         ` Will Deacon
2012-06-22 22:26 ` [PATCH 0/2] Use architected timers for delay loop Russell King - ARM Linux
2012-06-25 10:03   ` Will Deacon
2012-06-25 21:38 ` Stephen Boyd
2012-06-26 10:35   ` Will Deacon
2012-06-26 17:42     ` Stephen Boyd

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.