All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/2] NOHZ vs. profile/oprofile v2
@ 2009-06-03 15:22 Martin Schwidefsky
  2009-06-03 15:22 ` [patch 1/2] idle profile hits with NOHZ Martin Schwidefsky
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-03 15:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob van der Heij, Heiko Carstens, Ingo Molnar, Thomas Gleixner,
	john stultz, Andi Kleen

Greetings,
version 2 of the profile patches. The only change is the in_interrupt()
fix in tick_nohz_stop_idle(). I would like to know how to proceed with
the issue.
Andy, do you still prefer to handle the old style profiler analog to
the oprofile patch? If yes I would drop patch #1 and extend patch #2
with another tick_nohz_disable().

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 1/2] idle profile hits with NOHZ
  2009-06-03 15:22 [patch 0/2] NOHZ vs. profile/oprofile v2 Martin Schwidefsky
@ 2009-06-03 15:22 ` Martin Schwidefsky
  2009-06-03 15:22 ` [patch 2/2] keep on ticking if oprofile is active Martin Schwidefsky
  2009-06-09 20:52 ` [patch 0/2] NOHZ vs. profile/oprofile v2 Thomas Gleixner
  2 siblings, 0 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-03 15:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob van der Heij, Heiko Carstens, Ingo Molnar, Thomas Gleixner,
	john stultz, Andi Kleen, Martin Schwidefsky

[-- Attachment #1: profile-idle-tick.diff --]
[-- Type: text/plain, Size: 2989 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

On a NOHZ system the generic clock events code reports a maximum
of 1 tick to the kernel profiler. If the system slept for more than
a single tick or has not been woken by a timer interrupt the number
of ticks reported will be incorrect. This skewes the percentages in
the profile. A good place to report the profile ticks is in
tick_nohz_restart_sched_tick. We calculate the number of ticks for
the cpu accounting anyway. The only obstacle is that we need an
instruction address for the profile ticks. In order to get one
extend the tick_sched structure by an idle_pc field and set it in
tick_nohz_stop_idle which is called from tick_check_idle.
Note: this does not solve the same problem in oprofile.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 include/linux/tick.h     |    1 +
 kernel/time/tick-sched.c |   14 +++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

Index: quilt-2.6/include/linux/tick.h
===================================================================
--- quilt-2.6.orig/include/linux/tick.h
+++ quilt-2.6/include/linux/tick.h
@@ -64,6 +64,7 @@ struct tick_sched {
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
 	ktime_t				idle_expires;
+	unsigned long			idle_pc;
 };
 
 extern void __init tick_init(void);
Index: quilt-2.6/kernel/time/tick-sched.c
===================================================================
--- quilt-2.6.orig/kernel/time/tick-sched.c
+++ quilt-2.6/kernel/time/tick-sched.c
@@ -166,6 +166,8 @@ static void tick_nohz_stop_idle(int cpu)
 		ts->idle_lastupdate = now;
 		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_active = 0;
+		if (in_interrupt())
+			ts->idle_pc = profile_pc(get_irq_regs());
 
 		sched_clock_idle_wakeup_event(0);
 	}
@@ -419,9 +421,7 @@ void tick_nohz_restart_sched_tick(void)
 {
 	int cpu = smp_processor_id();
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	unsigned long ticks;
-#endif
 	ktime_t now;
 
 	local_irq_disable();
@@ -443,19 +443,23 @@ void tick_nohz_restart_sched_tick(void)
 	tick_do_update_jiffies64(now);
 	cpumask_clear_cpu(cpu, nohz_cpu_mask);
 
+	ticks = jiffies - ts->idle_jiffies;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	/*
 	 * We stopped the tick in idle. Update process times would miss the
 	 * time we slept as update_process_times does only a 1 tick
 	 * accounting. Enforce that this is accounted to idle !
-	 */
-	ticks = jiffies - ts->idle_jiffies;
-	/*
+	 *
 	 * We might be one off. Do not randomly account a huge number of ticks!
 	 */
 	if (ticks && ticks < LONG_MAX)
 		account_idle_ticks(ticks);
 #endif
+#ifdef CONFIG_PROFILING
+	/* Create profile hits for all ticks we slept in idle. */
+	if (ticks && ticks < LONG_MAX)
+		profile_hits(CPU_PROFILING, (void *) ts->idle_pc, ticks);
+#endif
 
 	touch_softlockup_watchdog();
 	/*

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 2/2] keep on ticking if oprofile is active
  2009-06-03 15:22 [patch 0/2] NOHZ vs. profile/oprofile v2 Martin Schwidefsky
  2009-06-03 15:22 ` [patch 1/2] idle profile hits with NOHZ Martin Schwidefsky
@ 2009-06-03 15:22 ` Martin Schwidefsky
  2009-06-09 20:52 ` [patch 0/2] NOHZ vs. profile/oprofile v2 Thomas Gleixner
  2 siblings, 0 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-03 15:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob van der Heij, Heiko Carstens, Ingo Molnar, Thomas Gleixner,
	john stultz, Andi Kleen, Martin Schwidefsky

[-- Attachment #1: profile-stop-nohz.diff --]
[-- Type: text/plain, Size: 3958 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

On a NOHZ system with oprofile enabled the timer tick should not be
stopped when a cpu goes idle. Oprofile needs the pt_regs structure
of the interrupt and allocates memory in the ring buffer for each
sample. Current a maximum of 1 tick is accounted with oprofile if a
cpu sleeps for a longer period of time. This does bad things to the
percentages in the oprofile output. To postpone the oprofile tick to
tick_nohz_restart_sched_tick analog to the in kernel profiler is not
possible as there is no pt_regs structure in the context the
tick_nohz_restart_sched_tick function is called and it is not a good
idea to create hundreds of samples at once.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 drivers/oprofile/oprof.c |    3 +++
 include/linux/tick.h     |    4 ++++
 kernel/time/tick-sched.c |   26 +++++++++++++++++++++++++-
 3 files changed, 32 insertions(+), 1 deletion(-)

Index: quilt-2.6/drivers/oprofile/oprof.c
===================================================================
--- quilt-2.6.orig/drivers/oprofile/oprof.c
+++ quilt-2.6/drivers/oprofile/oprof.c
@@ -12,6 +12,7 @@
 #include <linux/init.h>
 #include <linux/oprofile.h>
 #include <linux/moduleparam.h>
+#include <linux/tick.h>
 #include <asm/mutex.h>
 
 #include "oprof.h"
@@ -103,6 +104,7 @@ int oprofile_start(void)
 	if (oprofile_started)
 		goto out;
 
+	tick_nohz_disable();
 	oprofile_reset_stats();
 
 	if ((err = oprofile_ops.start()))
@@ -123,6 +125,7 @@ void oprofile_stop(void)
 		goto out;
 	oprofile_ops.stop();
 	oprofile_started = 0;
+	tick_nohz_enable();
 	/* wake up the daemon to read what remains */
 	wake_up_buffer_waiter();
 out:
Index: quilt-2.6/include/linux/tick.h
===================================================================
--- quilt-2.6.orig/include/linux/tick.h
+++ quilt-2.6/include/linux/tick.h
@@ -117,6 +117,8 @@ extern void tick_nohz_stop_sched_tick(in
 extern void tick_nohz_restart_sched_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+extern void tick_nohz_enable(void);
+extern void tick_nohz_disable(void);
 # else
 static inline void tick_nohz_stop_sched_tick(int inidle) { }
 static inline void tick_nohz_restart_sched_tick(void) { }
@@ -127,6 +129,8 @@ static inline ktime_t tick_nohz_get_slee
 	return len;
 }
 static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
+static inline void tick_nohz_enable(void) { }
+static inline void tick_nohz_disable(void) { }
 # endif /* !NO_HZ */
 
 #endif
Index: quilt-2.6/kernel/time/tick-sched.c
===================================================================
--- quilt-2.6.orig/kernel/time/tick-sched.c
+++ quilt-2.6/kernel/time/tick-sched.c
@@ -124,6 +124,29 @@ static int __init setup_tick_nohz(char *
 
 __setup("nohz=", setup_tick_nohz);
 
+/*
+ * NO HZ currently disabled ?
+ */
+static atomic_t tick_nohz_disable_counter = ATOMIC_INIT(0);
+
+void tick_nohz_enable(void)
+{
+	atomic_dec(&tick_nohz_disable_counter);
+}
+EXPORT_SYMBOL_GPL(tick_nohz_enable);
+
+static void __tick_nohz_disable(void *dummy)
+{
+}
+
+void tick_nohz_disable(void)
+{
+	if (atomic_inc_return(&tick_nohz_disable_counter) == 1)
+		/* Wake up all cpus to make them start ticking. */
+		smp_call_function(__tick_nohz_disable, NULL, 0);
+}
+EXPORT_SYMBOL_GPL(tick_nohz_disable);
+
 /**
  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
  *
@@ -272,7 +295,8 @@ void tick_nohz_stop_sched_tick(int inidl
 	next_jiffies = get_next_timer_interrupt(last_jiffies);
 	delta_jiffies = next_jiffies - last_jiffies;
 
-	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu))
+	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
+	    atomic_read(&tick_nohz_disable_counter) > 0)
 		delta_jiffies = 1;
 	/*
 	 * Do not stop the tick, if we are only one off

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-03 15:22 [patch 0/2] NOHZ vs. profile/oprofile v2 Martin Schwidefsky
  2009-06-03 15:22 ` [patch 1/2] idle profile hits with NOHZ Martin Schwidefsky
  2009-06-03 15:22 ` [patch 2/2] keep on ticking if oprofile is active Martin Schwidefsky
@ 2009-06-09 20:52 ` Thomas Gleixner
  2009-06-10 17:33   ` Martin Schwidefsky
  2009-06-22 14:26   ` Martin Schwidefsky
  2 siblings, 2 replies; 21+ messages in thread
From: Thomas Gleixner @ 2009-06-09 20:52 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Rob van der Heij, Heiko Carstens, Ingo Molnar,
	john stultz, Andi Kleen

On Wed, 3 Jun 2009, Martin Schwidefsky wrote:

> Greetings,
> version 2 of the profile patches. The only change is the in_interrupt()
> fix in tick_nohz_stop_idle(). I would like to know how to proceed with
> the issue.
> Andy, do you still prefer to handle the old style profiler analog to
> the oprofile patch? If yes I would drop patch #1 and extend patch #2
> with another tick_nohz_disable().

Any update on this one ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-09 20:52 ` [patch 0/2] NOHZ vs. profile/oprofile v2 Thomas Gleixner
@ 2009-06-10 17:33   ` Martin Schwidefsky
  2009-06-22 14:26   ` Martin Schwidefsky
  1 sibling, 0 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-10 17:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Rob van der Heij, Heiko Carstens, Ingo Molnar,
	john stultz, Andi Kleen

On Tue, 9 Jun 2009 22:52:51 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 3 Jun 2009, Martin Schwidefsky wrote:
> 
> > Greetings,
> > version 2 of the profile patches. The only change is the in_interrupt()
> > fix in tick_nohz_stop_idle(). I would like to know how to proceed with
> > the issue.
> > Andy, do you still prefer to handle the old style profiler analog to
> > the oprofile patch? If yes I would drop patch #1 and extend patch #2
> > with another tick_nohz_disable().
> 
> Any update on this one ?

I haven't heard anything from any Andy in regard to patch #1. We could
do patch #2 now and decide later what we want to do with the old style
profiler.

blue skies,
  Martin.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-09 20:52 ` [patch 0/2] NOHZ vs. profile/oprofile v2 Thomas Gleixner
  2009-06-10 17:33   ` Martin Schwidefsky
@ 2009-06-22 14:26   ` Martin Schwidefsky
  2009-06-22 14:41     ` Ingo Molnar
  1 sibling, 1 reply; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-22 14:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Rob van der Heij, Heiko Carstens, Ingo Molnar,
	john stultz, Andi Kleen

On Tue, 9 Jun 2009 22:52:51 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> > version 2 of the profile patches. The only change is the in_interrupt()
> > fix in tick_nohz_stop_idle(). I would like to know how to proceed with
> > the issue.
> > Andy, do you still prefer to handle the old style profiler analog to
> > the oprofile patch? If yes I would drop patch #1 and extend patch #2
> > with another tick_nohz_disable().  
> 
> Any update on this one ?

A solution to this problem should go upstream soon, no? How about this
patch, it uses the tick_nohz_disable/tick_nohz_enable mechanic for 
oprofile and the old style kernel profiler. Good enough ?

---
Subject: [PATCH] keep on ticking if a profiler is active

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

On a NOHZ system with oprofile or the old style kernel profiler enabled
the timer tick should not be stopped when a cpu goes idle. Currently
a maximum of 1 tick is accounted if a cpu sleeps for a longer period of
time. This does bad things to the percentages in the profiler output.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 drivers/oprofile/oprof.c |    3 +++
 include/linux/tick.h     |    4 ++++
 kernel/profile.c         |    4 ++++
 kernel/time/tick-sched.c |   27 ++++++++++++++++++++++++++-
 4 files changed, 37 insertions(+), 1 deletion(-)

diff -urpN linux-2.6/drivers/oprofile/oprof.c linux-2.6-patched/drivers/oprofile/oprof.c
--- linux-2.6/drivers/oprofile/oprof.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6-patched/drivers/oprofile/oprof.c	2009-06-22 11:26:50.000000000 +0200
@@ -12,6 +12,7 @@
 #include <linux/init.h>
 #include <linux/oprofile.h>
 #include <linux/moduleparam.h>
+#include <linux/tick.h>
 #include <asm/mutex.h>
 
 #include "oprof.h"
@@ -103,6 +104,7 @@ int oprofile_start(void)
 	if (oprofile_started)
 		goto out;
 
+	tick_nohz_disable(1);
 	oprofile_reset_stats();
 
 	if ((err = oprofile_ops.start()))
@@ -123,6 +125,7 @@ void oprofile_stop(void)
 		goto out;
 	oprofile_ops.stop();
 	oprofile_started = 0;
+	tick_nohz_enable();
 	/* wake up the daemon to read what remains */
 	wake_up_buffer_waiter();
 out:
diff -urpN linux-2.6/include/linux/tick.h linux-2.6-patched/include/linux/tick.h
--- linux-2.6/include/linux/tick.h	2009-06-22 11:26:26.000000000 +0200
+++ linux-2.6-patched/include/linux/tick.h	2009-06-22 11:26:50.000000000 +0200
@@ -119,6 +119,8 @@ extern void tick_nohz_stop_sched_tick(in
 extern void tick_nohz_restart_sched_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(void);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+extern void tick_nohz_enable(void);
+extern void tick_nohz_disable(int wakeup);
 # else
 static inline void tick_nohz_stop_sched_tick(int inidle) { }
 static inline void tick_nohz_restart_sched_tick(void) { }
@@ -129,6 +131,8 @@ static inline ktime_t tick_nohz_get_slee
 	return len;
 }
 static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
+static inline void tick_nohz_enable(void) { }
+static inline void tick_nohz_disable(int wakeup) { }
 # endif /* !NO_HZ */
 
 #endif
diff -urpN linux-2.6/kernel/profile.c linux-2.6-patched/kernel/profile.c
--- linux-2.6/kernel/profile.c	2009-06-22 11:26:26.000000000 +0200
+++ linux-2.6-patched/kernel/profile.c	2009-06-22 11:26:50.000000000 +0200
@@ -24,6 +24,7 @@
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/tick.h>
 #include <asm/sections.h>
 #include <asm/irq_regs.h>
 #include <asm/ptrace.h>
@@ -97,6 +98,8 @@ int profile_setup(char *str)
 		printk(KERN_INFO "kernel profiling enabled (shift: %ld)\n",
 			prof_shift);
 	}
+	if (prof_on)
+		tick_nohz_disable(0);
 	return 1;
 }
 __setup("profile=", profile_setup);
@@ -582,6 +585,7 @@ static int create_hash_tables(void)
 	return 0;
 out_cleanup:
 	prof_on = 0;
+	tick_nohz_enable();
 	smp_mb();
 	on_each_cpu(profile_nop, NULL, 1);
 	for_each_online_cpu(cpu) {
diff -urpN linux-2.6/kernel/time/tick-sched.c linux-2.6-patched/kernel/time/tick-sched.c
--- linux-2.6/kernel/time/tick-sched.c	2009-06-22 11:26:26.000000000 +0200
+++ linux-2.6-patched/kernel/time/tick-sched.c	2009-06-22 11:26:50.000000000 +0200
@@ -124,6 +124,30 @@ static int __init setup_tick_nohz(char *
 
 __setup("nohz=", setup_tick_nohz);
 
+/*
+ * NO HZ currently disabled ?
+ */
+static atomic_t tick_nohz_disable_counter = ATOMIC_INIT(0);
+
+void tick_nohz_enable(void)
+{
+	atomic_dec(&tick_nohz_disable_counter);
+}
+EXPORT_SYMBOL_GPL(tick_nohz_enable);
+
+static void __tick_nohz_disable(void *dummy)
+{
+}
+
+void tick_nohz_disable(int wakeup)
+{
+	if (atomic_inc_return(&tick_nohz_disable_counter) == 1)
+		if (wakeup)
+			/* Wake up all cpus to make them start ticking. */
+			smp_call_function(__tick_nohz_disable, NULL, 0);
+}
+EXPORT_SYMBOL_GPL(tick_nohz_disable);
+
 /**
  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
  *
@@ -276,7 +300,8 @@ void tick_nohz_stop_sched_tick(int inidl
 	next_jiffies = get_next_timer_interrupt(last_jiffies);
 	delta_jiffies = next_jiffies - last_jiffies;
 
-	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu))
+	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
+	    atomic_read(&tick_nohz_disable_counter) > 0)
 		delta_jiffies = 1;
 	/*
 	 * Do not stop the tick, if we are only one off


-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 14:26   ` Martin Schwidefsky
@ 2009-06-22 14:41     ` Ingo Molnar
  2009-06-22 14:59       ` Martin Schwidefsky
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-06-22 14:41 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Tue, 9 Jun 2009 22:52:51 +0200 (CEST)
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > > version 2 of the profile patches. The only change is the in_interrupt()
> > > fix in tick_nohz_stop_idle(). I would like to know how to proceed with
> > > the issue.
> > > Andy, do you still prefer to handle the old style profiler analog to
> > > the oprofile patch? If yes I would drop patch #1 and extend patch #2
> > > with another tick_nohz_disable().  
> > 
> > Any update on this one ?
> 
> A solution to this problem should go upstream soon, no? How about this
> patch, it uses the tick_nohz_disable/tick_nohz_enable mechanic for 
> oprofile and the old style kernel profiler. Good enough ?
> 
> ---
> Subject: [PATCH] keep on ticking if a profiler is active
> 
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> On a NOHZ system with oprofile or the old style kernel profiler enabled
> the timer tick should not be stopped when a cpu goes idle. Currently
> a maximum of 1 tick is accounted if a cpu sleeps for a longer period of
> time. This does bad things to the percentages in the profiler output.
> 
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
> 
>  drivers/oprofile/oprof.c |    3 +++
>  include/linux/tick.h     |    4 ++++
>  kernel/profile.c         |    4 ++++
>  kernel/time/tick-sched.c |   27 ++++++++++++++++++++++++++-
>  4 files changed, 37 insertions(+), 1 deletion(-)

Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
it fallback-samples based on the timer tick?

That method has three advantages:

 - no weird hookery needed
 - resolution can go far beyond HZ
 - it is evidently dynticks-safe

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 14:41     ` Ingo Molnar
@ 2009-06-22 14:59       ` Martin Schwidefsky
  2009-06-22 15:05         ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-22 14:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra

On Mon, 22 Jun 2009 16:41:10 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> it fallback-samples based on the timer tick?
> 
> That method has three advantages:
> 
>  - no weird hookery needed
>  - resolution can go far beyond HZ
>  - it is evidently dynticks-safe

Hmm, if we replace the HZ based oprofile tick with an hrtimer we should
add an interface to configure the sample interval as well, no? Otherwise
we just replace one timer event (HZ) with another (hrtimer).

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 14:59       ` Martin Schwidefsky
@ 2009-06-22 15:05         ` Ingo Molnar
  2009-06-22 15:18           ` Martin Schwidefsky
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-06-22 15:05 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 22 Jun 2009 16:41:10 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > it fallback-samples based on the timer tick?
> > 
> > That method has three advantages:
> > 
> >  - no weird hookery needed
> >  - resolution can go far beyond HZ
> >  - it is evidently dynticks-safe
> 
> Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> should add an interface to configure the sample interval as well, 
> no? Otherwise we just replace one timer event (HZ) with another 
> (hrtimer).

Even if the hrtimer is set with a 1/HZ period it's a better 
solution, as it's dynticks safe without invasive changes.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:05         ` Ingo Molnar
@ 2009-06-22 15:18           ` Martin Schwidefsky
  2009-06-22 15:29             ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-22 15:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra

On Mon, 22 Jun 2009 17:05:53 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Mon, 22 Jun 2009 16:41:10 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > > it fallback-samples based on the timer tick?
> > > 
> > > That method has three advantages:
> > > 
> > >  - no weird hookery needed
> > >  - resolution can go far beyond HZ
> > >  - it is evidently dynticks-safe
> > 
> > Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> > should add an interface to configure the sample interval as well, 
> > no? Otherwise we just replace one timer event (HZ) with another 
> > (hrtimer).
> 
> Even if the hrtimer is set with a 1/HZ period it's a better 
> solution, as it's dynticks safe without invasive changes.

Ok, but the patch will be quite big. All the profile_tick() calls from
the architecture files will have to be removed. And if we really want
to keep things separate there will be two sets of per-cpu hrtimer,
one for the old style profiler and one for oprofile. Any preference
for the user space interface to set the sample rate? A sysctl?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:18           ` Martin Schwidefsky
@ 2009-06-22 15:29             ` Ingo Molnar
  2009-06-22 15:36               ` Martin Schwidefsky
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-06-22 15:29 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 22 Jun 2009 17:05:53 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > 
> > > On Mon, 22 Jun 2009 16:41:10 +0200
> > > Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > > > it fallback-samples based on the timer tick?
> > > > 
> > > > That method has three advantages:
> > > > 
> > > >  - no weird hookery needed
> > > >  - resolution can go far beyond HZ
> > > >  - it is evidently dynticks-safe
> > > 
> > > Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> > > should add an interface to configure the sample interval as well, 
> > > no? Otherwise we just replace one timer event (HZ) with another 
> > > (hrtimer).
> > 
> > Even if the hrtimer is set with a 1/HZ period it's a better 
> > solution, as it's dynticks safe without invasive changes.
> 
> Ok, but the patch will be quite big. All the profile_tick() calls 
> from the architecture files will have to be removed. [...]

Hey, that's a bonus :)

> [...] And if we really want to keep things separate there will be 
> two sets of per-cpu hrtimer, one for the old style profiler and 
> one for oprofile. Any preference for the user space interface to 
> set the sample rate? A sysctl?

I dont think we want to keep things separate. Regarding old-style 
profiler, does anyone still use it? There's now a superior in-tree 
replacement for it, so we could phase it out.

A sysctl sounds like the most obvious way to set the sample period - 
and it can default to 1/Hz.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:29             ` Ingo Molnar
@ 2009-06-22 15:36               ` Martin Schwidefsky
  2009-06-22 15:40                 ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-22 15:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra

On Mon, 22 Jun 2009 17:29:37 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Mon, 22 Jun 2009 17:05:53 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > 
> > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > 
> > > > On Mon, 22 Jun 2009 16:41:10 +0200
> > > > Ingo Molnar <mingo@elte.hu> wrote:
> > > > 
> > > > > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > > > > it fallback-samples based on the timer tick?
> > > > > 
> > > > > That method has three advantages:
> > > > > 
> > > > >  - no weird hookery needed
> > > > >  - resolution can go far beyond HZ
> > > > >  - it is evidently dynticks-safe
> > > > 
> > > > Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> > > > should add an interface to configure the sample interval as well, 
> > > > no? Otherwise we just replace one timer event (HZ) with another 
> > > > (hrtimer).
> > > 
> > > Even if the hrtimer is set with a 1/HZ period it's a better 
> > > solution, as it's dynticks safe without invasive changes.
> > 
> > Ok, but the patch will be quite big. All the profile_tick() calls 
> > from the architecture files will have to be removed. [...]
> 
> Hey, that's a bonus :)

It would remove some oddball code :-)
 
> > [...] And if we really want to keep things separate there will be 
> > two sets of per-cpu hrtimer, one for the old style profiler and 
> > one for oprofile. Any preference for the user space interface to 
> > set the sample rate? A sysctl?
> 
> I dont think we want to keep things separate. Regarding old-style 
> profiler, does anyone still use it? There's now a superior in-tree 
> replacement for it, so we could phase it out.

Well, for my part I won't miss it. But to be able to remove the 
profile_tick() calls from the architectures I either have to rip out
the old profiler now, or adapt it to use hrtimer as well.
 
> A sysctl sounds like the most obvious way to set the sample period - 
> and it can default to 1/Hz.

Agreed.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:36               ` Martin Schwidefsky
@ 2009-06-22 15:40                 ` Ingo Molnar
  2009-06-22 16:37                   ` Martin Schwidefsky
  2009-06-24 16:51                   ` Martin Schwidefsky
  0 siblings, 2 replies; 21+ messages in thread
From: Ingo Molnar @ 2009-06-22 15:40 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 22 Jun 2009 17:29:37 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > 
> > > On Mon, 22 Jun 2009 17:05:53 +0200
> > > Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > > 
> > > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > > 
> > > > > On Mon, 22 Jun 2009 16:41:10 +0200
> > > > > Ingo Molnar <mingo@elte.hu> wrote:
> > > > > 
> > > > > > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > > > > > it fallback-samples based on the timer tick?
> > > > > > 
> > > > > > That method has three advantages:
> > > > > > 
> > > > > >  - no weird hookery needed
> > > > > >  - resolution can go far beyond HZ
> > > > > >  - it is evidently dynticks-safe
> > > > > 
> > > > > Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> > > > > should add an interface to configure the sample interval as well, 
> > > > > no? Otherwise we just replace one timer event (HZ) with another 
> > > > > (hrtimer).
> > > > 
> > > > Even if the hrtimer is set with a 1/HZ period it's a better 
> > > > solution, as it's dynticks safe without invasive changes.
> > > 
> > > Ok, but the patch will be quite big. All the profile_tick() calls 
> > > from the architecture files will have to be removed. [...]
> > 
> > Hey, that's a bonus :)
> 
> It would remove some oddball code :-)
>  
> > > [...] And if we really want to keep things separate there will be 
> > > two sets of per-cpu hrtimer, one for the old style profiler and 
> > > one for oprofile. Any preference for the user space interface to 
> > > set the sample rate? A sysctl?
> > 
> > I dont think we want to keep things separate. Regarding old-style 
> > profiler, does anyone still use it? There's now a superior in-tree 
> > replacement for it, so we could phase it out.
> 
> Well, for my part I won't miss it. But to be able to remove the 
> profile_tick() calls from the architectures I either have to rip 
> out the old profiler now, or adapt it to use hrtimer as well.

Do we _have to_ touch it so widely right now? We could start with a 
deprecation warning in this cycle. Once it's deprecated we can 
remove all those calls.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:40                 ` Ingo Molnar
@ 2009-06-22 16:37                   ` Martin Schwidefsky
  2009-06-24 16:51                   ` Martin Schwidefsky
  1 sibling, 0 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-22 16:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra

On Mon, 22 Jun 2009 17:40:30 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Mon, 22 Jun 2009 17:29:37 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > 
> > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > 
> > > > On Mon, 22 Jun 2009 17:05:53 +0200
> > > > Ingo Molnar <mingo@elte.hu> wrote:
> > > > 
> > > > > 
> > > > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > > > 
> > > > > > On Mon, 22 Jun 2009 16:41:10 +0200
> > > > > > Ingo Molnar <mingo@elte.hu> wrote:
> > > > > > 
> > > > > > > Hm, this is rather ugly. Why not use hrtimers like 'perf' does when 
> > > > > > > it fallback-samples based on the timer tick?
> > > > > > > 
> > > > > > > That method has three advantages:
> > > > > > > 
> > > > > > >  - no weird hookery needed
> > > > > > >  - resolution can go far beyond HZ
> > > > > > >  - it is evidently dynticks-safe
> > > > > > 
> > > > > > Hmm, if we replace the HZ based oprofile tick with an hrtimer we 
> > > > > > should add an interface to configure the sample interval as well, 
> > > > > > no? Otherwise we just replace one timer event (HZ) with another 
> > > > > > (hrtimer).
> > > > > 
> > > > > Even if the hrtimer is set with a 1/HZ period it's a better 
> > > > > solution, as it's dynticks safe without invasive changes.
> > > > 
> > > > Ok, but the patch will be quite big. All the profile_tick() calls 
> > > > from the architecture files will have to be removed. [...]
> > > 
> > > Hey, that's a bonus :)
> > 
> > It would remove some oddball code :-)
> >  
> > > > [...] And if we really want to keep things separate there will be 
> > > > two sets of per-cpu hrtimer, one for the old style profiler and 
> > > > one for oprofile. Any preference for the user space interface to 
> > > > set the sample rate? A sysctl?
> > > 
> > > I dont think we want to keep things separate. Regarding old-style 
> > > profiler, does anyone still use it? There's now a superior in-tree 
> > > replacement for it, so we could phase it out.
> > 
> > Well, for my part I won't miss it. But to be able to remove the 
> > profile_tick() calls from the architectures I either have to rip 
> > out the old profiler now, or adapt it to use hrtimer as well.
> 
> Do we _have to_ touch it so widely right now? We could start with a 
> deprecation warning in this cycle. Once it's deprecated we can 
> remove all those calls.

Hmm, I like that. Fix oprofile with hrtimer and leave profile_tick()
as it is for now.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-22 15:40                 ` Ingo Molnar
  2009-06-22 16:37                   ` Martin Schwidefsky
@ 2009-06-24 16:51                   ` Martin Schwidefsky
  2009-06-24 17:47                     ` Ingo Molnar
  2010-03-02 13:57                     ` Aaro Koskinen
  1 sibling, 2 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2009-06-24 16:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra

On Mon, 22 Jun 2009 17:40:30 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Mon, 22 Jun 2009 17:29:37 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > 
> > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > 
> > > > [...] And if we really want to keep things separate there will be 
> > > > two sets of per-cpu hrtimer, one for the old style profiler and 
> > > > one for oprofile. Any preference for the user space interface to 
> > > > set the sample rate? A sysctl?
> > > 
> > > I dont think we want to keep things separate. Regarding old-style 
> > > profiler, does anyone still use it? There's now a superior in-tree 
> > > replacement for it, so we could phase it out.
> > 
> > Well, for my part I won't miss it. But to be able to remove the 
> > profile_tick() calls from the architectures I either have to rip 
> > out the old profiler now, or adapt it to use hrtimer as well.
> 
> Do we _have to_ touch it so widely right now? We could start with a 
> deprecation warning in this cycle. Once it's deprecated we can 
> remove all those calls.

First version of the hrtimer patch for oprofile. I did not add the
sysctl yet, if the sysctl is added in oprofile_timer_init it would not
be available if some better profiling source is available. If it is
added unconditionally it would only have an effect if the timer
fallback is used. Both cases are not exactly nice for a user space
interface.

---
Subject: [PATCH] convert oprofile from timer_hook to hrtimer

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Oprofile is currently broken on systems running with NOHZ enabled.
A maximum of 1 tick is accounted via the timer_hook if a cpu sleeps
for a longer period of time. This does bad things to the percentages
in the profiler output. To solve this problem convert oprofile to
use a restarting hrtimer instead of the timer_hook.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 drivers/oprofile/oprof.c     |   12 ++++--
 drivers/oprofile/oprof.h     |    3 +
 drivers/oprofile/timer_int.c |   77 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 78 insertions(+), 14 deletions(-)

diff -urpN linux-2.6/drivers/oprofile/oprof.c linux-2.6-patched/drivers/oprofile/oprof.c
--- linux-2.6/drivers/oprofile/oprof.c	2009-06-24 17:21:12.000000000 +0200
+++ linux-2.6-patched/drivers/oprofile/oprof.c	2009-06-24 17:18:28.000000000 +0200
@@ -184,22 +184,26 @@ static int __init oprofile_init(void)
 	int err;
 
 	err = oprofile_arch_init(&oprofile_ops);
-
 	if (err < 0 || timer) {
 		printk(KERN_INFO "oprofile: using timer interrupt.\n");
-		oprofile_timer_init(&oprofile_ops);
+		err = oprofile_timer_init(&oprofile_ops);
+		if (err)
+			goto out_arch;
 	}
-
 	err = oprofilefs_register();
 	if (err)
-		oprofile_arch_exit();
+		goto out_arch;
+	return 0;
 
+out_arch:
+	oprofile_arch_exit();
 	return err;
 }
 
 
 static void __exit oprofile_exit(void)
 {
+	oprofile_timer_exit();
 	oprofilefs_unregister();
 	oprofile_arch_exit();
 }
diff -urpN linux-2.6/drivers/oprofile/oprof.h linux-2.6-patched/drivers/oprofile/oprof.h
--- linux-2.6/drivers/oprofile/oprof.h	2009-06-24 17:21:12.000000000 +0200
+++ linux-2.6-patched/drivers/oprofile/oprof.h	2009-06-24 17:19:11.000000000 +0200
@@ -32,7 +32,8 @@ struct super_block;
 struct dentry;
 
 void oprofile_create_files(struct super_block *sb, struct dentry *root);
-void oprofile_timer_init(struct oprofile_operations *ops);
+int oprofile_timer_init(struct oprofile_operations *ops);
+void oprofile_timer_exit(void);
 
 int oprofile_set_backtrace(unsigned long depth);
 
diff -urpN linux-2.6/drivers/oprofile/timer_int.c linux-2.6-patched/drivers/oprofile/timer_int.c
--- linux-2.6/drivers/oprofile/timer_int.c	2009-06-24 17:21:12.000000000 +0200
+++ linux-2.6-patched/drivers/oprofile/timer_int.c	2009-06-24 17:18:44.000000000 +0200
@@ -13,34 +13,93 @@
 #include <linux/oprofile.h>
 #include <linux/profile.h>
 #include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/hrtimer.h>
+#include <asm/irq_regs.h>
 #include <asm/ptrace.h>
 
 #include "oprof.h"
 
-static int timer_notify(struct pt_regs *regs)
+static DEFINE_PER_CPU(struct hrtimer, oprofile_hrtimer);
+
+static enum hrtimer_restart oprofile_hrtimer_notify(struct hrtimer *hrtimer)
+{
+	oprofile_add_sample(get_irq_regs(), 0);
+	hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC));
+	return HRTIMER_RESTART;
+}
+
+static void __oprofile_hrtimer_start(void *unused)
+{
+	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
+
+	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer->function = oprofile_hrtimer_notify;
+
+	hrtimer_start(hrtimer, ns_to_ktime(TICK_NSEC),
+		      HRTIMER_MODE_REL_PINNED);
+}
+
+static int oprofile_hrtimer_start(void)
 {
-	oprofile_add_sample(regs, 0);
+	on_each_cpu(__oprofile_hrtimer_start, NULL, 1);
 	return 0;
 }
 
-static int timer_start(void)
+static void __oprofile_hrtimer_stop(int cpu)
 {
-	return register_timer_hook(timer_notify);
+	struct hrtimer *hrtimer = &per_cpu(oprofile_hrtimer, cpu);
+
+	hrtimer_cancel(hrtimer);
 }
 
+static void oprofile_hrtimer_stop(void)
+{
+	int cpu;
+	for_each_online_cpu(cpu)
+		__oprofile_hrtimer_stop(cpu);
+}
 
-static void timer_stop(void)
+static int __cpuinit oprofile_cpu_notify(struct notifier_block *self,
+					 unsigned long action, void *hcpu)
 {
-	unregister_timer_hook(timer_notify);
+	long cpu = (long) hcpu;
+
+	switch (action) {
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+		smp_call_function_single(cpu, __oprofile_hrtimer_start,
+					 NULL, 1);
+		break;
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		__oprofile_hrtimer_stop(cpu);
+		break;
+	}
+	return NOTIFY_OK;
 }
 
+static struct notifier_block __refdata oprofile_cpu_notifier = {
+	.notifier_call = oprofile_cpu_notify,
+};
 
-void __init oprofile_timer_init(struct oprofile_operations *ops)
+int __init oprofile_timer_init(struct oprofile_operations *ops)
 {
+	int rc;
+
+        rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
+        if (rc)
+		return rc;
 	ops->create_files = NULL;
 	ops->setup = NULL;
 	ops->shutdown = NULL;
-	ops->start = timer_start;
-	ops->stop = timer_stop;
+	ops->start = oprofile_hrtimer_start;
+	ops->stop = oprofile_hrtimer_stop;
 	ops->cpu_type = "timer";
+	return 0;
+}
+
+void __exit oprofile_timer_exit(void)
+{
+	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
 }

---

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-24 16:51                   ` Martin Schwidefsky
@ 2009-06-24 17:47                     ` Ingo Molnar
  2010-03-02 13:57                     ` Aaro Koskinen
  1 sibling, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2009-06-24 17:47 UTC (permalink / raw)
  To: Martin Schwidefsky, Robert Richter
  Cc: Thomas Gleixner, linux-kernel, Rob van der Heij, Heiko Carstens,
	john stultz, Andi Kleen, Peter Zijlstra


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 22 Jun 2009 17:40:30 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > 
> > > On Mon, 22 Jun 2009 17:29:37 +0200
> > > Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > > 
> > > > * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > > > 
> > > > > [...] And if we really want to keep things separate there will be 
> > > > > two sets of per-cpu hrtimer, one for the old style profiler and 
> > > > > one for oprofile. Any preference for the user space interface to 
> > > > > set the sample rate? A sysctl?
> > > > 
> > > > I dont think we want to keep things separate. Regarding old-style 
> > > > profiler, does anyone still use it? There's now a superior in-tree 
> > > > replacement for it, so we could phase it out.
> > > 
> > > Well, for my part I won't miss it. But to be able to remove the 
> > > profile_tick() calls from the architectures I either have to rip 
> > > out the old profiler now, or adapt it to use hrtimer as well.
> > 
> > Do we _have to_ touch it so widely right now? We could start with a 
> > deprecation warning in this cycle. Once it's deprecated we can 
> > remove all those calls.
> 
> First version of the hrtimer patch for oprofile. I did not add the 
> sysctl yet, if the sysctl is added in oprofile_timer_init it would 
> not be available if some better profiling source is available. If 
> it is added unconditionally it would only have an effect if the 
> timer fallback is used. Both cases are not exactly nice for a user 
> space interface.

looks quite sane. I've Cc:-ed Robert.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2009-06-24 16:51                   ` Martin Schwidefsky
  2009-06-24 17:47                     ` Ingo Molnar
@ 2010-03-02 13:57                     ` Aaro Koskinen
  2010-03-02 15:01                       ` Martin Schwidefsky
  1 sibling, 1 reply; 21+ messages in thread
From: Aaro Koskinen @ 2010-03-02 13:57 UTC (permalink / raw)
  To: ext Martin Schwidefsky
  Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, Rob van der Heij,
	Heiko Carstens, john stultz, Andi Kleen, Peter Zijlstra,
	robert.richter

Hi,

Martin Schwidefsky wrote:
> First version of the hrtimer patch for oprofile. I did not add the
> sysctl yet, if the sysctl is added in oprofile_timer_init it would not
> be available if some better profiling source is available. If it is
> added unconditionally it would only have an effect if the timer
> fallback is used. Both cases are not exactly nice for a user space
> interface.

I wonder what happened to this patch? Some platforms would need
this fix (i.e. the timer mode has to be used due to HW issues).

> ---
> Subject: [PATCH] convert oprofile from timer_hook to hrtimer
> 
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> Oprofile is currently broken on systems running with NOHZ enabled.
> A maximum of 1 tick is accounted via the timer_hook if a cpu sleeps
> for a longer period of time. This does bad things to the percentages
> in the profiler output. To solve this problem convert oprofile to
> use a restarting hrtimer instead of the timer_hook.
> 
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
> 
>  drivers/oprofile/oprof.c     |   12 ++++--
>  drivers/oprofile/oprof.h     |    3 +
>  drivers/oprofile/timer_int.c |   77 +++++++++++++++++++++++++++++++++++++------
>  3 files changed, 78 insertions(+), 14 deletions(-)
> 
> diff -urpN linux-2.6/drivers/oprofile/oprof.c linux-2.6-patched/drivers/oprofile/oprof.c
> --- linux-2.6/drivers/oprofile/oprof.c	2009-06-24 17:21:12.000000000 +0200
> +++ linux-2.6-patched/drivers/oprofile/oprof.c	2009-06-24 17:18:28.000000000 +0200
> @@ -184,22 +184,26 @@ static int __init oprofile_init(void)
>  	int err;
>  
>  	err = oprofile_arch_init(&oprofile_ops);
> -
>  	if (err < 0 || timer) {
>  		printk(KERN_INFO "oprofile: using timer interrupt.\n");
> -		oprofile_timer_init(&oprofile_ops);
> +		err = oprofile_timer_init(&oprofile_ops);
> +		if (err)
> +			goto out_arch;
>  	}
> -
>  	err = oprofilefs_register();
>  	if (err)
> -		oprofile_arch_exit();
> +		goto out_arch;
> +	return 0;
>  
> +out_arch:
> +	oprofile_arch_exit();
>  	return err;
>  }
>  
>  
>  static void __exit oprofile_exit(void)
>  {
> +	oprofile_timer_exit();
>  	oprofilefs_unregister();
>  	oprofile_arch_exit();
>  }
> diff -urpN linux-2.6/drivers/oprofile/oprof.h linux-2.6-patched/drivers/oprofile/oprof.h
> --- linux-2.6/drivers/oprofile/oprof.h	2009-06-24 17:21:12.000000000 +0200
> +++ linux-2.6-patched/drivers/oprofile/oprof.h	2009-06-24 17:19:11.000000000 +0200
> @@ -32,7 +32,8 @@ struct super_block;
>  struct dentry;
>  
>  void oprofile_create_files(struct super_block *sb, struct dentry *root);
> -void oprofile_timer_init(struct oprofile_operations *ops);
> +int oprofile_timer_init(struct oprofile_operations *ops);
> +void oprofile_timer_exit(void);
>  
>  int oprofile_set_backtrace(unsigned long depth);
>  
> diff -urpN linux-2.6/drivers/oprofile/timer_int.c linux-2.6-patched/drivers/oprofile/timer_int.c
> --- linux-2.6/drivers/oprofile/timer_int.c	2009-06-24 17:21:12.000000000 +0200
> +++ linux-2.6-patched/drivers/oprofile/timer_int.c	2009-06-24 17:18:44.000000000 +0200
> @@ -13,34 +13,93 @@
>  #include <linux/oprofile.h>
>  #include <linux/profile.h>
>  #include <linux/init.h>
> +#include <linux/cpu.h>
> +#include <linux/hrtimer.h>
> +#include <asm/irq_regs.h>
>  #include <asm/ptrace.h>
>  
>  #include "oprof.h"
>  
> -static int timer_notify(struct pt_regs *regs)
> +static DEFINE_PER_CPU(struct hrtimer, oprofile_hrtimer);
> +
> +static enum hrtimer_restart oprofile_hrtimer_notify(struct hrtimer *hrtimer)
> +{
> +	oprofile_add_sample(get_irq_regs(), 0);
> +	hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC));
> +	return HRTIMER_RESTART;
> +}
> +
> +static void __oprofile_hrtimer_start(void *unused)
> +{
> +	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
> +
> +	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	hrtimer->function = oprofile_hrtimer_notify;
> +
> +	hrtimer_start(hrtimer, ns_to_ktime(TICK_NSEC),
> +		      HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static int oprofile_hrtimer_start(void)
>  {
> -	oprofile_add_sample(regs, 0);
> +	on_each_cpu(__oprofile_hrtimer_start, NULL, 1);
>  	return 0;
>  }
>  
> -static int timer_start(void)
> +static void __oprofile_hrtimer_stop(int cpu)
>  {
> -	return register_timer_hook(timer_notify);
> +	struct hrtimer *hrtimer = &per_cpu(oprofile_hrtimer, cpu);
> +
> +	hrtimer_cancel(hrtimer);
>  }
>  
> +static void oprofile_hrtimer_stop(void)
> +{
> +	int cpu;
> +	for_each_online_cpu(cpu)
> +		__oprofile_hrtimer_stop(cpu);
> +}
>  
> -static void timer_stop(void)
> +static int __cpuinit oprofile_cpu_notify(struct notifier_block *self,
> +					 unsigned long action, void *hcpu)
>  {
> -	unregister_timer_hook(timer_notify);
> +	long cpu = (long) hcpu;
> +
> +	switch (action) {
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +		smp_call_function_single(cpu, __oprofile_hrtimer_start,
> +					 NULL, 1);
> +		break;
> +	case CPU_DEAD:
> +	case CPU_DEAD_FROZEN:
> +		__oprofile_hrtimer_stop(cpu);
> +		break;
> +	}
> +	return NOTIFY_OK;
>  }
>  
> +static struct notifier_block __refdata oprofile_cpu_notifier = {
> +	.notifier_call = oprofile_cpu_notify,
> +};
>  
> -void __init oprofile_timer_init(struct oprofile_operations *ops)
> +int __init oprofile_timer_init(struct oprofile_operations *ops)
>  {
> +	int rc;
> +
> +        rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
> +        if (rc)
> +		return rc;
>  	ops->create_files = NULL;
>  	ops->setup = NULL;
>  	ops->shutdown = NULL;
> -	ops->start = timer_start;
> -	ops->stop = timer_stop;
> +	ops->start = oprofile_hrtimer_start;
> +	ops->stop = oprofile_hrtimer_stop;
>  	ops->cpu_type = "timer";
> +	return 0;
> +}
> +
> +void __exit oprofile_timer_exit(void)
> +{
> +	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
>  }
> 
> ---
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2010-03-02 13:57                     ` Aaro Koskinen
@ 2010-03-02 15:01                       ` Martin Schwidefsky
  2010-03-02 15:08                         ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Schwidefsky @ 2010-03-02 15:01 UTC (permalink / raw)
  To: Aaro Koskinen
  Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, Rob van der Heij,
	Heiko Carstens, john stultz, Andi Kleen, Peter Zijlstra,
	robert.richter

On Tue, 02 Mar 2010 15:57:22 +0200
Aaro Koskinen <aaro.koskinen@nokia.com> wrote:

> Hi,
> 
> Martin Schwidefsky wrote:
> > First version of the hrtimer patch for oprofile. I did not add the
> > sysctl yet, if the sysctl is added in oprofile_timer_init it would not
> > be available if some better profiling source is available. If it is
> > added unconditionally it would only have an effect if the timer
> > fallback is used. Both cases are not exactly nice for a user space
> > interface.
> 
> I wonder what happened to this patch? Some platforms would need
> this fix (i.e. the timer mode has to be used due to HW issues).

After SH removed their oprofile hook there is nothing left that would
prevent us from converting oprofile to hrtimer. Just a matter of
reminding me and have me try again ;-)

Patch against current git. Ingo, Thomas: one for the timer tree I guess.

--
Subject: [PATCH] convert oprofile from timer_hook to hrtimer

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Oprofile is currently broken on systems running with NOHZ enabled.
A maximum of 1 tick is accounted via the timer_hook if a cpu sleeps
for a longer period of time. This does bad things to the percentages
in the profiler output. To solve this problem convert oprofile to
use a restarting hrtimer instead of the timer_hook.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 drivers/oprofile/oprof.c     |   12 ++++--
 drivers/oprofile/oprof.h     |    3 +
 drivers/oprofile/timer_int.c |   78 ++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 79 insertions(+), 14 deletions(-)

diff -urpN linux-2.6/drivers/oprofile/oprof.c linux-2.6-patched/drivers/oprofile/oprof.c
--- linux-2.6/drivers/oprofile/oprof.c	2010-02-24 19:52:17.000000000 +0100
+++ linux-2.6-patched/drivers/oprofile/oprof.c	2010-03-02 15:57:02.000000000 +0100
@@ -253,22 +253,26 @@ static int __init oprofile_init(void)
 	int err;
 
 	err = oprofile_arch_init(&oprofile_ops);
-
 	if (err < 0 || timer) {
 		printk(KERN_INFO "oprofile: using timer interrupt.\n");
-		oprofile_timer_init(&oprofile_ops);
+		err = oprofile_timer_init(&oprofile_ops);
+		if (err)
+			goto out_arch;
 	}
-
 	err = oprofilefs_register();
 	if (err)
-		oprofile_arch_exit();
+		goto out_arch;
+	return 0;
 
+out_arch:
+	oprofile_arch_exit();
 	return err;
 }
 
 
 static void __exit oprofile_exit(void)
 {
+	oprofile_timer_exit();
 	oprofilefs_unregister();
 	oprofile_arch_exit();
 }
diff -urpN linux-2.6/drivers/oprofile/oprof.h linux-2.6-patched/drivers/oprofile/oprof.h
--- linux-2.6/drivers/oprofile/oprof.h	2010-02-24 19:52:17.000000000 +0100
+++ linux-2.6-patched/drivers/oprofile/oprof.h	2010-03-02 15:57:02.000000000 +0100
@@ -34,7 +34,8 @@ struct super_block;
 struct dentry;
 
 void oprofile_create_files(struct super_block *sb, struct dentry *root);
-void oprofile_timer_init(struct oprofile_operations *ops);
+int oprofile_timer_init(struct oprofile_operations *ops);
+void oprofile_timer_exit(void);
 
 int oprofile_set_backtrace(unsigned long depth);
 int oprofile_set_timeout(unsigned long time);
diff -urpN linux-2.6/drivers/oprofile/timer_int.c linux-2.6-patched/drivers/oprofile/timer_int.c
--- linux-2.6/drivers/oprofile/timer_int.c	2010-02-24 19:52:17.000000000 +0100
+++ linux-2.6-patched/drivers/oprofile/timer_int.c	2010-03-02 15:57:02.000000000 +0100
@@ -13,34 +13,94 @@
 #include <linux/oprofile.h>
 #include <linux/profile.h>
 #include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/hrtimer.h>
+#include <asm/irq_regs.h>
 #include <asm/ptrace.h>
 
 #include "oprof.h"
 
-static int timer_notify(struct pt_regs *regs)
+static DEFINE_PER_CPU(struct hrtimer, oprofile_hrtimer);
+
+static enum hrtimer_restart oprofile_hrtimer_notify(struct hrtimer *hrtimer)
+{
+	oprofile_add_sample(get_irq_regs(), 0);
+	hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC));
+	return HRTIMER_RESTART;
+}
+
+static void __oprofile_hrtimer_start(void *unused)
+{
+	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
+
+	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer->function = oprofile_hrtimer_notify;
+
+	hrtimer_start(hrtimer, ns_to_ktime(TICK_NSEC),
+		      HRTIMER_MODE_REL_PINNED);
+}
+
+static int oprofile_hrtimer_start(void)
 {
-	oprofile_add_sample(regs, 0);
+	on_each_cpu(__oprofile_hrtimer_start, NULL, 1);
 	return 0;
 }
 
-static int timer_start(void)
+static void __oprofile_hrtimer_stop(int cpu)
 {
-	return register_timer_hook(timer_notify);
+	struct hrtimer *hrtimer = &per_cpu(oprofile_hrtimer, cpu);
+
+	hrtimer_cancel(hrtimer);
 }
 
+static void oprofile_hrtimer_stop(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		__oprofile_hrtimer_stop(cpu);
+}
 
-static void timer_stop(void)
+static int __cpuinit oprofile_cpu_notify(struct notifier_block *self,
+					 unsigned long action, void *hcpu)
 {
-	unregister_timer_hook(timer_notify);
+	long cpu = (long) hcpu;
+
+	switch (action) {
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+		smp_call_function_single(cpu, __oprofile_hrtimer_start,
+					 NULL, 1);
+		break;
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		__oprofile_hrtimer_stop(cpu);
+		break;
+	}
+	return NOTIFY_OK;
 }
 
+static struct notifier_block __refdata oprofile_cpu_notifier = {
+	.notifier_call = oprofile_cpu_notify,
+};
 
-void __init oprofile_timer_init(struct oprofile_operations *ops)
+int __init oprofile_timer_init(struct oprofile_operations *ops)
 {
+	int rc;
+
+	rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
+	if (rc)
+		return rc;
 	ops->create_files = NULL;
 	ops->setup = NULL;
 	ops->shutdown = NULL;
-	ops->start = timer_start;
-	ops->stop = timer_stop;
+	ops->start = oprofile_hrtimer_start;
+	ops->stop = oprofile_hrtimer_stop;
 	ops->cpu_type = "timer";
+	return 0;
+}
+
+void __exit oprofile_timer_exit(void)
+{
+	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
 }

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2010-03-02 15:01                       ` Martin Schwidefsky
@ 2010-03-02 15:08                         ` Thomas Gleixner
  2010-03-02 15:25                           ` Martin Schwidefsky
  2010-03-02 15:38                           ` Robert Richter
  0 siblings, 2 replies; 21+ messages in thread
From: Thomas Gleixner @ 2010-03-02 15:08 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Aaro Koskinen, Ingo Molnar, linux-kernel, Rob van der Heij,
	Heiko Carstens, john stultz, Andi Kleen, Peter Zijlstra,
	robert.richter

On Tue, 2 Mar 2010, Martin Schwidefsky wrote:

> On Tue, 02 Mar 2010 15:57:22 +0200
> Aaro Koskinen <aaro.koskinen@nokia.com> wrote:
> 
> > Hi,
> > 
> > Martin Schwidefsky wrote:
> > > First version of the hrtimer patch for oprofile. I did not add the
> > > sysctl yet, if the sysctl is added in oprofile_timer_init it would not
> > > be available if some better profiling source is available. If it is
> > > added unconditionally it would only have an effect if the timer
> > > fallback is used. Both cases are not exactly nice for a user space
> > > interface.
> > 
> > I wonder what happened to this patch? Some platforms would need
> > this fix (i.e. the timer mode has to be used due to HW issues).
> 
> After SH removed their oprofile hook there is nothing left that would
> prevent us from converting oprofile to hrtimer. Just a matter of
> reminding me and have me try again ;-)
> 
> Patch against current git. Ingo, Thomas: one for the timer tree I guess.

No objections from my side, but shouldn't that go via Robert ?

Thanks,

	tglx

> --
> Subject: [PATCH] convert oprofile from timer_hook to hrtimer
> 
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> Oprofile is currently broken on systems running with NOHZ enabled.
> A maximum of 1 tick is accounted via the timer_hook if a cpu sleeps
> for a longer period of time. This does bad things to the percentages
> in the profiler output. To solve this problem convert oprofile to
> use a restarting hrtimer instead of the timer_hook.
> 
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
> 
>  drivers/oprofile/oprof.c     |   12 ++++--
>  drivers/oprofile/oprof.h     |    3 +
>  drivers/oprofile/timer_int.c |   78 ++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 79 insertions(+), 14 deletions(-)
> 
> diff -urpN linux-2.6/drivers/oprofile/oprof.c linux-2.6-patched/drivers/oprofile/oprof.c
> --- linux-2.6/drivers/oprofile/oprof.c	2010-02-24 19:52:17.000000000 +0100
> +++ linux-2.6-patched/drivers/oprofile/oprof.c	2010-03-02 15:57:02.000000000 +0100
> @@ -253,22 +253,26 @@ static int __init oprofile_init(void)
>  	int err;
>  
>  	err = oprofile_arch_init(&oprofile_ops);
> -
>  	if (err < 0 || timer) {
>  		printk(KERN_INFO "oprofile: using timer interrupt.\n");
> -		oprofile_timer_init(&oprofile_ops);
> +		err = oprofile_timer_init(&oprofile_ops);
> +		if (err)
> +			goto out_arch;
>  	}
> -
>  	err = oprofilefs_register();
>  	if (err)
> -		oprofile_arch_exit();
> +		goto out_arch;
> +	return 0;
>  
> +out_arch:
> +	oprofile_arch_exit();
>  	return err;
>  }
>  
>  
>  static void __exit oprofile_exit(void)
>  {
> +	oprofile_timer_exit();
>  	oprofilefs_unregister();
>  	oprofile_arch_exit();
>  }
> diff -urpN linux-2.6/drivers/oprofile/oprof.h linux-2.6-patched/drivers/oprofile/oprof.h
> --- linux-2.6/drivers/oprofile/oprof.h	2010-02-24 19:52:17.000000000 +0100
> +++ linux-2.6-patched/drivers/oprofile/oprof.h	2010-03-02 15:57:02.000000000 +0100
> @@ -34,7 +34,8 @@ struct super_block;
>  struct dentry;
>  
>  void oprofile_create_files(struct super_block *sb, struct dentry *root);
> -void oprofile_timer_init(struct oprofile_operations *ops);
> +int oprofile_timer_init(struct oprofile_operations *ops);
> +void oprofile_timer_exit(void);
>  
>  int oprofile_set_backtrace(unsigned long depth);
>  int oprofile_set_timeout(unsigned long time);
> diff -urpN linux-2.6/drivers/oprofile/timer_int.c linux-2.6-patched/drivers/oprofile/timer_int.c
> --- linux-2.6/drivers/oprofile/timer_int.c	2010-02-24 19:52:17.000000000 +0100
> +++ linux-2.6-patched/drivers/oprofile/timer_int.c	2010-03-02 15:57:02.000000000 +0100
> @@ -13,34 +13,94 @@
>  #include <linux/oprofile.h>
>  #include <linux/profile.h>
>  #include <linux/init.h>
> +#include <linux/cpu.h>
> +#include <linux/hrtimer.h>
> +#include <asm/irq_regs.h>
>  #include <asm/ptrace.h>
>  
>  #include "oprof.h"
>  
> -static int timer_notify(struct pt_regs *regs)
> +static DEFINE_PER_CPU(struct hrtimer, oprofile_hrtimer);
> +
> +static enum hrtimer_restart oprofile_hrtimer_notify(struct hrtimer *hrtimer)
> +{
> +	oprofile_add_sample(get_irq_regs(), 0);
> +	hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC));
> +	return HRTIMER_RESTART;
> +}
> +
> +static void __oprofile_hrtimer_start(void *unused)
> +{
> +	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
> +
> +	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	hrtimer->function = oprofile_hrtimer_notify;
> +
> +	hrtimer_start(hrtimer, ns_to_ktime(TICK_NSEC),
> +		      HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static int oprofile_hrtimer_start(void)
>  {
> -	oprofile_add_sample(regs, 0);
> +	on_each_cpu(__oprofile_hrtimer_start, NULL, 1);
>  	return 0;
>  }
>  
> -static int timer_start(void)
> +static void __oprofile_hrtimer_stop(int cpu)
>  {
> -	return register_timer_hook(timer_notify);
> +	struct hrtimer *hrtimer = &per_cpu(oprofile_hrtimer, cpu);
> +
> +	hrtimer_cancel(hrtimer);
>  }
>  
> +static void oprofile_hrtimer_stop(void)
> +{
> +	int cpu;
> +
> +	for_each_online_cpu(cpu)
> +		__oprofile_hrtimer_stop(cpu);
> +}
>  
> -static void timer_stop(void)
> +static int __cpuinit oprofile_cpu_notify(struct notifier_block *self,
> +					 unsigned long action, void *hcpu)
>  {
> -	unregister_timer_hook(timer_notify);
> +	long cpu = (long) hcpu;
> +
> +	switch (action) {
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +		smp_call_function_single(cpu, __oprofile_hrtimer_start,
> +					 NULL, 1);
> +		break;
> +	case CPU_DEAD:
> +	case CPU_DEAD_FROZEN:
> +		__oprofile_hrtimer_stop(cpu);
> +		break;
> +	}
> +	return NOTIFY_OK;
>  }
>  
> +static struct notifier_block __refdata oprofile_cpu_notifier = {
> +	.notifier_call = oprofile_cpu_notify,
> +};
>  
> -void __init oprofile_timer_init(struct oprofile_operations *ops)
> +int __init oprofile_timer_init(struct oprofile_operations *ops)
>  {
> +	int rc;
> +
> +	rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
> +	if (rc)
> +		return rc;
>  	ops->create_files = NULL;
>  	ops->setup = NULL;
>  	ops->shutdown = NULL;
> -	ops->start = timer_start;
> -	ops->stop = timer_stop;
> +	ops->start = oprofile_hrtimer_start;
> +	ops->stop = oprofile_hrtimer_stop;
>  	ops->cpu_type = "timer";
> +	return 0;
> +}
> +
> +void __exit oprofile_timer_exit(void)
> +{
> +	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
>  }
> 
> -- 
> blue skies,
>    Martin.
> 
> "Reality continues to ruin my life." - Calvin.
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2010-03-02 15:08                         ` Thomas Gleixner
@ 2010-03-02 15:25                           ` Martin Schwidefsky
  2010-03-02 15:38                           ` Robert Richter
  1 sibling, 0 replies; 21+ messages in thread
From: Martin Schwidefsky @ 2010-03-02 15:25 UTC (permalink / raw)
  To: Thomas Gleixner, Robert Richter
  Cc: Aaro Koskinen, Ingo Molnar, linux-kernel, Rob van der Heij,
	Heiko Carstens, john stultz, Andi Kleen, Peter Zijlstra,
	robert.richter

On Tue, 2 Mar 2010 16:08:09 +0100 (CET)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 2 Mar 2010, Martin Schwidefsky wrote:
> 
> > On Tue, 02 Mar 2010 15:57:22 +0200
> > Aaro Koskinen <aaro.koskinen@nokia.com> wrote:
> > 
> > > Hi,
> > > 
> > > Martin Schwidefsky wrote:
> > > > First version of the hrtimer patch for oprofile. I did not add the
> > > > sysctl yet, if the sysctl is added in oprofile_timer_init it would not
> > > > be available if some better profiling source is available. If it is
> > > > added unconditionally it would only have an effect if the timer
> > > > fallback is used. Both cases are not exactly nice for a user space
> > > > interface.
> > > 
> > > I wonder what happened to this patch? Some platforms would need
> > > this fix (i.e. the timer mode has to be used due to HW issues).
> > 
> > After SH removed their oprofile hook there is nothing left that would
> > prevent us from converting oprofile to hrtimer. Just a matter of
> > reminding me and have me try again ;-)
> > 
> > Patch against current git. Ingo, Thomas: one for the timer tree I guess.
> 
> No objections from my side, but shouldn't that go via Robert ?

Agreed, this should be handled by the oprofile maintainer.
Robert, could you take care of the patch?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 0/2] NOHZ vs. profile/oprofile v2
  2010-03-02 15:08                         ` Thomas Gleixner
  2010-03-02 15:25                           ` Martin Schwidefsky
@ 2010-03-02 15:38                           ` Robert Richter
  1 sibling, 0 replies; 21+ messages in thread
From: Robert Richter @ 2010-03-02 15:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Martin Schwidefsky, Aaro Koskinen, Ingo Molnar, linux-kernel,
	Rob van der Heij, Heiko Carstens, john stultz, Andi Kleen,
	Peter Zijlstra

On 02.03.10 16:08:09, Thomas Gleixner wrote:
> > Patch against current git. Ingo, Thomas: one for the timer tree I guess.
> 
> No objections from my side, but shouldn't that go via Robert ?

I will send the patch upstream.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-03-02 15:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-03 15:22 [patch 0/2] NOHZ vs. profile/oprofile v2 Martin Schwidefsky
2009-06-03 15:22 ` [patch 1/2] idle profile hits with NOHZ Martin Schwidefsky
2009-06-03 15:22 ` [patch 2/2] keep on ticking if oprofile is active Martin Schwidefsky
2009-06-09 20:52 ` [patch 0/2] NOHZ vs. profile/oprofile v2 Thomas Gleixner
2009-06-10 17:33   ` Martin Schwidefsky
2009-06-22 14:26   ` Martin Schwidefsky
2009-06-22 14:41     ` Ingo Molnar
2009-06-22 14:59       ` Martin Schwidefsky
2009-06-22 15:05         ` Ingo Molnar
2009-06-22 15:18           ` Martin Schwidefsky
2009-06-22 15:29             ` Ingo Molnar
2009-06-22 15:36               ` Martin Schwidefsky
2009-06-22 15:40                 ` Ingo Molnar
2009-06-22 16:37                   ` Martin Schwidefsky
2009-06-24 16:51                   ` Martin Schwidefsky
2009-06-24 17:47                     ` Ingo Molnar
2010-03-02 13:57                     ` Aaro Koskinen
2010-03-02 15:01                       ` Martin Schwidefsky
2010-03-02 15:08                         ` Thomas Gleixner
2010-03-02 15:25                           ` Martin Schwidefsky
2010-03-02 15:38                           ` Robert Richter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.