linux-kernel.vger.kernel.org archive mirror
* [PATCH] Dynamic tick, version 050127-1
@ 2005-01-27 21:29 Tony Lindgren
  2005-01-27 21:50 ` Tony Lindgren
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-01-27 21:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Pavel Machek, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, Lee Revell
  Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2392 bytes --]

Hi all,

Thanks for all the comments, here's an updated version of the dynamic
tick patch.

I've fixed a couple of things:

- Dyn-tick now supports the local APIC timer. This allows longer sleep times
  in between ticks: over 1000 ticks, compared to 54 ticks with the PIT timer.
  It seems to stop timers on SMP too, but I've only briefly played with
  it on SMP.

- Fixed a stupid bug where next_timer_interrupt() was called, but
  jiffies was not subtracted from the value. This caused the sleep to
  always be the maximum available...

- CONFIG_HPET_TIMER is still not supported, but now the dyn-tick should
  automatically get disabled if HPET timer is detected.

- Now the processor _should_ stay idle for the duration of the skipped
  ticks, as the PIT and local APIC timers are disabled. I haven't
  verified this though.
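
For reference, the tick limits above follow from the counter widths. Here
is a minimal userspace sketch of the arithmetic, not the kernel code
itself. It assumes HZ=1000 and a PIT input clock of 1193182 Hz. The
function names are hypothetical, and so is the example local APIC count
of 12529 per tick mentioned below, which is picked only because it yields
the 1339 limit reported later in this thread. The 24-bit width is the one
the patch assumes for the local APIC limit.

```c
#include <stdint.h>

/* PIT input clock in Hz (assumed; i386 calls this CLOCK_TICK_RATE). */
#define PIT_CLOCK_HZ 1193182u

/* PIT counts per jiffy, rounded to nearest, like the kernel's LATCH. */
static unsigned int pit_latch(unsigned int hz)
{
	return (PIT_CLOCK_HZ + hz / 2) / hz;
}

/* Most ticks a 16-bit PIT counter can cover in one programming. */
static unsigned int pit_max_skip(unsigned int hz)
{
	return 0xffffu / pit_latch(hz);
}

/* Most ticks a counter treated as 24 bits wide can cover. */
static unsigned int apic_max_skip(uint32_t counts_per_tick)
{
	return counts_per_tick ? 0xffffffu / counts_per_tick : 0;
}

/*
 * Ticks to skip: distance from "now" to the next timer, clamped to the
 * hardware limit. The subtraction is the step that was missing before.
 */
static unsigned long ticks_to_skip(unsigned long next_timer,
				   unsigned long now,
				   unsigned int max_skip)
{
	unsigned long skip = next_timer - now;

	return skip > max_skip ? max_skip : skip;
}
```

With HZ=1000, pit_max_skip(1000) gives 54, and apic_max_skip(12529) gives
1339 for the hypothetical count above.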

I haven't fixed some things, such as the separation of the idle loop
into its own module, the generic part not really doing much of
anything, etc.

Please note that this patch alone probably does not produce any
considerable power savings. More work is needed in the PM area to take
advantage of the savings. Some examples of the work needed are:

- There are lots of polling timers in use in Linux, such as in the
  keyboard driver, that keep the skipped-tick intervals very short. Many
  of these timers could be improved.

- There's currently no way to specify what kind of idle to use based on
  the estimated length of the sleep. For example, if the system supports
  ACPI C3 state, it should be possible to automatically enter C3 if the
  skippable jiffies are long enough. I believe the current ACPI idle
  loop bases promotion/demotion on the number of idle loops run in a
  certain time, which does not work when skipping ticks.
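
To make the second point concrete, here is a minimal sketch of picking an
idle state from the expected sleep length. It is only an illustration
under stated assumptions: the struct, the function name and the threshold
values are all hypothetical, and nothing like this exists in the patch.

```c
/* Idle states, shallowest first; the names mirror ACPI C1..C3. */
enum idle_state { IDLE_C1, IDLE_C2, IDLE_C3 };

/* Hypothetical policy: which states exist and when they pay off. */
struct idle_policy {
	int has_c2;
	int has_c3;
	unsigned long c2_min_ticks;	/* shortest sleep worth entering C2 */
	unsigned long c3_min_ticks;	/* shortest sleep worth entering C3 */
};

/* Example thresholds, made up for illustration only. */
static const struct idle_policy example_policy = {
	.has_c2 = 1,
	.has_c3 = 1,
	.c2_min_ticks = 3,
	.c3_min_ticks = 20,
};

/* Pick the deepest available state whose threshold the sleep meets. */
static enum idle_state pick_idle_state(const struct idle_policy *p,
				       unsigned long skip_ticks)
{
	if (p->has_c3 && skip_ticks >= p->c3_min_ticks)
		return IDLE_C3;
	if (p->has_c2 && skip_ticks >= p->c2_min_ticks)
		return IDLE_C2;
	return IDLE_C1;
}
```

The point of the sketch is that the input is the number of skippable
ticks, which is known up front, rather than a count of idle loop
iterations.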

Please also note that this patch does not solve the high-resolution
timers issues. This patch is intended to be a minimal patch to expose
and improve PM related issues.

The main difference between the dyn-tick patch and the high-res VST
patch is that dyn-tick uses the next_timer_interrupt() function and
does not require the high-res timers patch to be installed. In the long
run the dyn-tick and VST patches will probably merge. But meanwhile,
PM related work can be done that benefits both patches.

Again, comments and testing are appreciated! If you have problems, please
provide the output of dmesg | grep -i "time\|tick\|apic".

Regards,

Tony

[-- Attachment #2: patch-dynamic-tick-2.6.11-rc2-050127-1 --]
[-- Type: text/plain, Size: 20607 bytes --]

diff -Nru a/arch/i386/Kconfig b/arch/i386/Kconfig
--- a/arch/i386/Kconfig	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/Kconfig	2005-01-27 13:10:04 -08:00
@@ -452,6 +452,16 @@
 	bool "Provide RTC interrupt"
 	depends on HPET_TIMER && RTC=y
 
+config NO_IDLE_HZ
+	bool "Dynamic Tick Timer - Skip timer ticks during idle"
+	help
+	  This option enables support for skipping timer ticks when the
+	  processor is idle. During system load, timer is continuous.
+	  This option saves power, as it allows the system to stay in
+	  idle mode longer. Currently supported timers are ACPI PM
+	  timer, local APIC timer, and TSC timer. HPET timer is currently
+	  not supported.
+
 config SMP
 	bool "Symmetric multi-processing support"
 	---help---
diff -Nru a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/kernel/apic.c	2005-01-27 13:10:04 -08:00
@@ -26,6 +26,7 @@
 #include <linux/mc146818rtc.h>
 #include <linux/kernel_stat.h>
 #include <linux/sysdev.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/atomic.h>
 #include <asm/smp.h>
@@ -796,8 +797,12 @@
 	if (!smp_found_config && detect_init_APIC()) {
 		apic_phys = (unsigned long) alloc_bootmem_pages(PAGE_SIZE);
 		apic_phys = __pa(apic_phys);
-	} else
+	} else {
 		apic_phys = mp_lapic_addr;
+#ifdef CONFIG_NO_IDLE_HZ
+		dyn_tick->state |= DYN_TICK_USE_APIC;
+#endif
+	}
 
 	set_fixmap_nocache(FIX_APIC_BASE, apic_phys);
 	printk(KERN_DEBUG "mapped APIC to %08lx (%08lx)\n", APIC_BASE,
@@ -910,6 +915,8 @@
 
 #define APIC_DIVISOR 16
 
+static u32 apic_timer_val;
+
 void __setup_APIC_LVTT(unsigned int clocks)
 {
 	unsigned int lvtt_value, tmp_value, ver;
@@ -928,7 +935,15 @@
 				& ~(APIC_TDR_DIV_1 | APIC_TDR_DIV_TMBASE))
 				| APIC_TDR_DIV_16);
 
-	apic_write_around(APIC_TMICT, clocks/APIC_DIVISOR);
+	apic_timer_val = clocks/APIC_DIVISOR;
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/* Local APIC timer is 24-bit */
+	if (apic_timer_val)
+		dyn_tick->max_skip = 0xffffff / apic_timer_val;
+#endif
+
+	apic_write_around(APIC_TMICT, apic_timer_val);
 }
 
 static void setup_APIC_timer(unsigned int clocks)
@@ -1071,6 +1086,18 @@
 	}
 }
 
+#ifdef CONFIG_NO_IDLE_HZ
+void reprogram_apic_timer(unsigned int count)
+{
+	unsigned long flags;
+
+	count *= apic_timer_val;
+	local_irq_save(flags);
+	apic_write_around(APIC_TMICT, count);
+	local_irq_restore(flags);
+}
+#endif
+
 /*
  * the frequency of the profiling timer can be changed
  * by writing a multiplier value into /proc/profile.
@@ -1163,6 +1190,7 @@
 
 fastcall void smp_apic_timer_interrupt(struct pt_regs *regs)
 {
+	unsigned long seq;
 	int cpu = smp_processor_id();
 
 	/*
@@ -1181,6 +1209,23 @@
 	 * interrupt lock, which is the WrongThing (tm) to do.
 	 */
 	irq_enter();
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/*
+	 * Check if we need to wake up PIT interrupt handler.
+	 * Otherwise just wake up local APIC timer.
+	 */
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+			if (dyn_tick->skip_cpu == cpu && dyn_tick->skip > DYN_TICK_MIN_SKIP)
+				dyn_tick->interrupt(0, NULL, regs);
+			else
+				reprogram_apic_timer(1);
+		}
+	} while (read_seqretry(&xtime_lock, seq));
+#endif
+
 	smp_local_timer_interrupt(regs);
 	irq_exit();
 }
diff -Nru a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
--- a/arch/i386/kernel/irq.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/kernel/irq.c	2005-01-27 13:10:04 -08:00
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/dyn-tick-timer.h>
 
 #ifndef CONFIG_X86_LOCAL_APIC
 /*
@@ -100,6 +101,11 @@
 	} else
 #endif
 		__do_IRQ(irq, regs);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING) && irq != 0)
+		dyn_tick->interrupt(irq, NULL, regs);
+#endif
 
 	irq_exit();
 
diff -Nru a/arch/i386/kernel/time.c b/arch/i386/kernel/time.c
--- a/arch/i386/kernel/time.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/kernel/time.c	2005-01-27 13:10:04 -08:00
@@ -46,6 +46,7 @@
 #include <linux/bcd.h>
 #include <linux/efi.h>
 #include <linux/mca.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/io.h>
 #include <asm/smp.h>
@@ -301,6 +302,55 @@
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_NO_IDLE_HZ
+static unsigned long long last_tick;
+void reprogram_pit_tick(int jiffies_to_skip);
+extern void reprogram_apic_timer(unsigned int count);
+extern void replace_timer_interrupt(void * new_handler);
+
+#ifdef DEBUG
+#define dbg_dyn_tick_irq() {if (skipped && skipped < dyn_tick->skip) \
+				printk("%li/%li ", skipped, dyn_tick->skip);}
+#else
+#define dbg_dyn_tick_irq() {}
+#endif
+
+
+
+/*
+ * This interrupt handler updates the time based on number of jiffies skipped.
+ * It would be somewhat more optimized to have a custom handler in each timer
+ * using hardware ticks instead of nanoseconds. Note that CONFIG_NO_IDLE_HZ
+ * currently disables timer fallback on skipped jiffies.
+ */
+irqreturn_t dyn_tick_timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
+{
+	unsigned long flags;
+	volatile unsigned long long now;
+	unsigned int skipped = 0;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	now = cur_timer->get_hw_time();
+	while (now - last_tick >= NS_TICK_LEN) {
+		last_tick += NS_TICK_LEN;
+		cur_timer->mark_offset();
+		do_timer_interrupt(irq, NULL, regs);
+		skipped++;
+	}
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+		dbg_dyn_tick_irq();
+		dyn_tick->skip = 1;
+		if (cpu_has_local_apic())
+			reprogram_apic_timer(dyn_tick->skip);
+		reprogram_pit_tick(dyn_tick->skip);
+		dyn_tick->state |= DYN_TICK_ENABLED;
+		dyn_tick->state &= ~DYN_TICK_SKIPPING;
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return IRQ_HANDLED;
+}
+#endif
+
 /* not static: needed by APM */
 unsigned long get_cmos_time(void)
 {
@@ -396,6 +446,72 @@
 }
 #endif
 
+#ifdef CONFIG_NO_IDLE_HZ
+static struct dyn_tick_timer arch_ltt;
+
+#if defined(CONFIG_X86_UP_APIC) || defined(CONFIG_SMP)
+void disable_pit_tick(void)
+{
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x31, PIT_MODE);		/* binary, mode 1, LSB/MSB, ch 0 */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+#endif
+
+/*
+ * Reprograms the next timer interrupt
+ * PIT timer reprogramming code taken from APM code.
+ * Note that the PIT is a 16-bit timer clocked at about 1.19 MHz, which
+ * allows a max skip of only about 55 ms.
+ */
+void reprogram_pit_tick(int jiffies_to_skip)
+{
+	int skip;
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+
+	skip = jiffies_to_skip * LATCH;
+	if (skip > 0xffff) {
+		skip = 0xffff;
+	}
+
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x34, PIT_MODE);		/* binary, mode 2, LSB/MSB, ch 0 */
+	outb_p(skip & 0xff, PIT_CH0);	/* LSB */
+	outb(skip >> 8, PIT_CH0);	/* MSB */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+
+static int __init dyn_tick_late_init(void)
+{
+	unsigned long flags;
+
+	if (!cur_timer->get_hw_time)
+		return -ENODEV;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	last_tick = cur_timer->get_hw_time();
+	dyn_tick->skip = 1;
+	if (!cpu_has_local_apic())
+		dyn_tick->max_skip = 0xffff/LATCH;	/* PIT timer length */
+	printk(KERN_INFO "dyn-tick: Maximum ticks to skip limited to %i\n",
+	       dyn_tick->max_skip);
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	if (cur_timer->late_init)
+		cur_timer->late_init();
+	dyn_tick->interrupt = dyn_tick_timer_interrupt;
+	replace_timer_interrupt(dyn_tick->interrupt);
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	dyn_tick->state |= DYN_TICK_ENABLED;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return 0;
+}
+#endif
+
 void __init time_init(void)
 {
 #ifdef CONFIG_HPET_TIMER
@@ -415,6 +531,16 @@
 
 	cur_timer = select_timer();
 	printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (strncmp(cur_timer->name, "tsc", 3) == 0 ||
+	    strncmp(cur_timer->name, "pmtmr", 5) == 0) {
+		arch_ltt.init = dyn_tick_late_init;
+		dyn_tick_register(&arch_ltt);
+	} else
+		printk(KERN_INFO "dyn-tick: Cannot use timer %s\n",
+		       cur_timer->name);
+#endif
 
 	time_init_hook();
 }
diff -Nru a/arch/i386/kernel/timers/timer_pm.c b/arch/i386/kernel/timers/timer_pm.c
--- a/arch/i386/kernel/timers/timer_pm.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/kernel/timers/timer_pm.c	2005-01-27 13:10:04 -08:00
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/init.h>
+#include <linux/dyn-tick-timer.h>
 #include <asm/types.h>
 #include <asm/timer.h>
 #include <asm/smp.h>
@@ -168,6 +169,7 @@
 	monotonic_base += delta * NSEC_PER_USEC;
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* convert to ticks */
 	delta += offset_delay;
 	lost = delta / (USEC_PER_SEC / HZ);
@@ -184,6 +186,7 @@
 		first_run = 0;
 		offset_delay = 0;
 	}
+#endif
 }
 
 
@@ -238,6 +241,25 @@
 	return (unsigned long) offset_delay + cyc2us(delta);
 }
 
+static unsigned long long ns_time;
+
+static unsigned long long get_hw_time_pmtmr(void)
+{
+	u32 now, delta;
+	static unsigned int last_cycles;
+	now = read_pmtmr();
+	delta = (now - last_cycles) & ACPI_PM_MASK;
+	last_cycles = now;
+	ns_time += cyc2us(delta) * NSEC_PER_USEC;
+	return ns_time;
+}
+
+static void late_init_pmtmr(void)
+{
+	ns_time = monotonic_clock_pmtmr();
+}
+
+extern irqreturn_t pmtmr_interrupt(int irq, void *dev_id, struct pt_regs *regs);
 
 /* acpi timer_opts struct */
 static struct timer_opts timer_pmtmr = {
@@ -245,7 +267,9 @@
 	.mark_offset		= mark_offset_pmtmr,
 	.get_offset		= get_offset_pmtmr,
 	.monotonic_clock 	= monotonic_clock_pmtmr,
+	.get_hw_time		= get_hw_time_pmtmr,
 	.delay 			= delay_pmtmr,
+	.late_init		= late_init_pmtmr,
 };
 
 struct init_timer_opts __initdata timer_pmtmr_init = {
diff -Nru a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
--- a/arch/i386/kernel/timers/timer_tsc.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/kernel/timers/timer_tsc.c	2005-01-27 13:10:04 -08:00
@@ -112,6 +112,15 @@
 	return delay_at_last_interrupt + edx;
 }
 
+static unsigned long long get_hw_time_tsc(void)
+{
+	register unsigned long eax, edx;
+
+	unsigned long long hw_time;
+	rdtscll(hw_time);
+	return cycles_2_ns(hw_time);
+}
+
 static unsigned long long monotonic_clock_tsc(void)
 {
 	unsigned long long last_offset, this_offset, base;
@@ -348,6 +357,7 @@
 
 	rdtsc(last_tsc_low, last_tsc_high);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	spin_lock(&i8253_lock);
 	outb_p(0x00, PIT_MODE);     /* latch the count ASAP */
 
@@ -415,14 +425,18 @@
 			cpufreq_delayed_get();
 	} else
 		lost_count = 0;
+#endif
+
 	/* update the monotonic base value */
 	this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low;
 	monotonic_base += cycles_2_ns(this_offset - last_offset);
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* calculate delay_at_last_interrupt */
 	count = ((LATCH-1) - count) * TICK_SIZE;
 	delay_at_last_interrupt = (count + LATCH/2) / LATCH;
+#endif
 
 	/* catch corner case where tick rollover occured
 	 * between tsc and pit reads (as noted when
@@ -551,6 +565,7 @@
 	.mark_offset = mark_offset_tsc, 
 	.get_offset = get_offset_tsc,
 	.monotonic_clock = monotonic_clock_tsc,
+	.get_hw_time = get_hw_time_tsc,
 	.delay = delay_tsc,
 };
 
diff -Nru a/arch/i386/mach-default/setup.c b/arch/i386/mach-default/setup.c
--- a/arch/i386/mach-default/setup.c	2005-01-27 13:10:04 -08:00
+++ b/arch/i386/mach-default/setup.c	2005-01-27 13:10:04 -08:00
@@ -85,6 +85,22 @@
 	setup_irq(0, &irq0);
 }
 
+/**
+ * replace_timer_interrupt - allow replacing timer interrupt handler
+ *
+ * Description:
+ *	Can be used to replace timer interrupt handler with a more optimized
+ *	handler. Used for enabling and disabling of CONFIG_NO_IDLE_HZ.
+ */
+void replace_timer_interrupt(void * new_handler)
+{
+	unsigned long flags;
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	irq0.handler = new_handler;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+}
+
 #ifdef CONFIG_MCA
 /**
  * mca_nmi_hook - hook into MCA specific NMI chain
diff -Nru a/include/asm-i386/timer.h b/include/asm-i386/timer.h
--- a/include/asm-i386/timer.h	2005-01-27 13:10:04 -08:00
+++ b/include/asm-i386/timer.h	2005-01-27 13:10:04 -08:00
@@ -1,6 +1,7 @@
 #ifndef _ASMi386_TIMER_H
 #define _ASMi386_TIMER_H
 #include <linux/init.h>
+#include <linux/interrupt.h>
 
 /**
  * struct timer_ops - used to define a timer source
@@ -21,7 +22,9 @@
 	void (*mark_offset)(void);
 	unsigned long (*get_offset)(void);
 	unsigned long long (*monotonic_clock)(void);
+	unsigned long long (*get_hw_time)(void);
 	void (*delay)(unsigned long);
+	void (*late_init)(void);
 };
 
 struct init_timer_opts {
diff -Nru a/include/linux/dyn-tick-timer.h b/include/linux/dyn-tick-timer.h
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/include/linux/dyn-tick-timer.h	2005-01-27 13:10:04 -08:00
@@ -0,0 +1,73 @@
+/*
+ * linux/include/linux/dyn-tick-timer.h
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/interrupt.h>
+
+#define DYN_TICK_USE_APIC	(1 << 2)
+#define DYN_TICK_SKIPPING	(1 << 1)
+#define DYN_TICK_ENABLED	(1 << 0)
+
+struct dyn_tick_state {
+	unsigned int state;		/* Current state */
+	int skip_cpu;			/* Skip handling processor */
+	unsigned long skip;		/* Ticks to skip */
+	unsigned int max_skip;		/* Max number of ticks to skip */
+	unsigned long irq_skip_mask;	/* Do not update time from these irqs */
+	irqreturn_t (*interrupt)(int, void *, struct pt_regs *);
+};
+
+/* REVISIT: Add functions to enable/disable dyn-tick on the fly */
+struct dyn_tick_timer {
+	int (*init) (void);
+};
+
+extern struct dyn_tick_state * dyn_tick;
+extern void dyn_tick_register(struct dyn_tick_timer * new_timer);
+
+#define NS_TICK_LEN		((1 * 1000000000)/HZ)
+#define DYN_TICK_MIN_SKIP	2
+
+#if defined(CONFIG_SMP)
+#define cpu_has_local_apic()	1
+#elif defined(CONFIG_X86_UP_APIC)
+#define cpu_has_local_apic()	(dyn_tick->state & DYN_TICK_USE_APIC)
+#else
+#define cpu_has_local_apic()	0
+#endif
+
+#ifdef CONFIG_NO_IDLE_HZ
+
+#if defined(CONFIG_X86) || defined(CONFIG_IA64) || defined(CONFIG_X86_64)
+#define arch_has_safe_halt()	1
+#endif
+
+#else
+
+#define arch_has_safe_halt()	0
+
+#endif
diff -Nru a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile	2005-01-27 13:10:04 -08:00
+++ b/kernel/Makefile	2005-01-27 13:10:04 -08:00
@@ -26,6 +26,7 @@
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
+obj-$(CONFIG_NO_IDLE_HZ) += dyn-tick-timer.o
 
 ifneq ($(CONFIG_IA64),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -Nru a/kernel/dyn-tick-timer.c b/kernel/dyn-tick-timer.c
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/kernel/dyn-tick-timer.c	2005-01-27 13:10:04 -08:00
@@ -0,0 +1,140 @@
+/*
+ * linux/kernel/dyn-tick-timer.c
+ *
+ * Beginnings of generic dynamic tick timer support
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ *
+ * TODO:
+ * - Add functions for enabling/disabling dyn-tick on the fly
+ * - Generalize to work with ARM sys_timer
+ */
+
+#include <linux/version.h>
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/cpumask.h>
+#include <linux/pm.h>
+#include <linux/dyn-tick-timer.h>
+#include <asm/io.h>
+
+#include "io_ports.h"
+
+#define VERSION	050127-1
+
+struct dyn_tick_state dyn_tick_state;
+struct dyn_tick_state * dyn_tick = &dyn_tick_state;
+struct dyn_tick_timer dyn_tick_timer;
+struct dyn_tick_timer * dyn_tick_cfg = &dyn_tick_timer;
+static void (*orig_idle) (void) = 0;
+extern void disable_pit_tick(void);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+extern void reprogram_apic_timer(unsigned int count);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+static cpumask_t dyn_cpu_map;
+
+/*
+ * We want to have all processors idle before reprogramming the next
+ * timer interrupt. Note that we must maintain the state for dynamic tick,
+ * otherwise the idle loop could be reprogramming the timer continuously
+ * further into the future, and the timer interrupt would never happen.
+ */
+static void dyn_tick_idle(void)
+{
+	int cpu;
+	unsigned long flags;
+
+	if (!(dyn_tick->state & DYN_TICK_ENABLED))
+		goto out;
+
+	/* Check if we are already skipping ticks and can idle other cpus */
+	if (dyn_tick->state & DYN_TICK_SKIPPING) {
+		reprogram_apic_timer(dyn_tick->skip);
+		goto out;
+	}
+
+	/* Check if we can start skipping ticks */
+	write_seqlock_irqsave(&xtime_lock, flags);
+	cpu = smp_processor_id();
+	cpu_set(cpu, dyn_cpu_map);
+	if (cpus_full(dyn_cpu_map)) {
+		dyn_tick->skip = next_timer_interrupt() - jiffies;
+		if (dyn_tick->skip > DYN_TICK_MIN_SKIP) {
+			if (dyn_tick->skip > dyn_tick->max_skip)
+				dyn_tick->skip = dyn_tick->max_skip;
+			if (cpu_has_local_apic()) {
+				disable_pit_tick();
+				reprogram_apic_timer(dyn_tick->skip);
+			} else
+				reprogram_pit_tick(dyn_tick->skip);
+			dyn_tick->skip_cpu = cpu;
+			dyn_tick->state |= DYN_TICK_SKIPPING;
+		}
+		cpus_clear(dyn_cpu_map);
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+out:
+	if (orig_idle)
+		orig_idle();
+	else if (arch_has_safe_halt())
+		safe_halt();
+}
+
+void __init dyn_tick_register(struct dyn_tick_timer * new_timer)
+{
+	dyn_tick_cfg->init = new_timer->init;
+	printk(KERN_INFO "dyn-tick: Registering dynamic tick timer\n");
+}
+
+/*
+ * We need to initialize dynamic tick after calibrate delay
+ */
+static int __init dyn_tick_init(void)
+{
+	int ret = 0;
+
+	if (dyn_tick_cfg->init == NULL)
+		return -ENODEV;
+
+	ret = dyn_tick_cfg->init();
+	if (ret != 0) {
+		printk(KERN_WARNING "dyn-tick: Init failed\n");
+		return -ENODEV;
+	}
+	orig_idle = pm_idle;
+	pm_idle = dyn_tick_idle;
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2,6,10))
+	cpu_idle_wait();
+#endif
+	printk(KERN_INFO "dyn-tick: Timer using dynamic tick\n");
+
+	return ret;
+}
+late_initcall(dyn_tick_init);


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-01-27 21:29 [PATCH] Dynamic tick, version 050127-1 Tony Lindgren
@ 2005-01-27 21:50 ` Tony Lindgren
  2005-02-01 11:00 ` Pavel Machek
  2005-02-01 20:20 ` Lee Revell
  2 siblings, 0 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-01-27 21:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Pavel Machek, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, Lee Revell
  Cc: linux-kernel

* Tony Lindgren <tony@atomide.com> [050127 13:34]:
> Hi all,
> 
> Thanks for all the comments, here's an updated version of the dynamic
> tick patch.

Oops, I guess I should test before posting :)

Looks like CONFIG_X86_LOCAL_APIC=y is currently needed on uniprocessor
machines to compile. Also, CONFIG_SMP=y makes the skipping fail
on a uniprocessor machine.

Regards,

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-01-27 21:29 [PATCH] Dynamic tick, version 050127-1 Tony Lindgren
  2005-01-27 21:50 ` Tony Lindgren
@ 2005-02-01 11:00 ` Pavel Machek
  2005-02-01 20:40   ` Tony Lindgren
  2005-02-01 20:20 ` Lee Revell
  2 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-01 11:00 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> Thanks for all the comments, here's an updated version of the dynamic
> tick patch.
> 
> I've fixed a couple of things:
> 
> - Dyn-tick now supports the local APIC timer. This allows longer sleep times
>   in between ticks: over 1000 ticks, compared to 54 ticks with the PIT timer.
>   It seems to stop timers on SMP too, but I've only briefly played with
>   it on SMP.

I used your config advice from the second mail, but it still does not work
as expected: the system gets "too sleepy". It takes a nap during boot
after "dyn-tick: Maximum ticks to skip limited to 1339", and a keypress is
needed to make it continue booting. Then the cursor stops blinking and the
machine hangs at random intervals during use; a keypress is enough to wake
it.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-01-27 21:29 [PATCH] Dynamic tick, version 050127-1 Tony Lindgren
  2005-01-27 21:50 ` Tony Lindgren
  2005-02-01 11:00 ` Pavel Machek
@ 2005-02-01 20:20 ` Lee Revell
  2005-02-01 23:42   ` Tony Lindgren
  2005-02-02  1:06   ` Eric St-Laurent
  2 siblings, 2 replies; 38+ messages in thread
From: Lee Revell @ 2005-02-01 20:20 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Pavel Machek, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, linux-kernel

On Thu, 2005-01-27 at 13:29 -0800, Tony Lindgren wrote:
> Hi all,
> 
> Thanks for all the comments, here's an updated version of the dynamic
> tick patch.

Hi,

I was wondering how Windows handles high res timers, if at all.  The
reason I ask is because I have been reverse engineering a Windows ASIO
driver, and I find that if the latency is set below about 5ms, by
examining the kernel timer queue with SoftICE I can see that several
kernel timers with 1ms period are created.  (Presumably the sound card's
interval timer is used for longer periods).

But, I have seen people mention in the "singing capacitor" threads on
this list that Windows uses 100 for HZ.

So, how do they implement 1ms timers with a 10ms tick rate?  Does
Windows dynamically reprogram the PIT as well?

Lee



* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 11:00 ` Pavel Machek
@ 2005-02-01 20:40   ` Tony Lindgren
  2005-02-01 21:25     ` Pavel Machek
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-01 20:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

* Pavel Machek <pavel@suse.cz> [050201 03:03]:
> Hi!
> 
> > Thanks for all the comments, here's an updated version of the dynamic
> > tick patch.
> > 
> > I've fixed a couple of things:
> > 
> > - Dyn-tick now supports the local APIC timer. This allows longer sleep times
> >   in between ticks: over 1000 ticks, compared to 54 ticks with the PIT timer.
> >   It seems to stop timers on SMP too, but I've only briefly played with
> >   it on SMP.
> 
> I used your config advice from the second mail, but it still does not work
> as expected: the system gets "too sleepy". It takes a nap during boot
> after "dyn-tick: Maximum ticks to skip limited to 1339", and a keypress is
> needed to make it continue booting. Then the cursor stops blinking and the
> machine hangs at random intervals during use; a keypress is enough to wake
> it.

Hmmm, that sounds like the local APIC does not wake up the PIT
interrupt properly after sleep. Hitting the keys causes the timer
interrupt to get called, and that explains why it keeps running. But
the timer ticks are not happening as they should for some reason.
This should not happen (tm)...

I've noticed that the only machine I have with ACPI C2/C3 support
does not do anything in the C2/C3 loops, it just spins around and
consumes more power than in C1 with hlt!

That's because we currently don't have any code to enable the C2/C3
states in the southbridges on many Athlon boards. It's the same
problem on my Crusoe laptop ALi 1533 chipset.

I think we should have some ACPI code that scans the southbridges,
and sets them up with C2/C3 enable functions that can be
enabled/disabled via /sys.

Does anybody happen to have documentation for the ALi 1533, 1535
or M7101 chipset, BTW? I'd like to know how to enable the C2/C3
on it.

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 20:40   ` Tony Lindgren
@ 2005-02-01 21:25     ` Pavel Machek
  2005-02-01 23:03       ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-01 21:25 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > I used your config advice from the second mail, but it still does not work
> > as expected: the system gets "too sleepy". It takes a nap during boot
> > after "dyn-tick: Maximum ticks to skip limited to 1339", and a keypress is
> > needed to make it continue booting. Then the cursor stops blinking and the
> > machine hangs at random intervals during use; a keypress is enough to wake
> > it.
> 
> Hmmm, that sounds like the local APIC does not wake up the PIT
> interrupt properly after sleep. Hitting the keys causes the timer
> interrupt to get called, and that explains why it keeps running. But
> the timer ticks are not happening as they should for some reason.
> This should not happen (tm)...

:-). Any ideas how to debug it? The previous version of the patch seemed to work better...

> I've noticed that the only machine I have with ACPI C2/C3 support
> does not do anything in the C2/C3 loops, it just spins around and
> consumes more power than in C1 with hlt!
> 
> That's because we currently don't have any code to enable the C2/C3
> states in the southbridges on many Athlon boards. It's the same
> problem on my Crusoe laptop ALi 1533 chipset.

I do not think we should need any chipset-specific code. ACPI
is expected to solve it... Can you ask on acpi-devel?

-- 
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms         



* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 21:25     ` Pavel Machek
@ 2005-02-01 23:03       ` Tony Lindgren
  2005-02-02 13:50         ` Pavel Machek
                           ` (3 more replies)
  0 siblings, 4 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-01 23:03 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

* Pavel Machek <pavel@suse.cz> [050201 13:50]:
> Hi!
> 
> > > I used your config advices from second mail, still it does not work as
> > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > needed to make it continue boot. Then cursor stops blinking and
> > > machine is hung at random intervals during use, key is enough to awake
> > > it.
> > 
> > Hmmm, that sounds like the local APIC does not wake up the PIT
> > interrupt properly after sleep. Hitting the keys causes the timer
> > interrupt to get called, and that explains why it keeps running. But
> > the timer ticks are not happening as they should for some reason.
> > This should not happen (tm)...
> 
> :-). Any ideas how to debug it? Previous version of patch seemed to work better...

I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
local APIC timer is working.
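
As a rough cross-check of that boot message, the "limited to 1339" figure
is consistent with a 24-bit local APIC counter, while the 16-bit PIT alone
would cap the skip at 54 ticks. The sketch below redoes that arithmetic in
plain C. HZ=1000, the PIT clock constant, and the apic_timer_val of 12525
used in the note after the code are assumptions for illustration, not
values read from the reporter's machine; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the patch: HZ=1000 and the usual
 * 1193182 Hz PIT input clock. */
#define HZ_ASSUMED    1000u
#define PIT_TICK_RATE 1193182u
#define PIT_LATCH     ((PIT_TICK_RATE + HZ_ASSUMED / 2) / HZ_ASSUMED)

/* Largest number of ticks a down-counter can cover when one tick
 * costs counts_per_tick counter units. */
uint32_t max_ticks_to_skip(uint32_t counter_max, uint32_t counts_per_tick)
{
	assert(counts_per_tick != 0);
	return counter_max / counts_per_tick;
}

/* 16-bit PIT counter */
uint32_t pit_max_skip(void)
{
	return max_ticks_to_skip(0xffffu, PIT_LATCH);
}

/* 24-bit local APIC counter, mirroring
 * dyn_tick->max_skip = 0xffffff / apic_timer_val in the patch */
uint32_t apic_max_skip(uint32_t apic_timer_val)
{
	return max_ticks_to_skip(0xffffffu, apic_timer_val);
}
```

With an assumed apic_timer_val of 12525, apic_max_skip(12525) gives 1339
and pit_max_skip() gives 54.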

If you have a serial console, you can put one letter printks in the
code. Can you check if you ever get to smp_apic_timer_interrupt()?
That's where you should get to after the sleep, and that calls the
PIT timer interrupt to get it going again. I'm thinking that you'll
get to smp_apic_timer_interrupt(), but once there the function
dyn_tick->interrupt(0, NULL, regs) never gets called.

It's OK to put printks to the timer code here, there's tons of 
output only when the system is busy :)

Also, can you post your .config again? And also please post output
from:

dmesg | grep -i "time\|tick\|apic"

> > I've noticed that the only machine I have with ACPI C2/C3 support
> > does not do anything in the C2/C3 loops, it just spins around and
> > consumes more power than in C1 with hlt!
> > 
> > That's because we currently don't have any code to enable the C2/C3
> > states in the southbridges on many Athlon boards. It's the same
> > problem on my Crusoe laptop ALi 1533 chipset.
> 
> I do not think we should need any chipset-specific code. ACPI
> is expected to solve it... Can you ask on acpi-devel?

Yeah, I've been meaning to, I just subscribed to it yesterday.

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 20:20 ` Lee Revell
@ 2005-02-01 23:42   ` Tony Lindgren
  2005-02-02  1:06   ` Eric St-Laurent
  1 sibling, 0 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-01 23:42 UTC (permalink / raw)
  To: Lee Revell
  Cc: Benjamin Herrenschmidt, Pavel Machek, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, linux-kernel

* Lee Revell <rlrevell@joe-job.com> [050201 12:20]:
> On Thu, 2005-01-27 at 13:29 -0800, Tony Lindgren wrote:
> > Hi all,
> > 
> > Thanks for all the comments, here's an updated version of the dynamic
> > tick patch.
> 
> Hi,
> 
> I was wondering how Windows handles high res timers, if at all.  The
> reason I ask is because I have been reverse engineering a Windows ASIO
> driver, and I find that if the latency is set below about 5ms, by
> examining the kernel timer queue with SoftICE I can see that several
> kernel timers with 1ms period are created.  (Presumably the sound card's
> interval timer is used for longer periods).
> 
> But, I have seen people mention in the "singing capacitor" threads on
> this list that Windows uses 100 for HZ.
> 
> So, how do they implement 1ms timers with a 10ms tick rate?  Does
> Windows dynamically reprogram the PIT as well?

No idea, but it would probably show up by adding some debug code
to an emulator like Bochs?

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 20:20 ` Lee Revell
  2005-02-01 23:42   ` Tony Lindgren
@ 2005-02-02  1:06   ` Eric St-Laurent
  1 sibling, 0 replies; 38+ messages in thread
From: Eric St-Laurent @ 2005-02-02  1:06 UTC (permalink / raw)
  To: Lee Revell
  Cc: Tony Lindgren, Benjamin Herrenschmidt, Pavel Machek,
	Arjan van de Ven, Martin Schwidefsky, Andrea Arcangeli,
	George Anzinger, Thomas Gleixner, john stultz, Zwane Mwaikambo,
	linux-kernel

On Tue, 2005-02-01 at 15:20 -0500, Lee Revell wrote:

> I was wondering how Windows handles high res timers, if at all.  The
> reason I ask is because I have been reverse engineering a Windows ASIO
> driver, and I find that if the latency is set below about 5ms, by

By default, Windows "multimedia" timers have 10ms resolution (this
depends on the exact version of Windows used...).  You can call the
timeBeginPeriod() function to lower the resolution to 1ms.

This resolution seems related to the task scheduler timeslice.  After you
call this function, the Sleep() call also has a resolution of 1ms
instead of 10ms.

I remember reading that the multimedia timers are implemented as a high
priority thread.

You can find more details on this site:

http://www.geisswerks.com/ryan/FAQS/timing.html

Best regards,

Eric St-Laurent




* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 23:03       ` Tony Lindgren
@ 2005-02-02 13:50         ` Pavel Machek
  2005-02-02 13:50         ` Pavel Machek
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-02 13:50 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1818 bytes --]

Hi!

> > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > interrupt properly after sleep. Hitting the keys causes the timer
> > > interrupt to get called, and that explains why it keeps running. But
> > > the timer ticks are not happening as they should for some reason.
> > > This should not happen (tm)...
> > 
> > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> 
> I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> local APIC timer is working.

I ran find /, now my machine seems to work... Except that the time is
now two times as fast as it should be, ouch.

> If you have a serial console, you can put one letter printks in the
> code. Can you check if you ever get to smp_apic_timer_interrupt()?
> That's where you should get to after the sleep, and that calls the
> PIT timer interrupt to get it going again. I'm thinking that you'll
> get to smp_apic_timer_interrupt(), but once there the function
> dyn_tick->interrupt(0, NULL, regs) never gets called.

Serial console would be slightly tricky to arrange...

Heh, is it possible that I'm not running NMI deadlock detector and
therefore it does not tick or something like that?

> It's OK to put printks to the timer code here, there's tons of 
> output only when the system is busy :)
> 
> Also, can you post your .config again? And also please post output
> from:
> 
> dmesg | grep -i "time\|tick\|apic"

pavel@amd:~$ dmesg | grep -i "time\|tick\|apic"
PCI: Setting latency timer of device 0000:00:11.5 to 64
dyn-tick: Maximum ticks to skip limited to 1339
dyn-tick: Timer using dynamic tick
pavel@amd:~$

Config is attached.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

[-- Attachment #2: config.gz --]
[-- Type: application/octet-stream, Size: 10193 bytes --]


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 23:03       ` Tony Lindgren
  2005-02-02 13:50         ` Pavel Machek
@ 2005-02-02 13:50         ` Pavel Machek
  2005-02-02 13:56         ` Pavel Machek
  2005-02-02 14:11         ` Pavel Machek
  3 siblings, 0 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-02 13:50 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > interrupt properly after sleep. Hitting the keys causes the timer
> > > interrupt to get called, and that explains why it keeps running. But
> > > the timer ticks are not happening as they should for some reason.
> > > This should not happen (tm)...
> > 
> > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> 
> I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> local APIC timer is working.
> 
> If you have a serial console, you can put one letter printks in the
> code. Can you check if you ever get to smp_apic_timer_interrupt()?
> That's where you should get to after the sleep, and that calls the
> PIT timer interrupt to get it going again. I'm thinking that you'll
> get to smp_apic_timer_interrupt(), but once there the function
> dyn_tick->interrupt(0, NULL, regs) never gets called.

I definitely get to smp_apic_timer_interrupt:

Feb  2 14:46:53 amd kernel:
ic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_ti
Feb  2 14:46:54 amd kernel:
irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsmp_apic_timer_irqsm


I'll test with this code:

                if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
                        if (dyn_tick->skip_cpu == cpu && dyn_tick->skip > DYN_TICK_MIN_SKIP) {
                                printk("dyn_tick->interrupt\n");
                                dyn_tick->interrupt(0, NULL, regs);
                        } else

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 23:03       ` Tony Lindgren
  2005-02-02 13:50         ` Pavel Machek
  2005-02-02 13:50         ` Pavel Machek
@ 2005-02-02 13:56         ` Pavel Machek
  2005-02-02 14:11         ` Pavel Machek
  3 siblings, 0 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-02 13:56 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> local APIC timer is working.
> 
> If you have a serial console, you can put one letter printks in the
> code. Can you check if you ever get to smp_apic_timer_interrupt()?
> That's where you should get to after the sleep, and that calls the
> PIT timer interrupt to get it going again. I'm thinking that you'll
> get to smp_apic_timer_interrupt(), but once there the function
> dyn_tick->interrupt(0, NULL, regs) never gets called.

dyn_tick->interrupt *is* being called:

Feb  2 14:53:41 amd last message repeated 36 times
Feb  2 14:53:41 amd postfix/postfix-script: starting the Postfix mail system
Feb  2 14:53:41 amd kernel: dyn_tick->interrupt
Feb  2 14:53:41 amd kernel: dyn_tick->interrupt
Feb  2 14:53:41 amd postfix/master[1301]: daemon started -- version 2.1.5
Feb  2 14:53:41 amd kernel: dyn_tick->interrupt
Feb  2 14:53:45 amd last message repeated 30 times
Feb  2 14:53:45 amd log1n[1220]: ROOT LOGIN on `tty8'
Feb  2 14:53:45 amd kernel: dyn_tick->interrupt
Feb  2 14:54:16 amd last message repeated 228 times

I'll try turning off CONFIG_PREEMPT...
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-01 23:03       ` Tony Lindgren
                           ` (2 preceding siblings ...)
  2005-02-02 13:56         ` Pavel Machek
@ 2005-02-02 14:11         ` Pavel Machek
  2005-02-03  3:04           ` Tony Lindgren
  3 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-02 14:11 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel


Hi!

> > > > I used your config advices from second mail, still it does not work as
> > > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > > needed to make it continue boot. Then cursor stops blinking and
> > > > machine is hung at random intervals during use, key is enough to awake
> > > > it.
> > > 
> > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > interrupt properly after sleep. Hitting the keys causes the timer
> > > interrupt to get called, and that explains why it keeps running. But
> > > the timer ticks are not happening as they should for some reason.
> > > This should not happen (tm)...
> > 
> > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> 
> I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> local APIC timer is working.

I turned off CONFIG_PREEMPT, but nothing changed :-(.
									Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-02 14:11         ` Pavel Machek
@ 2005-02-03  3:04           ` Tony Lindgren
  2005-02-03 10:56             ` Pavel Machek
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-03  3:04 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1356 bytes --]

* Pavel Machek <pavel@suse.cz> [050202 06:13]:
> 
> Hi!
> 
> > > > > I used your config advices from second mail, still it does not work as
> > > > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > > > needed to make it continue boot. Then cursor stops blinking and
> > > > > machine is hung at random intervals during use, key is enough to awake
> > > > > it.
> > > > 
> > > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > > interrupt properly after sleep. Hitting the keys causes the timer
> > > > interrupt to get called, and that explains why it keeps running. But
> > > > the timer ticks are not happening as they should for some reason.
> > > > This should not happen (tm)...
> > > 
> > > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> > 
> > I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> > local APIC timer is working.
> 
> I turned off CONFIG_PREEMPT, but nothing changed :-(.

What about reprogramming the timers in time.c after the sleep? Do
you get to the dyn_tick->skip = 1; part in dyn_tick_timer_interrupt?
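
For reference, the catch-up step that runs before that assignment can be
modelled in userspace roughly like this. It is only a sketch of the loop
in dyn_tick_timer_interrupt(): NS_TICK_LEN is assumed to be 1,000,000 ns
(HZ=1000), and struct tick_state and catch_up_ticks() are hypothetical
names standing in for the kernel's last_tick, jiffies and
do_timer_interrupt().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: HZ=1000, so one tick is 1,000,000 ns. The patch's own
 * NS_TICK_LEN definition is not shown in this part of the thread. */
#define NS_TICK_LEN 1000000ull

struct tick_state {
	uint64_t last_tick;    /* hardware time of the last accounted tick, ns */
	unsigned long jiffies; /* stand-in for the kernel tick counter */
};

/* Account for every whole tick between last_tick and now.
 * Returns the number of ticks that were caught up. */
unsigned int catch_up_ticks(struct tick_state *ts, uint64_t now)
{
	unsigned int skipped = 0;

	assert(now >= ts->last_tick);
	while (now - ts->last_tick >= NS_TICK_LEN) {
		ts->last_tick += NS_TICK_LEN;
		ts->jiffies++; /* the patch calls do_timer_interrupt() here */
		skipped++;
	}
	return skipped;
}
```

A wakeup 5.5 ms after the last accounted tick advances the counter by five
ticks and leaves the remaining 0.5 ms for the next interrupt to pick up.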

It could also be that the reprogramming of PIT timer does not work on
your machine. I chopped off the udelays there... Can you try
something like this:

[-- Attachment #2: patch-pit-udelay --]
[-- Type: text/plain, Size: 414 bytes --]

--- a/arch/i386/kernel/time.c	2005-01-27 12:58:04 -08:00
+++ b/arch/i386/kernel/time.c	2005-02-02 19:01:31 -08:00
@@ -479,8 +480,11 @@
 
 	spin_lock_irqsave(&i8253_lock, flags);
 	outb_p(0x34, PIT_MODE);		/* binary, mode 2, LSB/MSB, ch 0 */
+	udelay(10);
 	outb_p(skip & 0xff, PIT_CH0);	/* LSB */
+	udelay(10);
 	outb(skip >> 8, PIT_CH0);	/* MSB */
+	udelay(10);
 	spin_unlock_irqrestore(&i8253_lock, flags);
 }
 


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-03  3:04           ` Tony Lindgren
@ 2005-02-03 10:56             ` Pavel Machek
  2005-02-03 16:43               ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-03 10:56 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > > > > I used your config advices from second mail, still it does not work as
> > > > > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > > > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > > > > needed to make it continue boot. Then cursor stops blinking and
> > > > > > machine is hung at random intervals during use, key is enough to awake
> > > > > > it.
> > > > > 
> > > > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > > > interrupt properly after sleep. Hitting the keys causes the timer
> > > > > interrupt to get called, and that explains why it keeps running. But
> > > > > the timer ticks are not happening as they should for some reason.
> > > > > This should not happen (tm)...
> > > > 
> > > > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> > > 
> > > I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> > > local APIC timer is working.
> > 
> > I turned off CONFIG_PREEMPT, but nothing changed :-(.
> 
> What about reprogramming the timers in time.c after the sleep? Do
> you get to the dyn_tick->skip = 1; part in dyn_tick_timer_interrupt?

Yes, when I enabled debugging, dbg_dyn_tick_irq() was reached and
produced a lot of noise in syslog. After I had done nothing for a while,
the machine would just sit there and wait, not doing anything. When it
was hung, dbg_dyn_timer_tick was not reached.

> It could also be that the reprogramming of PIT timer does not work on
> your machine. I chopped off the udelays there... Can you try
> something like this:

I added the udelays, but behaviour did not change.

> --- a/arch/i386/kernel/time.c	2005-01-27 12:58:04 -08:00
> +++ b/arch/i386/kernel/time.c	2005-02-02 19:01:31 -08:00
> @@ -479,8 +480,11 @@
>  
>  	spin_lock_irqsave(&i8253_lock, flags);
>  	outb_p(0x34, PIT_MODE);		/* binary, mode 2, LSB/MSB, ch 0 */
> +	udelay(10);
>  	outb_p(skip & 0xff, PIT_CH0);	/* LSB */
> +	udelay(10);
>  	outb(skip >> 8, PIT_CH0);	/* MSB */
> +	udelay(10);
>  	spin_unlock_irqrestore(&i8253_lock, flags);
>  }
>  
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-03 10:56             ` Pavel Machek
@ 2005-02-03 16:43               ` Tony Lindgren
  2005-02-04  5:19                 ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-03 16:43 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

* Pavel Machek <pavel@suse.cz> [050203 02:57]:
> Hi!
> 
> > > > > > > I used your config advices from second mail, still it does not work as
> > > > > > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > > > > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > > > > > needed to make it continue boot. Then cursor stops blinking and
> > > > > > > machine is hung at random intervals during use, key is enough to awake
> > > > > > > it.
> > > > > > 
> > > > > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > > > > interrupt properly after sleep. Hitting the keys causes the timer
> > > > > > interrupt to get called, and that explains why it keeps running. But
> > > > > > the timer ticks are not happening as they should for some reason.
> > > > > > This should not happen (tm)...
> > > > > 
> > > > > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> > > > 
> > > > I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> > > > local APIC timer is working.
> > > 
> > > I turned off CONFIG_PREEMPT, but nothing changed :-(.
> > 
> > What about reprogramming the timers in time.c after the sleep? Do
> > you get to the dyn_tick->skip = 1; part in dyn_tick_timer_interrupt?
> 
> Yes, when I enabled debugging, dbg_dyn_tick_irq() was reached and
> produced lot of noise to syslog. After I done nothing for a while,
> machine would just sit there and wait, not doing anything. When it was
> hung, dbg_dyn_timer_tick was not reached.

OK. Function dbg_dyn_timer_tick only printks if the sleep was less
than expected and the system woke to a non-timer interrupt. But when
idling, it should still printk something occasionally.

> > It could also be that the reprogramming of PIT timer does not work on
> > your machine. I chopped off the udelays there... Can you try
> > something like this:
> 
> I added the udelays, but behaviour did not change.

Yeah, and if the first patch was working better, that means the PIT
interrupts work. I'll do another version of the patch where PIT
interrupts work again without local APIC needed, let's see what
happens with that.
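
When checking the PIT reprogramming path, it can help to decode the mode
byte by hand. The sketch below splits an 8253/8254 control word into its
fields. pit_decode_mode() and struct pit_mode_byte are hypothetical helper
names for this example, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the 8253/8254 control word written to the mode port. */
struct pit_mode_byte {
	unsigned int channel; /* bits 7-6: counter select */
	unsigned int access;  /* bits 5-4: 3 means LSB then MSB */
	unsigned int mode;    /* bits 3-1: raw operating mode bits */
	unsigned int bcd;     /* bit 0: 0 = binary, 1 = BCD */
};

struct pit_mode_byte pit_decode_mode(uint8_t v)
{
	struct pit_mode_byte m;

	m.channel = (v >> 6) & 0x3;
	m.access  = (v >> 4) & 0x3;
	m.mode    = (v >> 1) & 0x7;
	m.bcd     = v & 0x1;
	return m;
}
```

pit_decode_mode(0x34) yields channel 0, access 3 (LSB then MSB), mode 2,
binary, which matches the comment next to the outb_p(0x34, PIT_MODE) call.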

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-03 16:43               ` Tony Lindgren
@ 2005-02-04  5:19                 ` Tony Lindgren
  2005-02-04  6:33                   ` Zwane Mwaikambo
  2005-02-05 23:00                   ` Pavel Machek
  0 siblings, 2 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-04  5:19 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

* Tony Lindgren <tony@atomide.com> [050203 15:07]:
> * Pavel Machek <pavel@suse.cz> [050203 02:57]:
> > Hi!
> > 
> > > > > > > > I used your config advices from second mail, still it does not work as
> > > > > > > > expected: system gets "too sleepy". Like it takes a nap during boot
> > > > > > > > after "dyn-tick: Maximum ticks to skip limited to 1339", and key is
> > > > > > > > needed to make it continue boot. Then cursor stops blinking and
> > > > > > > > machine is hung at random intervals during use, key is enough to awake
> > > > > > > > it.
> > > > > > > 
> > > > > > > Hmmm, that sounds like the local APIC does not wake up the PIT
> > > > > > > interrupt properly after sleep. Hitting the keys causes the timer
> > > > > > > interrupt to get called, and that explains why it keeps running. But
> > > > > > > the timer ticks are not happening as they should for some reason.
> > > > > > > This should not happen (tm)...
> > > > > > 
> > > > > > :-). Any ideas how to debug it? Previous version of patch seemed to work better...
> > > > > 
> > > > > I don't think it's HPET timer, or CONFIG_SMP. It also looks like your
> > > > > local APIC timer is working.
> > > > 
> > > > I turned off CONFIG_PREEMPT, but nothing changed :-(.
> > > 
> > > What about reprogramming the timers in time.c after the sleep? Do
> > > you get to the dyn_tick->skip = 1; part in dyn_tick_timer_interrupt?
> > 
> > Yes, when I enabled debugging, dbg_dyn_tick_irq() was reached and
> > produced lot of noise to syslog. After I done nothing for a while,
> > machine would just sit there and wait, not doing anything. When it was
> > hung, dbg_dyn_timer_tick was not reached.
> 
> OK. Function dbg_dyn_timer_tick only printks if the sleep was less
> than expected and the system woke to a non-timer interrupt. But when
> idling, it should still printk something occasionally.
> 
> > > It could also be that the reprogramming of PIT timer does not work on
> > > your machine. I chopped off the udelays there... Can you try
> > > something like this:
> > 
> > I added the udelays, but behaviour did not change.
> 
> Yeah, and if the first patch was working better, that means the PIT
> interrupts work. I'll do another version of the patch where PIT
> interrupts work again without local APIC needed, let's see what
> happens with that.

I think something broke the TSC timer after the first patch, but I have
not yet figured out what. So the bad combo might be local APIC + TSC.
At least I'm seeing similar problems with local APIC + TSC timer.

Attached is a slightly improved patch, but the patch does not fix
the TSC problem. It just fixes compile without local APIC, and
booting SMP kernel on uniprocessor machine.

Currently the suggested combo is local APIC + ACPI PM timer...

And if that works, changing the I8042_POLL_PERIOD from HZ/20 in
drivers/input/serio/i8042.h to something like HZ increases the
sleep interval quite a bit. I think I had lots of polling also in
CONFIG_NETFILTER, but I haven't verified that.
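
To put numbers on that, here is a small sketch of how a polling period
bounds the idle sleep. It assumes HZ=1000 (HZ_ASSUMED and both helper
names are made up for this example), and it only models a single periodic
timer; any other pending timer would shorten the sleep further.

```c
#include <assert.h>

/* Assumption for illustration only. */
#define HZ_ASSUMED 1000u

/* Length of a polling period in milliseconds. */
unsigned int period_to_ms(unsigned int period_jiffies)
{
	return period_jiffies * 1000u / HZ_ASSUMED;
}

/* How often a timer with this period forces the CPU out of idle. */
unsigned int wakeups_per_second(unsigned int period_jiffies)
{
	assert(period_jiffies != 0);
	return HZ_ASSUMED / period_jiffies;
}
```

Under that assumption a period of HZ/20 is 50 jiffies, or 50 ms, so this
one timer wakes the system 20 times per second and no sleep can exceed 50
ticks. A period of HZ drops that to one wakeup per second.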

Regards,

Tony

[-- Attachment #2: patch-2.6.11-rc3-dyn-tick-050203-1 --]
[-- Type: text/plain, Size: 21019 bytes --]

diff -Nru a/arch/i386/Kconfig b/arch/i386/Kconfig
--- a/arch/i386/Kconfig	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/Kconfig	2005-02-03 21:11:01 -08:00
@@ -452,6 +452,16 @@
 	bool "Provide RTC interrupt"
 	depends on HPET_TIMER && RTC=y
 
+config NO_IDLE_HZ
+	bool "Dynamic Tick Timer - Skip timer ticks during idle"
+	help
+	  This option enables support for skipping timer ticks when the
+	  processor is idle. During system load, timer is continuous.
+	  This option saves power, as it allows the system to stay in
+	  idle mode longer. Currently supported timers are ACPI PM
+	  timer, local APIC timer, and TSC timer. HPET timer is currently
+	  not supported.
+
 config SMP
 	bool "Symmetric multi-processing support"
 	---help---
diff -Nru a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/kernel/apic.c	2005-02-03 21:11:01 -08:00
@@ -26,6 +26,7 @@
 #include <linux/mc146818rtc.h>
 #include <linux/kernel_stat.h>
 #include <linux/sysdev.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/atomic.h>
 #include <asm/smp.h>
@@ -795,8 +796,12 @@
 	if (!smp_found_config && detect_init_APIC()) {
 		apic_phys = (unsigned long) alloc_bootmem_pages(PAGE_SIZE);
 		apic_phys = __pa(apic_phys);
-	} else
+	} else {
 		apic_phys = mp_lapic_addr;
+#ifdef CONFIG_NO_IDLE_HZ
+		dyn_tick->state |= DYN_TICK_USE_APIC;
+#endif
+	}
 
 	set_fixmap_nocache(FIX_APIC_BASE, apic_phys);
 	printk(KERN_DEBUG "mapped APIC to %08lx (%08lx)\n", APIC_BASE,
@@ -909,6 +914,8 @@
 
 #define APIC_DIVISOR 16
 
+static u32 apic_timer_val;
+
 void __setup_APIC_LVTT(unsigned int clocks)
 {
 	unsigned int lvtt_value, tmp_value, ver;
@@ -927,7 +934,15 @@
 				& ~(APIC_TDR_DIV_1 | APIC_TDR_DIV_TMBASE))
 				| APIC_TDR_DIV_16);
 
-	apic_write_around(APIC_TMICT, clocks/APIC_DIVISOR);
+	apic_timer_val = clocks/APIC_DIVISOR;
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/* Local APIC timer is 24-bit */
+	if (apic_timer_val)
+		dyn_tick->max_skip = 0xffffff / apic_timer_val;
+#endif
+
+	apic_write_around(APIC_TMICT, apic_timer_val);
 }
 
 static void setup_APIC_timer(unsigned int clocks)
@@ -1068,6 +1083,18 @@
 	}
 }
 
+#if defined(CONFIG_NO_IDLE_HZ)
+void reprogram_apic_timer(unsigned int count)
+{
+	unsigned long flags;
+
+	count *= apic_timer_val;
+	local_irq_save(flags);
+	apic_write_around(APIC_TMICT, count);
+	local_irq_restore(flags);
+}
+#endif
+
 /*
  * the frequency of the profiling timer can be changed
  * by writing a multiplier value into /proc/profile.
@@ -1160,6 +1187,7 @@
 
 fastcall void smp_apic_timer_interrupt(struct pt_regs *regs)
 {
+	unsigned long seq;
 	int cpu = smp_processor_id();
 
 	/*
@@ -1178,6 +1206,23 @@
 	 * interrupt lock, which is the WrongThing (tm) to do.
 	 */
 	irq_enter();
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/*
+	 * Check if we need to wake up PIT interrupt handler.
+	 * Otherwise just wake up local APIC timer.
+	 */
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+			if (dyn_tick->skip_cpu == cpu && dyn_tick->skip > DYN_TICK_MIN_SKIP)
+				dyn_tick->interrupt(0, NULL, regs);
+			else
+				reprogram_apic_timer(1);
+		}
+	} while (read_seqretry(&xtime_lock, seq));
+#endif
+
 	smp_local_timer_interrupt(regs);
 	irq_exit();
 }
diff -Nru a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
--- a/arch/i386/kernel/irq.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/kernel/irq.c	2005-02-03 21:11:01 -08:00
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/dyn-tick-timer.h>
 
 #ifndef CONFIG_X86_LOCAL_APIC
 /*
@@ -100,6 +101,11 @@
 	} else
 #endif
 		__do_IRQ(irq, regs);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING) && irq != 0)
+		dyn_tick->interrupt(irq, NULL, regs);
+#endif
 
 	irq_exit();
 
diff -Nru a/arch/i386/kernel/time.c b/arch/i386/kernel/time.c
--- a/arch/i386/kernel/time.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/kernel/time.c	2005-02-03 21:11:01 -08:00
@@ -46,6 +46,7 @@
 #include <linux/bcd.h>
 #include <linux/efi.h>
 #include <linux/mca.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/io.h>
 #include <asm/smp.h>
@@ -301,6 +302,60 @@
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_NO_IDLE_HZ
+static unsigned long long last_tick;
+void reprogram_pit_tick(int jiffies_to_skip);
+extern void replace_timer_interrupt(void * new_handler);
+
+#if defined(CONFIG_NO_IDLE_HZ) && defined(CONFIG_X86_LOCAL_APIC)
+extern void reprogram_apic_timer(unsigned int count);
+#else
+void reprogram_apic_timer(unsigned int count) {}
+#endif
+
+#ifdef DEBUG
+#define dbg_dyn_tick_irq() {if (skipped && skipped < dyn_tick->skip) \
+				printk("%u/%li ", skipped, dyn_tick->skip);}
+#else
+#define dbg_dyn_tick_irq() {}
+#endif
+
+
+
+/*
+ * This interrupt handler updates the time based on number of jiffies skipped
+ * It would be somewhat more optimized to have a custom handler in each timer
+ * using hardware ticks instead of nanoseconds. Note that CONFIG_NO_IDLE_HZ
+ * currently disables timer fallback on skipped jiffies.
+ */
+irqreturn_t dyn_tick_timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
+{
+	unsigned long flags;
+	volatile unsigned long long now;
+	unsigned int skipped = 0;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	now = cur_timer->get_hw_time();
+	while (now - last_tick >= NS_TICK_LEN) {
+		last_tick += NS_TICK_LEN;
+		cur_timer->mark_offset();
+		do_timer_interrupt(irq, NULL, regs);
+		skipped++;
+	}
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+		dbg_dyn_tick_irq();
+		dyn_tick->skip = 1;
+		if (cpu_has_local_apic())
+			reprogram_apic_timer(dyn_tick->skip);
+		reprogram_pit_tick(dyn_tick->skip);
+		dyn_tick->state |= DYN_TICK_ENABLED;
+		dyn_tick->state &= ~DYN_TICK_SKIPPING;
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return IRQ_HANDLED;
+}
+#endif
+
 /* not static: needed by APM */
 unsigned long get_cmos_time(void)
 {
@@ -396,6 +451,72 @@
 }
 #endif
 
+#ifdef CONFIG_NO_IDLE_HZ
+static struct dyn_tick_timer arch_ltt;
+
+#if defined(CONFIG_X86_UP_APIC) || defined(CONFIG_SMP)
+void disable_pit_tick(void)
+{
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x31, PIT_MODE);		/* BCD, mode 0, LSB/MSB, ch 0 */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+#endif
+
+/*
+ * Reprograms the next timer interrupt
+ * PIT timer reprogramming code taken from APM code.
+ * Note that PIT timer is a 16-bit timer, which allows a max
+ * skip of only about 55 ms (0xffff counts at roughly 1.19 MHz).
+ */
+void reprogram_pit_tick(int jiffies_to_skip)
+{
+	int skip;
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+
+	skip = jiffies_to_skip * LATCH;
+	if (skip > 0xffff) {
+		skip = 0xffff;
+	}
+
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x34, PIT_MODE);		/* binary, mode 2, LSB/MSB, ch 0 */
+	outb_p(skip & 0xff, PIT_CH0);	/* LSB */
+	outb(skip >> 8, PIT_CH0);	/* MSB */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+
+static int __init dyn_tick_late_init(void)
+{
+	unsigned long flags;
+
+	if (!cur_timer->get_hw_time)
+		return -ENODEV;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	last_tick = cur_timer->get_hw_time();
+	dyn_tick->skip = 1;
+	if (!cpu_has_local_apic())
+		dyn_tick->max_skip = 0xffff/LATCH;	/* PIT timer length */
+	printk(KERN_INFO "dyn-tick: Maximum ticks to skip limited to %i\n",
+	       dyn_tick->max_skip);
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	if (cur_timer->late_init)
+		cur_timer->late_init();
+	dyn_tick->interrupt = dyn_tick_timer_interrupt;
+	replace_timer_interrupt(dyn_tick->interrupt);
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	dyn_tick->state |= DYN_TICK_ENABLED;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return 0;
+}
+#endif
+
 void __init time_init(void)
 {
 #ifdef CONFIG_HPET_TIMER
@@ -415,6 +536,16 @@
 
 	cur_timer = select_timer();
 	printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (strncmp(cur_timer->name, "tsc", 3) == 0 ||
+	    strncmp(cur_timer->name, "pmtmr", 5) == 0) {
+		arch_ltt.init = dyn_tick_late_init;
+		dyn_tick_register(&arch_ltt);
+	} else
+		printk(KERN_INFO "dyn-tick: Cannot use timer %s\n",
+		       cur_timer->name);
+#endif
 
 	time_init_hook();
 }
diff -Nru a/arch/i386/kernel/timers/timer_pm.c b/arch/i386/kernel/timers/timer_pm.c
--- a/arch/i386/kernel/timers/timer_pm.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/kernel/timers/timer_pm.c	2005-02-03 21:11:01 -08:00
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/init.h>
+#include <linux/dyn-tick-timer.h>
 #include <asm/types.h>
 #include <asm/timer.h>
 #include <asm/smp.h>
@@ -168,6 +169,7 @@
 	monotonic_base += delta * NSEC_PER_USEC;
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* convert to ticks */
 	delta += offset_delay;
 	lost = delta / (USEC_PER_SEC / HZ);
@@ -184,6 +186,7 @@
 		first_run = 0;
 		offset_delay = 0;
 	}
+#endif
 }
 
 
@@ -238,6 +241,25 @@
 	return (unsigned long) offset_delay + cyc2us(delta);
 }
 
+static unsigned long long ns_time;
+
+static unsigned long long get_hw_time_pmtmr(void)
+{
+	u32 now, delta;
+	static unsigned int last_cycles;
+	now = read_pmtmr();
+	delta = (now - last_cycles) & ACPI_PM_MASK;
+	last_cycles = now;
+	ns_time += cyc2us(delta) * NSEC_PER_USEC;
+	return ns_time;
+}
+
+static void late_init_pmtmr(void)
+{
+	ns_time = monotonic_clock_pmtmr();
+}
+
+extern irqreturn_t pmtmr_interrupt(int irq, void *dev_id, struct pt_regs *regs);
 
 /* acpi timer_opts struct */
 static struct timer_opts timer_pmtmr = {
@@ -245,7 +267,9 @@
 	.mark_offset		= mark_offset_pmtmr,
 	.get_offset		= get_offset_pmtmr,
 	.monotonic_clock 	= monotonic_clock_pmtmr,
+	.get_hw_time		= get_hw_time_pmtmr,
 	.delay 			= delay_pmtmr,
+	.late_init		= late_init_pmtmr,
 };
 
 struct init_timer_opts __initdata timer_pmtmr_init = {
diff -Nru a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
--- a/arch/i386/kernel/timers/timer_tsc.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/kernel/timers/timer_tsc.c	2005-02-03 21:11:01 -08:00
@@ -112,6 +112,15 @@
 	return delay_at_last_interrupt + edx;
 }
 
+static unsigned long long get_hw_time_tsc(void)
+{
+	register unsigned long eax, edx;
+
+	unsigned long long hw_time;
+	rdtscll(hw_time);
+	return cycles_2_ns(hw_time);
+}
+
 static unsigned long long monotonic_clock_tsc(void)
 {
 	unsigned long long last_offset, this_offset, base;
@@ -348,6 +357,7 @@
 
 	rdtsc(last_tsc_low, last_tsc_high);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	spin_lock(&i8253_lock);
 	outb_p(0x00, PIT_MODE);     /* latch the count ASAP */
 
@@ -415,14 +425,18 @@
 			cpufreq_delayed_get();
 	} else
 		lost_count = 0;
+#endif
+
 	/* update the monotonic base value */
 	this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low;
 	monotonic_base += cycles_2_ns(this_offset - last_offset);
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* calculate delay_at_last_interrupt */
 	count = ((LATCH-1) - count) * TICK_SIZE;
 	delay_at_last_interrupt = (count + LATCH/2) / LATCH;
+#endif
 
 	/* catch corner case where tick rollover occured
 	 * between tsc and pit reads (as noted when
@@ -551,6 +565,7 @@
 	.mark_offset = mark_offset_tsc, 
 	.get_offset = get_offset_tsc,
 	.monotonic_clock = monotonic_clock_tsc,
+	.get_hw_time = get_hw_time_tsc,
 	.delay = delay_tsc,
 };
 
diff -Nru a/arch/i386/mach-default/setup.c b/arch/i386/mach-default/setup.c
--- a/arch/i386/mach-default/setup.c	2005-02-03 21:11:01 -08:00
+++ b/arch/i386/mach-default/setup.c	2005-02-03 21:11:01 -08:00
@@ -85,6 +85,22 @@
 	setup_irq(0, &irq0);
 }
 
+/**
+ * replace_timer_interrupt - allow replacing timer interrupt handler
+ *
+ * Description:
+ *	Can be used to replace timer interrupt handler with a more optimized
+ *	handler. Used for enabling and disabling of CONFIG_NO_IDLE_HZ.
+ */
+void replace_timer_interrupt(void * new_handler)
+{
+	unsigned long flags;
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	irq0.handler = new_handler;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+}
+
 #ifdef CONFIG_MCA
 /**
  * mca_nmi_hook - hook into MCA specific NMI chain
diff -Nru a/include/asm-i386/timer.h b/include/asm-i386/timer.h
--- a/include/asm-i386/timer.h	2005-02-03 21:11:01 -08:00
+++ b/include/asm-i386/timer.h	2005-02-03 21:11:01 -08:00
@@ -1,6 +1,7 @@
 #ifndef _ASMi386_TIMER_H
 #define _ASMi386_TIMER_H
 #include <linux/init.h>
+#include <linux/interrupt.h>
 
 /**
  * struct timer_ops - used to define a timer source
@@ -21,7 +22,9 @@
 	void (*mark_offset)(void);
 	unsigned long (*get_offset)(void);
 	unsigned long long (*monotonic_clock)(void);
+	unsigned long long (*get_hw_time)(void);
 	void (*delay)(unsigned long);
+	void (*late_init)(void);
 };
 
 struct init_timer_opts {
diff -Nru a/include/linux/dyn-tick-timer.h b/include/linux/dyn-tick-timer.h
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/include/linux/dyn-tick-timer.h	2005-02-03 21:11:01 -08:00
@@ -0,0 +1,73 @@
+/*
+ * linux/include/linux/dyn-tick-timer.h
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/interrupt.h>
+
+#define DYN_TICK_USE_APIC	(1 << 2)
+#define DYN_TICK_SKIPPING	(1 << 1)
+#define DYN_TICK_ENABLED	(1 << 0)
+
+struct dyn_tick_state {
+	unsigned int state;		/* Current state */
+	int skip_cpu;			/* Skip handling processor */
+	unsigned long skip;		/* Ticks to skip */
+	unsigned int max_skip;		/* Max number of ticks to skip */
+	unsigned long irq_skip_mask;	/* Do not update time from these irqs */
+	irqreturn_t (*interrupt)(int, void *, struct pt_regs *);
+};
+
+/* REVISIT: Add functions to enable/disable dyn-tick on the fly */
+struct dyn_tick_timer {
+	int (*init) (void);
+};
+
+extern struct dyn_tick_state * dyn_tick;
+extern void dyn_tick_register(struct dyn_tick_timer * new_timer);
+
+#define NS_TICK_LEN		((1 * 1000000000)/HZ)
+#define DYN_TICK_MIN_SKIP	2
+
+#if defined(CONFIG_SMP)
+#define cpu_has_local_apic()	1
+#elif defined(CONFIG_X86_UP_APIC)
+#define cpu_has_local_apic()	(dyn_tick->state & DYN_TICK_USE_APIC)
+#else
+#define cpu_has_local_apic()	0
+#endif
+
+#ifdef CONFIG_NO_IDLE_HZ
+
+#if defined(CONFIG_X86) || defined(CONFIG_IA64) || defined(CONFIG_X86_64)
+#define arch_has_safe_halt()	1
+#endif
+
+#else
+
+#define arch_has_safe_halt()	0
+
+#endif
diff -Nru a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile	2005-02-03 21:11:01 -08:00
+++ b/kernel/Makefile	2005-02-03 21:11:01 -08:00
@@ -26,6 +26,7 @@
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
+obj-$(CONFIG_NO_IDLE_HZ) += dyn-tick-timer.o
 
 ifneq ($(CONFIG_IA64),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -Nru a/kernel/dyn-tick-timer.c b/kernel/dyn-tick-timer.c
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/kernel/dyn-tick-timer.c	2005-02-03 21:11:01 -08:00
@@ -0,0 +1,149 @@
+/*
+ * linux/kernel/dyn-tick-timer.c
+ *
+ * Beginnings of generic dynamic tick timer support
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ *
+ * TODO:
+ * - Add functions for enabling/disabling dyn-tick on the fly
+ * - Generalize to work with ARM sys_timer
+ */
+
+#include <linux/version.h>
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/cpumask.h>
+#include <linux/pm.h>
+#include <linux/dyn-tick-timer.h>
+#include <asm/io.h>
+
+#include "io_ports.h"
+
+#define VERSION	050227-1
+
+struct dyn_tick_state dyn_tick_state;
+struct dyn_tick_state * dyn_tick = &dyn_tick_state;
+struct dyn_tick_timer dyn_tick_timer;
+struct dyn_tick_timer * dyn_tick_cfg = &dyn_tick_timer;
+static void (*orig_idle) (void) = 0;
+extern void disable_pit_tick(void);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+extern void reprogram_apic_timer(unsigned int count);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+static cpumask_t dyn_cpu_map;
+
+/*
+ * We want to have all processors idle before reprogramming the next
+ * timer interrupt. Note that we must maintain the state for dynamic tick,
+ * otherwise the idle loop could be reprogramming the timer continuously
+ * further into the future, and the timer interrupt would never happen.
+ */
+static void dyn_tick_idle(void)
+{
+	int cpu;
+	unsigned long flags;
+	cpumask_t idle_cpus;
+	unsigned long next;
+
+	if (!(dyn_tick->state & DYN_TICK_ENABLED))
+		goto out;
+
+	/* Check if we are already skipping ticks and can idle other cpus */
+	if (dyn_tick->state & DYN_TICK_SKIPPING) {
+		reprogram_apic_timer(dyn_tick->skip);
+		goto out;
+	}
+
+	/* Check if we can start skipping ticks */
+	write_seqlock_irqsave(&xtime_lock, flags);
+	cpu = smp_processor_id();
+	cpu_set(cpu, dyn_cpu_map);
+	cpus_and(idle_cpus, dyn_cpu_map, cpu_online_map);
+	if (cpus_equal(idle_cpus, cpu_online_map)) {
+		next = next_timer_interrupt();
+		if (jiffies > next) {
+			//printk("Too late? next: %lu jiffies: %lu\n",
+			//       next, jiffies);
+			dyn_tick->skip = 1;
+		} else
+			dyn_tick->skip = next - jiffies;
+		if (dyn_tick->skip > DYN_TICK_MIN_SKIP) {
+			if (dyn_tick->skip > dyn_tick->max_skip)
+				dyn_tick->skip = dyn_tick->max_skip;
+			if (cpu_has_local_apic()) {
+				disable_pit_tick();
+				reprogram_apic_timer(dyn_tick->skip);
+			} else
+				reprogram_pit_tick(dyn_tick->skip);
+			dyn_tick->skip_cpu = cpu;
+			dyn_tick->state |= DYN_TICK_SKIPPING;
+		}
+		cpus_clear(dyn_cpu_map);
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+out:
+	if (orig_idle)
+		orig_idle();
+	else if (arch_has_safe_halt())
+		safe_halt();
+}
+
+void __init dyn_tick_register(struct dyn_tick_timer * new_timer)
+{
+	dyn_tick_cfg->init = new_timer->init;
+	printk(KERN_INFO "dyn-tick: Registering dynamic tick timer\n");
+}
+
+/*
+ * We need to initialize dynamic tick after calibrate delay
+ */
+static int __init dyn_tick_init(void)
+{
+	int ret = 0;
+
+	if (dyn_tick_cfg->init == NULL)
+		return -ENODEV;
+
+	ret = dyn_tick_cfg->init();
+	if (ret != 0) {
+		printk(KERN_WARNING "dyn-tick: Init failed\n");
+		return -ENODEV;
+	}
+	orig_idle = pm_idle;
+	pm_idle = dyn_tick_idle;
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2,6,10))
+	cpu_idle_wait();
+#endif
+	printk(KERN_INFO "dyn-tick: Timer using dynamic tick\n");
+
+	return ret;
+}
+late_initcall(dyn_tick_init);
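
The skip calculation in dyn_tick_idle() above can be illustrated in user space. This is only a sketch under stated assumptions: dyn_tick_calc_skip() is a hypothetical helper that does not exist in the patch, it returns 1 whenever skipping is not worthwhile (the patch instead leaves dyn_tick->skip as computed and simply does not enter the skipping state), and it uses a signed difference so that a jiffies wraparound does not look like an overdue timer.

```c
#include <assert.h>

#define DYN_TICK_MIN_SKIP 2	/* same threshold as the patch */

/*
 * Hypothetical helper, not part of the patch: returns the number of ticks
 * to skip, or 1 when the next timer is due, overdue, or too close.
 */
static unsigned long dyn_tick_calc_skip(unsigned long now, unsigned long next,
					unsigned long max_skip)
{
	long delta = (long)(next - now);	/* wrap-safe signed distance */
	unsigned long skip;

	assert(max_skip >= 1);

	if (delta <= DYN_TICK_MIN_SKIP)
		return 1;	/* keep ticking normally */

	skip = (unsigned long)delta;
	if (skip > max_skip)
		skip = max_skip;	/* hardware cannot sleep any longer */
	return skip;
}
```

With a PIT-only limit of 54 ticks, a timer 100 ticks away is clamped to 54, while a timer only 2 ticks away leaves the tick running.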

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04  5:19                 ` Tony Lindgren
@ 2005-02-04  6:33                   ` Zwane Mwaikambo
  2005-02-04 17:18                     ` Tony Lindgren
  2005-02-05 23:00                   ` Pavel Machek
  1 sibling, 1 reply; 38+ messages in thread
From: Zwane Mwaikambo @ 2005-02-04  6:33 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

On Thu, 3 Feb 2005, Tony Lindgren wrote:

> > > > It could also be that the reprogramming of PIT timer does not work on
> > > > your machine. I chopped off the udelays there... Can you try
> > > > something like this:
> > > 
> > > I added the udelays, but behaviour did not change.
> > 
> > Yeah, and if the first patch was working better, that means the PIT
> > interrupts work. I'll do another version of the patch where PIT
> > interrupts work again without local APIC needed, let's see what
> > happens with that.

I see in the patch that you're reprogramming the PIT for a periodic mode 
(2) but using dyn_tick->skip as the period. Is this intentional? I thought 
you wanted a oneshot for that.
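
For readers following along, the mode in question is encoded in the control word written to PIT_MODE. The sketch below decodes such a byte according to the standard 8253/8254 control word layout; struct pit_ctrl and pit_decode() are illustrative names, not kernel interfaces.

```c
#include <stdint.h>

/* Fields of an 8253/8254 control word, as written to the mode port. */
struct pit_ctrl {
	unsigned channel;	/* bits 7-6: counter select */
	unsigned access;	/* bits 5-4: 3 = low byte, then high byte */
	unsigned mode;		/* bits 3-1: operating mode */
	unsigned bcd;		/* bit 0: 1 = BCD counting, 0 = binary */
};

static struct pit_ctrl pit_decode(uint8_t word)
{
	struct pit_ctrl c;

	c.channel = (word >> 6) & 0x3;
	c.access  = (word >> 4) & 0x3;
	c.mode    = (word >> 1) & 0x7;
	c.bcd     = word & 0x1;
	return c;
}
```

Decoding 0x34, the value used by reprogram_pit_tick(), gives channel 0, low/high byte access, mode 2 (rate generator) and binary counting. Decoding 0x31, the value used by disable_pit_tick(), gives mode 0 with the BCD bit set.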

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04  6:33                   ` Zwane Mwaikambo
@ 2005-02-04 17:18                     ` Tony Lindgren
  2005-02-04 17:31                       ` Zwane Mwaikambo
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-04 17:18 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

* Zwane Mwaikambo <zwane@arm.linux.org.uk> [050203 22:33]:
> On Thu, 3 Feb 2005, Tony Lindgren wrote:
> 
> > > > > > It could also be that the reprogramming of PIT timer does not work on
> > > > > your machine. I chopped off the udelays there... Can you try
> > > > > something like this:
> > > > 
> > > > I added the udelays, but behaviour did not change.
> > > 
> > > Yeah, and if the first patch was working better, that means the PIT
> > > interrupts work. I'll do another version of the patch where PIT
> > > interrupts work again without local APIC needed, let's see what
> > > happens with that.
> 
> I see in the patch that you're reprogramming the PIT for a periodic mode 
> (2) but using dyn_tick->skip as the period. Is this intentional? I thought 
> you wanted a oneshot for that.

Yes, it's safer to keep the timer periodic, although it's
used for oneshot purposes for the skips. If the timer interrupt
got missed for some reason, the system would be able to recover when
it's in periodic mode.
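
The recovery relies on the handler accounting for every tick that elapsed, however late the interrupt arrives. A minimal user-space sketch, modeled on the loop in dyn_tick_timer_interrupt() from the patch: struct tick_clock and tick_catch_up() are hypothetical names, HZ is assumed to be 1000, and the jiffies increment stands in for do_timer_interrupt().

```c
#include <stdint.h>

#define HZ		1000			/* assumed tick rate */
#define NS_TICK_LEN	(1000000000ULL / HZ)	/* one tick in nanoseconds */

struct tick_clock {
	uint64_t last_tick;	/* time of the last accounted tick, in ns */
	uint64_t jiffies;	/* ticks accounted so far */
};

/*
 * Account for every full tick that elapsed up to 'now' and return how many
 * were processed. A late or missed interrupt simply yields a larger count.
 */
static unsigned int tick_catch_up(struct tick_clock *clk, uint64_t now)
{
	unsigned int processed = 0;

	if (now < clk->last_tick)
		return 0;	/* clock went backwards: nothing to account */

	while (now - clk->last_tick >= NS_TICK_LEN) {
		clk->last_tick += NS_TICK_LEN;
		clk->jiffies++;	/* stands in for do_timer_interrupt() */
		processed++;
	}
	return processed;
}
```

Because last_tick advances in whole tick lengths, any fraction of a tick left over is carried into the next call rather than lost.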

And with some timers, we can do the reprogramming faster, as we just
need to load the new value.

I could not figure out how to disable the interrupts for PIT
when local APIC is used and the ticks to skip is longer than PIT
would allow. So I just changed the mode temporarily to disable it.

Does anybody know if there's a way to stop PIT interrupts while
keeping it in the periodic mode?

Regards,

Tony

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04 17:18                     ` Tony Lindgren
@ 2005-02-04 17:31                       ` Zwane Mwaikambo
  2005-02-04 17:42                         ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Zwane Mwaikambo @ 2005-02-04 17:31 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

On Fri, 4 Feb 2005, Tony Lindgren wrote:

> Yes, it's safer to keep the timer periodic, although it's
> used for oneshot purposes for the skips. If the timer interrupt
> got missed for some reason, the system would be able to recover when
> it's in periodic mode.
> 
> And with some timers, we can do the reprogramming faster, as we just
> need to load the new value.
> 
> I could not figure out how to disable the interrupts for PIT
> when local APIC is used and the ticks to skip is longer than PIT
> would allow. So I just changed the mode temporarily to disable it.
>
> Does anybody know if there's a way to stop PIT interrupts while
> keeping it in the periodic mode?

disable_irq(0) ?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04 17:31                       ` Zwane Mwaikambo
@ 2005-02-04 17:42                         ` Tony Lindgren
  2005-02-04 17:54                           ` Zwane Mwaikambo
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-04 17:42 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

* Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:31]:
> On Fri, 4 Feb 2005, Tony Lindgren wrote:
> 
> > Yes, it's safer to keep the timer periodic, although it's
> > used for oneshot purposes for the skips. If the timer interrupt
> > got missed for some reason, the system would be able to recover when
> > it's in periodic mode.
> > 
> > And with some timers, we can do the reprogramming faster, as we just
> > need to load the new value.
> > 
> > I could not figure out how to disable the interrupts for PIT
> > when local APIC is used and the ticks to skip is longer than PIT
> > would allow. So I just changed the mode temporarily to disable it.
> >
> > Does anybody know if there's a way to stop PIT interrupts while
> > keeping it in the periodic mode?
> 
> disable_irq(0) ?

Then the problem is that the CPU does not stay in sleep but wakes to
the first PIT interrupt AFAIK.

Tony

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04 17:42                         ` Tony Lindgren
@ 2005-02-04 17:54                           ` Zwane Mwaikambo
  2005-02-04 18:58                             ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Zwane Mwaikambo @ 2005-02-04 17:54 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

On Fri, 4 Feb 2005, Tony Lindgren wrote:

> * Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:31]:
> > On Fri, 4 Feb 2005, Tony Lindgren wrote:
> > 
> > > Yes, it's safer to keep the timer periodic, although it's
> > > used for oneshot purposes for the skips. If the timer interrupt
> > > got missed for some reason, the system would be able to recover when
> > > it's in periodic mode.
> > > 
> > > And with some timers, we can do the reprogramming faster, as we just
> > > need to load the new value.
> > > 
> > > I could not figure out how to disable the interrupts for PIT
> > > when local APIC is used and the ticks to skip is longer than PIT
> > > would allow. So I just changed the mode temporarily to disable it.
> > >
> > > Does anybody know if there's a way to stop PIT interrupts while
> > > keeping it in the periodic mode?
> > 
> > disable_irq(0) ?
> 
> Then the problem is that the CPU does not stay in sleep but wakes to
> the first PIT interrupt AFAIK.

I do not understand, do you want to disable the PIT from interrupting the 
processor and enable it interrupting at a later time?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04 17:54                           ` Zwane Mwaikambo
@ 2005-02-04 18:58                             ` Tony Lindgren
  2005-02-04 19:24                               ` Tony Lindgren
  0 siblings, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-04 18:58 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

* Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:54]:
> On Fri, 4 Feb 2005, Tony Lindgren wrote:
> 
> > * Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:31]:
> > > On Fri, 4 Feb 2005, Tony Lindgren wrote:
> > > 
> > > > Yes, it's safer to keep the timer periodic, although it's
> > > > used for oneshot purposes for the skips. If the timer interrupt
> > > > got missed for some reason, the system would be able to recover when
> > > > it's in periodic mode.
> > > > 
> > > > And with some timers, we can do the reprogramming faster, as we just
> > > > need to load the new value.
> > > > 
> > > > I could not figure out how to disable the interrupts for PIT
> > > > when local APIC is used and the ticks to skip is longer than PIT
> > > > would allow. So I just changed the mode temporarily to disable it.
> > > >
> > > > Does anybody know if there's a way to stop PIT interrupts while
> > > > keeping it in the periodic mode?
> > > 
> > > disable_irq(0) ?
> > 
> > Then the problem is that the CPU does not stay in sleep but wakes to
> > the first PIT interrupt AFAIK.
> 
> I do not understand, do you want to disable the PIT from interrupting the 
> processor and enable it interrupting at a later time?

Yes, that's right. PIT max skip ticks = 54 and local APIC timer > 1000.
PIT interrupt needs to be disabled to stay in sleep for over 54 ticks.
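
The figure of 54 follows from the 16-bit PIT counter, in the same way as the 0xffff/LATCH expression in dyn_tick_late_init(). A sketch of the arithmetic, assuming a PIT input clock of 1193182 Hz and HZ=1000; pit_latch() and pit_max_skip_ticks() are illustrative names only.

```c
#include <assert.h>

#define PIT_TICK_RATE	1193182UL	/* PIT input clock in Hz (assumed) */

/* PIT counts per timer tick, rounded to nearest, like the kernel's LATCH. */
static unsigned long pit_latch(unsigned long hz)
{
	assert(hz > 0 && hz <= PIT_TICK_RATE);
	return (PIT_TICK_RATE + hz / 2) / hz;
}

/* Largest whole number of ticks a 16-bit PIT counter can cover. */
static unsigned long pit_max_skip_ticks(unsigned long hz)
{
	return 0xffffUL / pit_latch(hz);
}
```

At HZ=1000 the latch value is 1193 counts per tick, so 0xffff counts cover 54 whole ticks, or just under 55 ms.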

But I think you're right, disable_irq(0) should do the trick :)

Hmmm, we should be able to keep PIT irq disabled all the time when using
local APIC timer. I'll play with it a bit.

Tony

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04 18:58                             ` Tony Lindgren
@ 2005-02-04 19:24                               ` Tony Lindgren
  0 siblings, 0 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-04 19:24 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Pavel Machek, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Lee Revell, linux-kernel

* Tony Lindgren <tony@atomide.com> [050204 11:14]:
> * Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:54]:
> > On Fri, 4 Feb 2005, Tony Lindgren wrote:
> > 
> > > * Zwane Mwaikambo <zwane@arm.linux.org.uk> [050204 09:31]:
> > > > On Fri, 4 Feb 2005, Tony Lindgren wrote:
> > > > 
> > > > > Yes, it's safer to keep the timer periodic, although it's
> > > > > used for oneshot purposes for the skips. If the timer interrupt
> > > > > got missed for some reason, the system would be able to recover when
> > > > > it's in periodic mode.
> > > > > 
> > > > > And with some timers, we can do the reprogramming faster, as we just
> > > > > need to load the new value.
> > > > > 
> > > > > I could not figure out how to disable the interrupts for PIT
> > > > > when local APIC is used and the ticks to skip is longer than PIT
> > > > > would allow. So I just changed the mode temporarily to disable it.
> > > > >
> > > > > Does anybody know if there's a way to stop PIT interrupts while
> > > > > keeping it in the periodic mode?
> > > > 
> > > > disable_irq(0) ?
> > > 
> > > Then the problem is that the CPU does not stay in sleep but wakes to
> > > the first PIT interrupt AFAIK.
> > 
> > I do not understand, do you want to disable the PIT from interrupting the 
> > processor and enable it interrupting at a later time?
> 
> Yes, that's right. PIT max skip ticks = 54 and local APIC timer > 1000.
> PIT interrupt needs to be disabled to stay in sleep for over 54 ticks.
> 
> But I think you're right, disable_irq(0) should do the trick :)
> 
> Hmmm, we should be able to keep PIT irq disabled all the time when using
> local APIC timer. I'll play with it a bit.

Oops, no, PIT must be running at least when the system is busy.
Otherwise time won't get updated during load, as we never get to the
idle loop.

Tony

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-04  5:19                 ` Tony Lindgren
  2005-02-04  6:33                   ` Zwane Mwaikambo
@ 2005-02-05 23:00                   ` Pavel Machek
  2005-02-06  2:33                     ` Tony Lindgren
  1 sibling, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-05 23:00 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > > It could also be that the reprogramming of PIT timer does not work on
> > > > your machine. I chopped off the udelays there... Can you try
> > > > something like this:
> > > 
> > > I added the udelays, but behaviour did not change.
> > 
> > Yeah, and if the first patch was working better, that means the PIT
> > interrupts work. I'll do another version of the patch where PIT
> > interrupts work again without local APIC needed, let's see what
> > happens with that.
> 
> I think something broke TSC timer after the first patch, but I could
> not figure out yet what. So the bad combo might be local APIC + TSC.
> At least I'm seeing similar problems with local APIC + TSC timer.
> 
> Attached is a slightly improved patch, but the patch does not fix
> the TSC problem. It just fixes compile without local APIC, and
> booting SMP kernel on uniprocessor machine.
> 
> Currently the suggested combo is local APIC + ACPI PM timer...

Ok, works slightly better: time no longer runs 2x too fast. When TSC
is used, I get same behaviour  as before ("sleepy machine"). With
"notsc", machine seems to work okay, but I still get 1000 timer
interrupts a second.

> And if that works, changing the I8042_POLL_PERIOD from HZ/20 in
> drivers/input/serio/i8042.h to something like HZ increases the
> sleep interval quite a bit. I think I had lots of polling also in
> CONFIG_NETFILTER, but I haven't verified that.

Okay, I set POLL_PERIOD to 5*HZ, and disabled USB. Perhaps it will
sleep better now?
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-05 23:00                   ` Pavel Machek
@ 2005-02-06  2:33                     ` Tony Lindgren
  2005-02-06  3:54                       ` Tony Lindgren
  2005-02-06  8:11                       ` Pavel Machek
  0 siblings, 2 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-06  2:33 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

* Pavel Machek <pavel@ucw.cz> [050205 15:08]:
> Hi!
> 
> > > > > It could also be that the reprogramming of PIT timer does not work on
> > > > > your machine. I chopped off the udelays there... Can you try
> > > > > something like this:
> > > > 
> > > > I added the udelays, but behaviour did not change.
> > > 
> > > Yeah, and if the first patch was working better, that means the PIT
> > > interrupts work. I'll do another version of the patch where PIT
> > > interrupts work again without local APIC needed, let's see what
> > > happens with that.
> > 
> > I think something broke TSC timer after the first patch, but I could
> > not figure out yet what. So the bad combo might be local APIC + TSC.
> > At least I'm seeing similar problems with local APIC + TSC timer.
> > 
> > Attached is a slightly improved patch, but the patch does not fix
> > the TSC problem. It just fixes compile without local APIC, and
> > booting SMP kernel on uniprocessor machine.
> > 
> > Currently the suggested combo is local APIC + ACPI PM timer...
> 
> Ok, works slightly better: time no longer runs 2x too fast. When TSC
> is used, I get same behaviour  as before ("sleepy machine"). With
> "notsc", machine seems to work okay, but I still get 1000 timer
> interrupts a second.

Sounds like dyn-tick did not get enabled then, maybe you don't have
CONFIG_X86_PM_TIMER, or don't have ACPI PM timer on your board?

After modifying I8042_POLL_PERIOD and leaving out CONFIG_NETFILTER
I'm getting a timer rate of roughly 6 Hz when idle :)

$ dmesg | grep -i "time\|tick\|apic"
ACPI: PM-Timer IO Port: 0x1008
Kernel command line: root=/dev/nfs ip=dhcp ro console=ttyS0,115200
lapic init=/bin/minit
Local APIC disabled by BIOS -- reenabling.
Found and enabled local APIC!
mapped APIC to ffffd000 (fee00000)
Using pmtmr for high-res timesource
dyn-tick: Registering dynamic tick timer
per-CPU timeslice cutoff: 365.35 usecs.
task migration cache decay timeout: 1 msecs.
Machine check exception polling timer started.
Real Time Clock Driver v1.12
dyn-tick: Maximum ticks to skip limited to 2678
dyn-tick: Timer using dynamic tick

$ cat /proc/interrupts | grep timer && sleep 10 && cat /proc/interrupts | grep timer
  0:      10689          XT-PIC  timer
  0:      10745          XT-PIC  timer
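
The two samples above work out to about 5.6 timer interrupts per
second, which is the rate of roughly six per second mentioned above.
A minimal sketch of the arithmetic, assuming the interval is exactly
the 10 seconds passed to sleep (the helper name timer_irq_rate is made
up for illustration and is not part of the patch):

```c
#include <assert.h>

/*
 * Average timer interrupt rate between two /proc/interrupts samples.
 * Hypothetical user-space helper, not kernel code.
 */
static double timer_irq_rate(unsigned long first, unsigned long second,
			     unsigned int seconds)
{
	assert(seconds > 0);
	assert(second >= first);
	return (double)(second - first) / seconds;
}
```

With the counts shown, timer_irq_rate(10689, 10745, 10) is 56 / 10,
i.e. 5.6 per second, compared with the 1000 per second Pavel reports
when dyn-tick is not active.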

> > And if that works, changing the I8042_POLL_PERIOD from HZ/20 in
> > drivers/input/serio/i8042.h to something like HZ increases the
> > sleep interval quite a bit. I think I had lots of polling also in
> > CONFIG_NETFILTER, but I haven't verified that.
> 
> Okay, I set POLL_PERIOD to 5*HZ, and disabled USB. Perhaps it will
> sleep better now?

Sounds like your system is not running with the dyn-tick... I'll try
to fix that TSC bug.

Tony

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  2:33                     ` Tony Lindgren
@ 2005-02-06  3:54                       ` Tony Lindgren
  2005-02-06  8:41                         ` Pavel Machek
                                           ` (2 more replies)
  2005-02-06  8:11                       ` Pavel Machek
  1 sibling, 3 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-06  3:54 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 552 bytes --]

* Tony Lindgren <tony@atomide.com> [050205 18:39]:
> * Pavel Machek <pavel@ucw.cz> [050205 15:08]:
> > 
> > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > is used, I get same behaviour  as before ("sleepy machine"). With
> > "notsc", machine seems to work okay, but I still get 1000 timer
> > interrupts a second.

...

> 
> Sounds like your system is not running with the dyn-tick... I'll try
> to fix that TSC bug.

The following patch fixes TSC timer with dyn-tick, and local APIC
timer on UP system with CONFIG_SMP.

Tony

[-- Attachment #2: patch-2.6.11-rc3-dyn-tick-050205-1 --]
[-- Type: text/plain, Size: 20900 bytes --]

diff -Nru a/arch/i386/Kconfig b/arch/i386/Kconfig
--- a/arch/i386/Kconfig	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/Kconfig	2005-02-05 19:46:47 -08:00
@@ -452,6 +452,16 @@
 	bool "Provide RTC interrupt"
 	depends on HPET_TIMER && RTC=y
 
+config NO_IDLE_HZ
+	bool "Dynamic Tick Timer - Skip timer ticks during idle"
+	help
+	  This option enables support for skipping timer ticks when the
+	  processor is idle. During system load, timer is continuous.
+	  This option saves power, as it allows the system to stay in
+	  idle mode longer. Currently supported timers are ACPI PM
+	  timer, local APIC timer, and TSC timer. HPET timer is currently
+	  not supported.
+
 config SMP
 	bool "Symmetric multi-processing support"
 	---help---
diff -Nru a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/kernel/apic.c	2005-02-05 19:46:47 -08:00
@@ -26,6 +26,7 @@
 #include <linux/mc146818rtc.h>
 #include <linux/kernel_stat.h>
 #include <linux/sysdev.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/atomic.h>
 #include <asm/smp.h>
@@ -909,6 +910,8 @@
 
 #define APIC_DIVISOR 16
 
+static u32 apic_timer_val;
+
 void __setup_APIC_LVTT(unsigned int clocks)
 {
 	unsigned int lvtt_value, tmp_value, ver;
@@ -927,7 +930,15 @@
 				& ~(APIC_TDR_DIV_1 | APIC_TDR_DIV_TMBASE))
 				| APIC_TDR_DIV_16);
 
-	apic_write_around(APIC_TMICT, clocks/APIC_DIVISOR);
+	apic_timer_val = clocks/APIC_DIVISOR;
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/* Local APIC timer is 24-bit */
+	if (apic_timer_val)
+		dyn_tick->max_skip = 0xffffff / apic_timer_val;
+#endif
+
+	apic_write_around(APIC_TMICT, apic_timer_val);
 }
 
 static void setup_APIC_timer(unsigned int clocks)
@@ -1040,6 +1051,13 @@
 	 */
 	setup_APIC_timer(calibration_result);
 
+#ifdef CONFIG_NO_IDLE_HZ
+	if (calibration_result)
+		dyn_tick->state |= DYN_TICK_USE_APIC;
+	else
+		printk(KERN_INFO "dyn-tick: Cannot use local APIC\n");
+#endif
+
 	local_irq_enable();
 }
 
@@ -1068,6 +1086,18 @@
 	}
 }
 
+#if defined(CONFIG_NO_IDLE_HZ)
+void reprogram_apic_timer(unsigned int count)
+{
+	unsigned long flags;
+
+	count *= apic_timer_val;
+	local_irq_save(flags);
+	apic_write_around(APIC_TMICT, count);
+	local_irq_restore(flags);
+}
+#endif
+
 /*
  * the frequency of the profiling timer can be changed
  * by writing a multiplier value into /proc/profile.
@@ -1160,6 +1190,7 @@
 
 fastcall void smp_apic_timer_interrupt(struct pt_regs *regs)
 {
+	unsigned long seq;
 	int cpu = smp_processor_id();
 
 	/*
@@ -1178,6 +1209,23 @@
 	 * interrupt lock, which is the WrongThing (tm) to do.
 	 */
 	irq_enter();
+
+#ifdef CONFIG_NO_IDLE_HZ
+	/*
+	 * Check if we need to wake up PIT interrupt handler.
+	 * Otherwise just wake up local APIC timer.
+	 */
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+			if (dyn_tick->skip_cpu == cpu && dyn_tick->skip > DYN_TICK_MIN_SKIP)
+				dyn_tick->interrupt(99, NULL, regs);
+			else
+				reprogram_apic_timer(1);
+		}
+	} while (read_seqretry(&xtime_lock, seq));
+#endif
+
 	smp_local_timer_interrupt(regs);
 	irq_exit();
 }
diff -Nru a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
--- a/arch/i386/kernel/irq.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/kernel/irq.c	2005-02-05 19:46:47 -08:00
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
+#include <linux/dyn-tick-timer.h>
 
 #ifndef CONFIG_X86_LOCAL_APIC
 /*
@@ -100,6 +101,11 @@
 	} else
 #endif
 		__do_IRQ(irq, regs);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING) && irq != 0)
+		dyn_tick->interrupt(irq, NULL, regs);
+#endif
 
 	irq_exit();
 
diff -Nru a/arch/i386/kernel/time.c b/arch/i386/kernel/time.c
--- a/arch/i386/kernel/time.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/kernel/time.c	2005-02-05 19:46:47 -08:00
@@ -46,6 +46,7 @@
 #include <linux/bcd.h>
 #include <linux/efi.h>
 #include <linux/mca.h>
+#include <linux/dyn-tick-timer.h>
 
 #include <asm/io.h>
 #include <asm/smp.h>
@@ -301,6 +302,60 @@
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_NO_IDLE_HZ
+static unsigned long long last_tick;
+void reprogram_pit_tick(int jiffies_to_skip);
+extern void replace_timer_interrupt(void * new_handler);
+
+#if defined(CONFIG_NO_IDLE_HZ) && defined(CONFIG_X86_LOCAL_APIC)
+extern void reprogram_apic_timer(unsigned int count);
+#else
+void reprogram_apic_timer(unsigned int count) {}
+#endif
+
+#ifdef DEBUG
+#define dbg_dyn_tick_irq() {if (skipped && skipped < dyn_tick->skip) \
+				printk("%u/%li ", skipped, dyn_tick->skip);}
+#else
+#define dbg_dyn_tick_irq() {}
+#endif
+
+
+
+/*
+ * This interrupt handler updates the time based on number of jiffies skipped
+ * It would be somewhat more optimized to have a custom handler in each timer
+ * using hardware ticks instead of nanoseconds. Note that CONFIG_NO_IDLE_HZ
+ * currently disables timer fallback on skipped jiffies.
+ */
+irqreturn_t dyn_tick_timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
+{
+	unsigned long flags;
+	volatile unsigned long long now;
+	unsigned int skipped = 0;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	now = cur_timer->get_hw_time();
+	while (now - last_tick >= NS_TICK_LEN) {
+		last_tick += NS_TICK_LEN;
+		cur_timer->mark_offset();
+		do_timer_interrupt(irq, NULL, regs);
+		skipped++;
+	}
+	if (dyn_tick->state & (DYN_TICK_ENABLED | DYN_TICK_SKIPPING)) {
+		dbg_dyn_tick_irq();
+		dyn_tick->skip = 1;
+		if (cpu_has_local_apic())
+			reprogram_apic_timer(dyn_tick->skip);
+		reprogram_pit_tick(dyn_tick->skip);
+		dyn_tick->state |= DYN_TICK_ENABLED;
+		dyn_tick->state &= ~DYN_TICK_SKIPPING;
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return IRQ_HANDLED;
+}
+#endif
+
 /* not static: needed by APM */
 unsigned long get_cmos_time(void)
 {
@@ -396,6 +451,72 @@
 }
 #endif
 
+#ifdef CONFIG_NO_IDLE_HZ
+static struct dyn_tick_timer arch_ltt;
+
+#if defined(CONFIG_X86_UP_APIC) || defined(CONFIG_SMP)
+void disable_pit_tick(void)
+{
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x31, PIT_MODE);		/* BCD, mode 0, LSB/MSB, ch 0 */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+#endif
+
+/*
+ * Reprograms the next timer interrupt
+ * PIT timer reprogramming code taken from APM code.
+ * Note that PIT timer is a 16-bit timer, which allows a max
+ * skip of only about 55 ms (54 ticks at HZ=1000).
+ */
+void reprogram_pit_tick(int jiffies_to_skip)
+{
+	int skip;
+	extern spinlock_t i8253_lock;
+	unsigned long flags;
+
+	skip = jiffies_to_skip * LATCH;
+	if (skip > 0xffff) {
+		skip = 0xffff;
+	}      
+
+	spin_lock_irqsave(&i8253_lock, flags);
+	outb_p(0x34, PIT_MODE);		/* binary, mode 2, LSB/MSB, ch 0 */
+	outb_p(skip & 0xff, PIT_CH0);	/* LSB */
+	outb(skip >> 8, PIT_CH0);	/* MSB */
+	spin_unlock_irqrestore(&i8253_lock, flags);
+}
+
+static int __init dyn_tick_late_init(void)
+{
+	unsigned long flags;
+
+	if (!cur_timer->get_hw_time)
+		return -ENODEV;
+	write_seqlock_irqsave(&xtime_lock, flags);
+	last_tick = cur_timer->get_hw_time();
+	dyn_tick->skip = 1;
+	if (!cpu_has_local_apic())
+		dyn_tick->max_skip = 0xffff/LATCH;	/* PIT timer length */
+	printk(KERN_INFO "dyn-tick: Maximum ticks to skip limited to %i\n",
+	       dyn_tick->max_skip);
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	if (cur_timer->late_init)
+		cur_timer->late_init();
+	dyn_tick->interrupt = dyn_tick_timer_interrupt;
+	replace_timer_interrupt(dyn_tick->interrupt);
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	dyn_tick->state |= DYN_TICK_ENABLED;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+	return 0;
+}
+#endif
+
 void __init time_init(void)
 {
 #ifdef CONFIG_HPET_TIMER
@@ -415,6 +536,16 @@
 
 	cur_timer = select_timer();
 	printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name);
+
+#ifdef CONFIG_NO_IDLE_HZ
+	if (strncmp(cur_timer->name, "tsc", 3) == 0 ||
+	    strncmp(cur_timer->name, "pmtmr", 5) == 0) {
+		arch_ltt.init = dyn_tick_late_init;
+		dyn_tick_register(&arch_ltt);
+	} else
+		printk(KERN_INFO "dyn-tick: Cannot use timer %s\n",
+		       cur_timer->name);
+#endif
 
 	time_init_hook();
 }
diff -Nru a/arch/i386/kernel/timers/timer_pm.c b/arch/i386/kernel/timers/timer_pm.c
--- a/arch/i386/kernel/timers/timer_pm.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/kernel/timers/timer_pm.c	2005-02-05 19:46:47 -08:00
@@ -15,6 +15,7 @@
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/init.h>
+#include <linux/dyn-tick-timer.h>
 #include <asm/types.h>
 #include <asm/timer.h>
 #include <asm/smp.h>
@@ -168,6 +169,7 @@
 	monotonic_base += delta * NSEC_PER_USEC;
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* convert to ticks */
 	delta += offset_delay;
 	lost = delta / (USEC_PER_SEC / HZ);
@@ -184,6 +186,7 @@
 		first_run = 0;
 		offset_delay = 0;
 	}
+#endif
 }
 
 
@@ -238,6 +241,25 @@
 	return (unsigned long) offset_delay + cyc2us(delta);
 }
 
+static unsigned long long ns_time;
+
+static unsigned long long get_hw_time_pmtmr(void)
+{
+	u32 now, delta;
+	static unsigned int last_cycles;
+	now = read_pmtmr();
+	delta = (now - last_cycles) & ACPI_PM_MASK;
+	last_cycles = now;
+	ns_time += cyc2us(delta) * NSEC_PER_USEC;
+	return ns_time;
+}
+
+static void late_init_pmtmr(void)
+{
+	ns_time = monotonic_clock_pmtmr();
+}
+
+extern irqreturn_t pmtmr_interrupt(int irq, void *dev_id, struct pt_regs *regs);
 
 /* acpi timer_opts struct */
 static struct timer_opts timer_pmtmr = {
@@ -245,7 +267,9 @@
 	.mark_offset		= mark_offset_pmtmr,
 	.get_offset		= get_offset_pmtmr,
 	.monotonic_clock 	= monotonic_clock_pmtmr,
+	.get_hw_time		= get_hw_time_pmtmr,
 	.delay 			= delay_pmtmr,
+	.late_init		= late_init_pmtmr,
 };
 
 struct init_timer_opts __initdata timer_pmtmr_init = {
diff -Nru a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
--- a/arch/i386/kernel/timers/timer_tsc.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/kernel/timers/timer_tsc.c	2005-02-05 19:46:47 -08:00
@@ -112,6 +112,15 @@
 	return delay_at_last_interrupt + edx;
 }
 
+static unsigned long long get_hw_time_tsc(void)
+{
+	unsigned long long hw_time;
+
+	/* Read the full 64-bit TSC; the result must not be truncated to 32 bits */
+	rdtscll(hw_time);
+	return cycles_2_ns(hw_time);
+}
+
 static unsigned long long monotonic_clock_tsc(void)
 {
 	unsigned long long last_offset, this_offset, base;
@@ -348,6 +357,7 @@
 
 	rdtsc(last_tsc_low, last_tsc_high);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	spin_lock(&i8253_lock);
 	outb_p(0x00, PIT_MODE);     /* latch the count ASAP */
 
@@ -415,11 +425,14 @@
 			cpufreq_delayed_get();
 	} else
 		lost_count = 0;
+#endif
+
 	/* update the monotonic base value */
 	this_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low;
 	monotonic_base += cycles_2_ns(this_offset - last_offset);
 	write_sequnlock(&monotonic_lock);
 
+#ifndef CONFIG_NO_IDLE_HZ
 	/* calculate delay_at_last_interrupt */
 	count = ((LATCH-1) - count) * TICK_SIZE;
 	delay_at_last_interrupt = (count + LATCH/2) / LATCH;
@@ -430,6 +443,7 @@
 	 */
 	if (lost && abs(delay - delay_at_last_interrupt) > (900000/HZ))
 		jiffies_64++;
+#endif
 }
 
 static int __init init_tsc(char* override)
@@ -551,6 +565,7 @@
 	.mark_offset = mark_offset_tsc, 
 	.get_offset = get_offset_tsc,
 	.monotonic_clock = monotonic_clock_tsc,
+	.get_hw_time = get_hw_time_tsc,
 	.delay = delay_tsc,
 };
 
diff -Nru a/arch/i386/mach-default/setup.c b/arch/i386/mach-default/setup.c
--- a/arch/i386/mach-default/setup.c	2005-02-05 19:46:47 -08:00
+++ b/arch/i386/mach-default/setup.c	2005-02-05 19:46:47 -08:00
@@ -85,6 +85,22 @@
 	setup_irq(0, &irq0);
 }
 
+/**
+ * replace_timer_interrupt - allow replacing timer interrupt handler
+ *
+ * Description:
+ *	Can be used to replace timer interrupt handler with a more optimized
+ *	handler. Used for enabling and disabling of CONFIG_NO_IDLE_HZ.
+ */
+void replace_timer_interrupt(void * new_handler)
+{
+	unsigned long flags;
+
+	write_seqlock_irqsave(&xtime_lock, flags);
+	irq0.handler = new_handler;
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+}
+
 #ifdef CONFIG_MCA
 /**
  * mca_nmi_hook - hook into MCA specific NMI chain
diff -Nru a/include/asm-i386/timer.h b/include/asm-i386/timer.h
--- a/include/asm-i386/timer.h	2005-02-05 19:46:47 -08:00
+++ b/include/asm-i386/timer.h	2005-02-05 19:46:47 -08:00
@@ -1,6 +1,7 @@
 #ifndef _ASMi386_TIMER_H
 #define _ASMi386_TIMER_H
 #include <linux/init.h>
+#include <linux/interrupt.h>
 
 /**
  * struct timer_ops - used to define a timer source
@@ -21,7 +22,9 @@
 	void (*mark_offset)(void);
 	unsigned long (*get_offset)(void);
 	unsigned long long (*monotonic_clock)(void);
+	unsigned long long (*get_hw_time)(void);
 	void (*delay)(unsigned long);
+	void (*late_init)(void);
 };
 
 struct init_timer_opts {
diff -Nru a/include/linux/dyn-tick-timer.h b/include/linux/dyn-tick-timer.h
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/include/linux/dyn-tick-timer.h	2005-02-05 19:46:47 -08:00
@@ -0,0 +1,71 @@
+/*
+ * linux/include/linux/dyn-tick-timer.h
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/interrupt.h>
+
+#define DYN_TICK_USE_APIC	(1 << 2)
+#define DYN_TICK_SKIPPING	(1 << 1)
+#define DYN_TICK_ENABLED	(1 << 0)
+
+struct dyn_tick_state {
+	unsigned int state;		/* Current state */
+	int skip_cpu;			/* Skip handling processor */
+	unsigned long skip;		/* Ticks to skip */
+	unsigned int max_skip;		/* Max number of ticks to skip */
+	unsigned long irq_skip_mask;	/* Do not update time from these irqs */
+	irqreturn_t (*interrupt)(int, void *, struct pt_regs *);
+};
+
+/* REVISIT: Add functions to enable/disable dyn-tick on the fly */
+struct dyn_tick_timer {
+	int (*init) (void);
+};
+
+extern struct dyn_tick_state * dyn_tick;
+extern void dyn_tick_register(struct dyn_tick_timer * new_timer);
+
+#define NS_TICK_LEN		((1 * 1000000000)/HZ)
+#define DYN_TICK_MIN_SKIP	2
+
+#if defined(CONFIG_SMP) || defined(CONFIG_X86_UP_APIC)
+#define cpu_has_local_apic()	(dyn_tick->state & DYN_TICK_USE_APIC)
+#else
+#define cpu_has_local_apic()	0
+#endif
+
+#ifdef CONFIG_NO_IDLE_HZ
+
+#if defined(CONFIG_X86) || defined(CONFIG_IA64) || defined(CONFIG_X86_64)
+#define arch_has_safe_halt()	1
+#endif
+
+#else
+
+#define arch_has_safe_halt()	0
+
+#endif
diff -Nru a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile	2005-02-05 19:46:47 -08:00
+++ b/kernel/Makefile	2005-02-05 19:46:47 -08:00
@@ -26,6 +26,7 @@
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
+obj-$(CONFIG_NO_IDLE_HZ) += dyn-tick-timer.o
 
 ifneq ($(CONFIG_IA64),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
diff -Nru a/kernel/dyn-tick-timer.c b/kernel/dyn-tick-timer.c
--- /dev/null	Wed Dec 31 16:00:00 196900
+++ b/kernel/dyn-tick-timer.c	2005-02-05 19:46:47 -08:00
@@ -0,0 +1,149 @@
+/*
+ * linux/kernel/dyn-tick-timer.c
+ *
+ * Beginnings of generic dynamic tick timer support
+ *
+ * Copyright (C) 2004 Nokia Corporation
+ * Written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
+ * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ *
+ * TODO:
+ * - Add functions for enabling/disabling dyn-tick on the fly
+ * - Generalize to work with ARM sys_timer
+ */
+
+#include <linux/version.h>
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/cpumask.h>
+#include <linux/pm.h>
+#include <linux/dyn-tick-timer.h>
+#include <asm/io.h>
+
+#include "io_ports.h"
+
+#define VERSION	050205-1
+
+struct dyn_tick_state dyn_tick_state;
+struct dyn_tick_state * dyn_tick = &dyn_tick_state;
+struct dyn_tick_timer dyn_tick_timer;
+struct dyn_tick_timer * dyn_tick_cfg = &dyn_tick_timer;
+static void (*orig_idle) (void) = 0;
+extern void disable_pit_tick(void);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+extern void reprogram_apic_timer(unsigned int count);
+extern void reprogram_pit_tick(int jiffies_to_skip);
+static cpumask_t dyn_cpu_map;
+
+/*
+ * We want to have all processors idle before reprogramming the next
+ * timer interrupt. Note that we must maintain the state for dynamic tick,
+ * otherwise the idle loop could be reprogramming the timer continuously
+ * further into the future, and the timer interrupt would never happen.
+ */
+static void dyn_tick_idle(void)
+{
+	int cpu;
+	unsigned long flags;
+	cpumask_t idle_cpus;
+	unsigned long next;
+
+	if (!(dyn_tick->state & DYN_TICK_ENABLED))
+		goto out;
+
+	/* Check if we are already skipping ticks and can idle other cpus */
+	if (dyn_tick->state & DYN_TICK_SKIPPING) {
+		reprogram_apic_timer(dyn_tick->skip);
+		goto out;
+	}
+
+	/* Check if we can start skipping ticks */
+	write_seqlock_irqsave(&xtime_lock, flags);
+	cpu = smp_processor_id();
+	cpu_set(cpu, dyn_cpu_map);
+	cpus_and(idle_cpus, dyn_cpu_map, cpu_online_map);
+	if (cpus_equal(idle_cpus, cpu_online_map)) {
+		next = next_timer_interrupt();
+		if (jiffies > next) {
+			//printk("Too late? next: %lu jiffies: %lu\n",
+			//       next, jiffies);
+			dyn_tick->skip = 1;
+		} else
+			dyn_tick->skip = next - jiffies;
+		if (dyn_tick->skip > DYN_TICK_MIN_SKIP) {
+			if (dyn_tick->skip > dyn_tick->max_skip)
+				dyn_tick->skip = dyn_tick->max_skip;
+			if (cpu_has_local_apic()) {
+				disable_pit_tick();
+				reprogram_apic_timer(dyn_tick->skip);
+			} else
+				reprogram_pit_tick(dyn_tick->skip);
+			dyn_tick->skip_cpu = cpu;
+			dyn_tick->state |= DYN_TICK_SKIPPING;
+		}
+		cpus_clear(dyn_cpu_map);
+	}
+	write_sequnlock_irqrestore(&xtime_lock, flags);
+
+out:
+	if (orig_idle)
+		orig_idle();
+	else if (arch_has_safe_halt())
+		safe_halt();
+}
+
+void __init dyn_tick_register(struct dyn_tick_timer * new_timer)
+{
+	dyn_tick_cfg->init = new_timer->init;
+	printk(KERN_INFO "dyn-tick: Registering dynamic tick timer\n");
+}
+
+/*
+ * We need to initialize dynamic tick after calibrate delay
+ */
+static int __init dyn_tick_init(void)
+{
+	int ret = 0;
+
+	if (dyn_tick_cfg->init == NULL)
+		return -ENODEV;
+
+	ret = dyn_tick_cfg->init();
+	if (ret != 0) {
+		printk(KERN_WARNING "dyn-tick: Init failed\n");
+		return -ENODEV;
+	}
+	orig_idle = pm_idle;
+	pm_idle = dyn_tick_idle;
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2,6,10))
+	cpu_idle_wait();
+#endif
+	printk(KERN_INFO "dyn-tick: Timer using dynamic tick\n");
+
+	return ret;
+}
+late_initcall(dyn_tick_init);
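
The two "maximum ticks to skip" limits in the patch follow from the
width of the hardware counters: 16 bits for the PIT and 24 bits for
the local APIC timer. A small user-space sketch of that arithmetic,
assuming HZ=1000 and the usual i386 PIT input clock of 1193182 Hz
(CLOCK_TICK_RATE). The function names are made up for illustration,
and the APIC value 6264 used below is an assumed per-tick count chosen
only because it reproduces the 2678 limit in the boot log earlier in
the thread; it is not a measured value.

```c
#include <assert.h>

/* Assumed i8253 input clock in Hz (CLOCK_TICK_RATE on i386). */
#define PIT_TICK_RATE	1193182UL

/* PIT counts per jiffy, rounded to nearest, like the kernel's LATCH. */
static unsigned long pit_latch(unsigned long hz)
{
	assert(hz > 0);
	return (PIT_TICK_RATE + hz / 2) / hz;
}

/* The PIT counter is 16 bits wide, as in dyn_tick_late_init(). */
static unsigned long pit_max_skip(unsigned long hz)
{
	return 0xffffUL / pit_latch(hz);
}

/* The local APIC timer is 24 bits wide, as in __setup_APIC_LVTT(). */
static unsigned long apic_max_skip(unsigned long apic_timer_val)
{
	assert(apic_timer_val > 0);
	return 0xffffffUL / apic_timer_val;
}
```

Under these assumptions pit_max_skip(1000) is 54, so the PIT alone can
sleep for at most about 54 ms, while apic_max_skip(6264) is 2678.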

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  2:33                     ` Tony Lindgren
  2005-02-06  3:54                       ` Tony Lindgren
@ 2005-02-06  8:11                       ` Pavel Machek
  2005-02-06  8:53                         ` Lee Revell
  2005-02-06 17:10                         ` Tony Lindgren
  1 sibling, 2 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-06  8:11 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > Currently the suggested combo is local APIC + ACPI PM timer...
> > 
> > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > is used, I get same behaviour  as before ("sleepy machine"). With
> > "notsc", machine seems to work okay, but I still get 1000 timer
> > interrupts a second.
> 
> Sounds like dyn-tick did not get enabled then, maybe you don't have
> CONFIG_X86_PM_TIMER, or don't have ACPI PM timer on your board?

I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
have such a piece of hardware:

pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
PCI: Setting latency timer of device 0000:00:11.5 to 64
pavel@amd:/usr/src/linux-mm$ 

[Strange, I should see some messages about apic, no?]
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  3:54                       ` Tony Lindgren
@ 2005-02-06  8:41                         ` Pavel Machek
  2005-02-06  8:50                         ` Pavel Machek
  2005-02-06 12:15                         ` Pavel Machek
  2 siblings, 0 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-06  8:41 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > > is used, I get same behaviour  as before ("sleepy machine"). With
> > > "notsc", machine seems to work okay, but I still get 1000 timer
> > > interrupts a second.
> 
> ...
> 
> > 
> > Sounds like your system is not running with the dyn-tick... I'll try
> > to fix that TSC bug.
> 
> The following patch fixes TSC timer with dyn-tick, and local APIC
> timer on UP system with CONFIG_SMP.

Tried that and got the same "sleepy machine" behaviour. But now I
realized that I have PREEMPT and CPUFREQ enabled; I'll try disabling them.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  3:54                       ` Tony Lindgren
  2005-02-06  8:41                         ` Pavel Machek
@ 2005-02-06  8:50                         ` Pavel Machek
  2005-02-06 17:07                           ` Tony Lindgren
  2005-02-06 12:15                         ` Pavel Machek
  2 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-06  8:50 UTC (permalink / raw)
  To: Tony Lindgren; +Cc: linux-kernel

Hi!

> > > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > > is used, I get same behaviour  as before ("sleepy machine"). With
> > > "notsc", machine seems to work okay, but I still get 1000 timer
> > > interrupts a second.
> > 
> > Sounds like your system is not running with the dyn-tick... I'll try
> > to fix that TSC bug.
> 
> The following patch fixes TSC timer with dyn-tick, and local APIC
> timer on UP system with CONFIG_SMP.

I disabled CPUFREQ & PREEMPT, but still get "sleepy machine"
behaviour. Could you perhaps send me your .config?

							Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  8:11                       ` Pavel Machek
@ 2005-02-06  8:53                         ` Lee Revell
  2005-02-06 10:25                           ` Pavel Machek
  2005-02-06 17:10                         ` Tony Lindgren
  1 sibling, 1 reply; 38+ messages in thread
From: Lee Revell @ 2005-02-06  8:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Tony Lindgren, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, linux-kernel

On Sun, 2005-02-06 at 09:11 +0100, Pavel Machek wrote:
> I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
> have such a piece of hardware:
> 
> pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
> PCI: Setting latency timer of device 0000:00:11.5 to 64
> pavel@amd:/usr/src/linux-mm$ 

If you are sure that machine supports ACPI, maybe this is your problem
(from the POSIX high res timer patch):

          If you enable the ACPI pm timer and it cannot be found, it is
          possible that your BIOS is not producing the ACPI table or
          that your machine does not support ACPI.  In the former case,
          see "Default ACPI pm timer address".  If the timer is not
          found the boot will fail when trying to calibrate the 'delay'
          loop.

[...]


config HIGH_RES_TIMER_ACPI_PM_ADD
        int "Default ACPI pm timer address"
        depends on HIGH_RES_TIMER_ACPI_PM
        default 0
        help
          This option is available for use on systems where the BIOS
          does not generate the ACPI tables if ACPI is not enabled.  For
          example some BIOSes will not generate the ACPI tables if APM
          is enabled.  The ACPI pm timer is still available but cannot
          be found by the software.  This option allows you to supply
          the needed address.  When the high resolution timers code
          finds a valid ACPI pm timer address it reports it in the boot
          messages log (look for lines that begin with
          "High-res-timers:").  You can turn on the ACPI support in the
          BIOS, boot the system and find this value.  You can then enter
          it at configure time.  Both the report and the entry are in
          decimal.

HTH,

Lee


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  8:53                         ` Lee Revell
@ 2005-02-06 10:25                           ` Pavel Machek
  2005-02-07 22:08                             ` George Anzinger
  0 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-06 10:25 UTC (permalink / raw)
  To: Lee Revell
  Cc: Tony Lindgren, Benjamin Herrenschmidt, Arjan van de Ven,
	Martin Schwidefsky, Andrea Arcangeli, George Anzinger,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, linux-kernel

Hi!

> > I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
> > have such a piece of hardware:
> > 
> > pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
> > PCI: Setting latency timer of device 0000:00:11.5 to 64
> > pavel@amd:/usr/src/linux-mm$ 
> 
> If you are sure that machine supports ACPI, maybe this is your problem
> (from the POSIX high res timer patch):
> 
>           If you enable the ACPI pm timer and it cannot be found, it is
>           possible that your BIOS is not producing the ACPI table or
>           that your machine does not support ACPI.  In the former case,
>           see "Default ACPI pm timer address".  If the timer is not
>           found the boot will fail when trying to calibrate the 'delay'
>           loop.

Well, but how do I get the address? I'll try looking at BIOS
options...
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  3:54                       ` Tony Lindgren
  2005-02-06  8:41                         ` Pavel Machek
  2005-02-06  8:50                         ` Pavel Machek
@ 2005-02-06 12:15                         ` Pavel Machek
  2005-02-06 17:08                           ` Tony Lindgren
  2 siblings, 1 reply; 38+ messages in thread
From: Pavel Machek @ 2005-02-06 12:15 UTC (permalink / raw)
  To: Tony Lindgren, kernel list

Hi!

> +extern void disable_pit_tick(void);
> +extern void reprogram_pit_tick(int jiffies_to_skip);
> +extern void reprogram_apic_timer(unsigned int count);
> +extern void reprogram_pit_tick(int jiffies_to_skip);

reprogram_pit_tick is here twice; but perhaps this should be moved to
some kind of header file.
									Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  8:50                         ` Pavel Machek
@ 2005-02-06 17:07                           ` Tony Lindgren
  0 siblings, 0 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-06 17:07 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

* Pavel Machek <pavel@ucw.cz> [050206 00:50]:
> Hi!
> 
> > > > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > > > is used, I get same behaviour  as before ("sleepy machine"). With
> > > > "notsc", machine seems to work okay, but I still get 1000 timer
> > > > interrupts a second.
> > > 
> > > Sounds like your system is not running with the dyn-tick... I'll try
> > > to fix that TSC bug.
> > 
> > The following patch fixes TSC timer with dyn-tick, and local APIC
> > timer on UP system with CONFIG_SMP.
> 
> I disabled CPUFREQ & PREEMPT, but still get "sleepy machine"
> behaviour. Could you perhaps send me your .config?

I don't have CPUFREQ on but I do have PREEMPT. Attached is my
generic config that I'm using on my Celeron box and my dual Athlon
box for testing with NFS root. I don't think your problems are caused
by the Kconfig options any longer though.

Tony

[-- Attachment #2: config.gz --]
[-- Type: application/x-gunzip, Size: 7112 bytes --]


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06 12:15                         ` Pavel Machek
@ 2005-02-06 17:08                           ` Tony Lindgren
  0 siblings, 0 replies; 38+ messages in thread
From: Tony Lindgren @ 2005-02-06 17:08 UTC (permalink / raw)
  To: Pavel Machek; +Cc: kernel list

* Pavel Machek <pavel@ucw.cz> [050206 04:15]:
> Hi!
> 
> > +extern void disable_pit_tick(void);
> > +extern void reprogram_pit_tick(int jiffies_to_skip);
> > +extern void reprogram_apic_timer(unsigned int count);
> > +extern void reprogram_pit_tick(int jiffies_to_skip);
> 
> reprogram_pit_tick is here twice; but perhaps this should be moved to
> some kind of header file.

Yeah, and the function itself should be in timer_pit.c.

Tony
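
As a rough sketch of the header-file suggestion, the three prototypes from
the quoted hunk could be declared once in a shared header. The file name
dyn-tick.h and the include guard DYN_TICK_H are assumptions made for
illustration, not names taken from the patch; the prototypes themselves are
copied from the hunk with the duplicate dropped.

```c
/* dyn-tick.h: hypothetical file name, chosen only for this sketch. */
#ifndef DYN_TICK_H
#define DYN_TICK_H

/*
 * PIT helpers. Per the thread, the definition of reprogram_pit_tick()
 * would move to timer_pit.c; only the declarations live here.
 */
extern void disable_pit_tick(void);
extern void reprogram_pit_tick(int jiffies_to_skip);

/* Local APIC timer helper, declared exactly as in the quoted hunk. */
extern void reprogram_apic_timer(unsigned int count);

#endif /* DYN_TICK_H */
```

Each file that calls these functions would then include the header instead
of repeating the extern lines, so a duplicated prototype like the one
spotted above cannot creep back in.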


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06  8:11                       ` Pavel Machek
  2005-02-06  8:53                         ` Lee Revell
@ 2005-02-06 17:10                         ` Tony Lindgren
  2005-02-06 18:34                           ` Pavel Machek
  1 sibling, 1 reply; 38+ messages in thread
From: Tony Lindgren @ 2005-02-06 17:10 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

* Pavel Machek <pavel@ucw.cz> [050206 00:20]:
> Hi!
> 
> > > > Currently the suggested combo is local APIC + ACPI PM timer...
> > > 
> > > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > > is used, I get same behaviour  as before ("sleepy machine"). With
> > > "notsc", machine seems to work okay, but I still get 1000 timer
> > > interrupts a second.
> > 
> > Sounds like dyn-tick did not get enabled then, maybe you don't have
> > CONFIG_X86_PM_TIMER, or don't have ACPI PM timer on your board?
> 
> I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
> have such piece of hardware:
> 
> pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
> PCI: Setting latency timer of device 0000:00:11.5 to 64
> pavel@amd:/usr/src/linux-mm$ 
> 
> [Strange, I should see some messages about apic, no?]

Yeah, looks like you don't have a local APIC then? Let me test the
patch here with just PIT timer only.

It also looks like you don't have TSC either? Or do you still have
notsc cmdline option?

Tony


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06 17:10                         ` Tony Lindgren
@ 2005-02-06 18:34                           ` Pavel Machek
  0 siblings, 0 replies; 38+ messages in thread
From: Pavel Machek @ 2005-02-06 18:34 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Benjamin Herrenschmidt, Arjan van de Ven, Martin Schwidefsky,
	Andrea Arcangeli, George Anzinger, Thomas Gleixner, john stultz,
	Zwane Mwaikambo, Lee Revell, linux-kernel

Hi!

> > > > Ok, works slightly better: time no longer runs 2x too fast. When TSC
> > > > is used, I get same behaviour  as before ("sleepy machine"). With
> > > > "notsc", machine seems to work okay, but I still get 1000 timer
> > > > interrupts a second.
> > > 
> > > Sounds like dyn-tick did not get enabled then, maybe you don't have
> > > CONFIG_X86_PM_TIMER, or don't have ACPI PM timer on your board?
> > 
> > I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
> > have such piece of hardware:
> > 
> > pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
> > PCI: Setting latency timer of device 0000:00:11.5 to 64
> > pavel@amd:/usr/src/linux-mm$ 
> > 
> > [Strange, I should see some messages about apic, no?]
> 
> Yeah, looks like you don't have a local APIC then? Let me test the
> patch here with just PIT timer only.
> 
> It also looks like you don't have TSC either? Or do you still have
> notsc cmdline option?

It definitely does have TSC, it is athlon64. It was probably disabled
in this particular run.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: [PATCH] Dynamic tick, version 050127-1
  2005-02-06 10:25                           ` Pavel Machek
@ 2005-02-07 22:08                             ` George Anzinger
  0 siblings, 0 replies; 38+ messages in thread
From: George Anzinger @ 2005-02-07 22:08 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Lee Revell, Tony Lindgren, Benjamin Herrenschmidt,
	Arjan van de Ven, Martin Schwidefsky, Andrea Arcangeli,
	Thomas Gleixner, john stultz, Zwane Mwaikambo, linux-kernel

Pavel Machek wrote:
> Hi!
> 
> 
>>>I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
>>>have such piece of hardware:
>>>
>>>pavel@amd:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
>>>PCI: Setting latency timer of device 0000:00:11.5 to 64
>>>pavel@amd:/usr/src/linux-mm$ 
>>
>>If you are sure that machine supports ACPI, maybe this is your problem
>>(from the POSIX high res timer patch):
>>
>>          If you enable the ACPI pm timer and it cannot be found, it is
>>          possible that your BIOS is not producing the ACPI table or
>>          that your machine does not support ACPI.  In the former case,
>>          see "Default ACPI pm timer address".  If the timer is not
>>          found the boot will fail when trying to calibrate the 'delay'
>>          loop.
> 
> 
> Well, but how do I get the address? I'll try looking at BIOS
> options...
> 								Pavel
On my machine, turning off the PM code in the BIOS (or possibly turning on
ACPI, again in the BIOS) made it produce the address.  Booting then puts
that address in the dmesg output.  You can then change the BIOS back to
what it was and use the address found in the dmesg output.
-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/



Thread overview: 38+ messages
2005-01-27 21:29 [PATCH] Dynamic tick, version 050127-1 Tony Lindgren
2005-01-27 21:50 ` Tony Lindgren
2005-02-01 11:00 ` Pavel Machek
2005-02-01 20:40   ` Tony Lindgren
2005-02-01 21:25     ` Pavel Machek
2005-02-01 23:03       ` Tony Lindgren
2005-02-02 13:50         ` Pavel Machek
2005-02-02 13:50         ` Pavel Machek
2005-02-02 13:56         ` Pavel Machek
2005-02-02 14:11         ` Pavel Machek
2005-02-03  3:04           ` Tony Lindgren
2005-02-03 10:56             ` Pavel Machek
2005-02-03 16:43               ` Tony Lindgren
2005-02-04  5:19                 ` Tony Lindgren
2005-02-04  6:33                   ` Zwane Mwaikambo
2005-02-04 17:18                     ` Tony Lindgren
2005-02-04 17:31                       ` Zwane Mwaikambo
2005-02-04 17:42                         ` Tony Lindgren
2005-02-04 17:54                           ` Zwane Mwaikambo
2005-02-04 18:58                             ` Tony Lindgren
2005-02-04 19:24                               ` Tony Lindgren
2005-02-05 23:00                   ` Pavel Machek
2005-02-06  2:33                     ` Tony Lindgren
2005-02-06  3:54                       ` Tony Lindgren
2005-02-06  8:41                         ` Pavel Machek
2005-02-06  8:50                         ` Pavel Machek
2005-02-06 17:07                           ` Tony Lindgren
2005-02-06 12:15                         ` Pavel Machek
2005-02-06 17:08                           ` Tony Lindgren
2005-02-06  8:11                       ` Pavel Machek
2005-02-06  8:53                         ` Lee Revell
2005-02-06 10:25                           ` Pavel Machek
2005-02-07 22:08                             ` George Anzinger
2005-02-06 17:10                         ` Tony Lindgren
2005-02-06 18:34                           ` Pavel Machek
2005-02-01 20:20 ` Lee Revell
2005-02-01 23:42   ` Tony Lindgren
2005-02-02  1:06   ` Eric St-Laurent
