All of lore.kernel.org
 help / color / mirror / Atom feed
* [2.6.37-rc5] Timer or ICE1724 issues, HZ=250, NO_HZ=y
@ 2010-12-13 10:54 Simon Kirby
  2010-12-13 11:19 ` Markus Trippelsdorf
  0 siblings, 1 reply; 6+ messages in thread
From: Simon Kirby @ 2010-12-13 10:54 UTC (permalink / raw)
  To: linux-kernel

With Linus git HEAD just compiled earlier today, I noticed a weird issue
just now while in the middle of some coding.  I think this may have
happened before (maybe in 2.6.36), but not as obvious as this time.

The music that was playing stopped, and the key I had just pressed did
not show up in rxvt.  I figured the machine had hung, but I hit another
key and the music played again for another second and my _previous_
keystroke output appeared.

Looking at /proc/interrupts, "timer" had completely stopped.  "perf top"
showed "snd_pcm_hwsync" was at the top of the profile with about 80%
usage.  Spinning in "snd_pcm_stream_lock_irq" from other CPUs, probably..

dmesg showed:

[13995.498393] Clocksource tsc unstable (delta = -8589909196 ns)
[13995.511452] Switching to clocksource hpet
[14039.525074] NOHZ: local_softirq_pending 100
[14039.882183] hrtimer: interrupt took 30800 ns
[14134.540724] NOHZ: local_softirq_pending 100

I tried a few things:

# cd /sys/devices/system/clocksource/clocksource0
# grep . *
available_clocksource:hpet acpi_pm 
current_clocksource:hpet
# echo acpi_pm > current_clocksource

[14249.256629] Switching to clocksource acpi_pm

Still problems, so I tried again:

# echo tsc > current_clocksource

[14258.808333] Override clocksource tsc is not HRT compatible. Cannot switch while in HRT/NOHZ mode
[14258.808343] Switching to clocksource hpet

...but after this, things seemed to return to normal.  /proc/interrupts
shows "timer" increasing once again.

Full boot dmesg: http://0x.ca/sim/ref/2.6.37/oof.dmesg
.config: http://0x.ca/sim/ref/2.6.37/oof.config

Simon-

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [2.6.37-rc5] Timer or ICE1724 issues, HZ=250, NO_HZ=y
  2010-12-13 10:54 [2.6.37-rc5] Timer or ICE1724 issues, HZ=250, NO_HZ=y Simon Kirby
@ 2010-12-13 11:19 ` Markus Trippelsdorf
  2010-12-13 11:24   ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Markus Trippelsdorf @ 2010-12-13 11:19 UTC (permalink / raw)
  To: Simon Kirby
  Cc: linux-kernel, john stultz, Thomas Gleixner, Borislav Petkov,
	Linus Torvalds

On 2010.12.13 at 02:54 -0800, Simon Kirby wrote:
> With Linus git HEAD just compiled earlier today, I noticed a weird issue
> just now while in the middle of some coding.  I think this may have
> happened before (maybe in 2.6.36), but not as obvious as this time.
> 
> The music that was playing stopped, and the key I had just pressed did
> not show up in rxvt.  I figured the machine had hung, but I hit another
> key and the music played again for another second and my _previous_
> keystroke output appeared.
> 
> Looking at /proc/interrupts, "timer" had completely stopped.  "perf top"
> showed "snd_pcm_hwsync" was at the top of the profile with about 80%
> usage.  Spinning in "snd_pcm_stream_lock_irq" from other CPUs, probably..
> 
> dmesg showed:
> 
> [13995.498393] Clocksource tsc unstable (delta = -8589909196 ns)
> [13995.511452] Switching to clocksource hpet
> [14039.525074] NOHZ: local_softirq_pending 100
> [14039.882183] hrtimer: interrupt took 30800 ns
> [14134.540724] NOHZ: local_softirq_pending 100

Welcome to the club. This is a known problem on newer AMD systems see:
http://thread.gmane.org/gmane.linux.kernel/1054283
and the following patch:
http://thread.gmane.org/gmane.linux.kernel/1069458

But the real problem is that no patch found its way to Linus yet and
2.6.37 will be released in a few days. 

I think Thomas' solution (with a hardcoded hpet_min_tick):
     return res < 128 ? -ETIME : 0;
is the only sane option left this late in the series.
-- 
Markus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [2.6.37-rc5] Timer or ICE1724 issues, HZ=250, NO_HZ=y
  2010-12-13 11:19 ` Markus Trippelsdorf
@ 2010-12-13 11:24   ` Thomas Gleixner
  2010-12-13 12:46     ` [tip:x86/urgent] x86: HPET: Chose a paranoid safe value for the ETIME check tip-bot for Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2010-12-13 11:24 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Simon Kirby, linux-kernel, john stultz, Borislav Petkov, Linus Torvalds

On Mon, 13 Dec 2010, Markus Trippelsdorf wrote:
> On 2010.12.13 at 02:54 -0800, Simon Kirby wrote:
> > With Linus git HEAD just compiled earlier today, I noticed a weird issue
> > just now while in the middle of some coding.  I think this may have
> > happened before (maybe in 2.6.36), but not as obvious as this time.
> > 
> > The music that was playing stopped, and the key I had just pressed did
> > not show up in rxvt.  I figured the machine had hung, but I hit another
> > key and the music played again for another second and my _previous_
> > keystroke output appeared.
> > 
> > Looking at /proc/interrupts, "timer" had completely stopped.  "perf top"
> > showed "snd_pcm_hwsync" was at the top of the profile with about 80%
> > usage.  Spinning in "snd_pcm_stream_lock_irq" from other CPUs, probably..
> > 
> > dmesg showed:
> > 
> > [13995.498393] Clocksource tsc unstable (delta = -8589909196 ns)
> > [13995.511452] Switching to clocksource hpet
> > [14039.525074] NOHZ: local_softirq_pending 100
> > [14039.882183] hrtimer: interrupt took 30800 ns
> > [14134.540724] NOHZ: local_softirq_pending 100
> 
> Welcome to the club. This is a known problem on newer AMD systems see:
> http://thread.gmane.org/gmane.linux.kernel/1054283
> and the following patch:
> http://thread.gmane.org/gmane.linux.kernel/1069458
> 
> But the real problem is that no patch found its way to Linus yet and
> 2.6.37 will be released in a few days. 
> 
> I think Thomas' solution (with a hardcoded hpet_min_tick):
>      return res < 128 ? -ETIME : 0;
> is the only sane option left this late in the series.

Yeah. I push that out to Linus. At some point we might hopefully get
some reliable information from the HW folks.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip:x86/urgent] x86: HPET: Chose a paranoid safe value for the ETIME check
  2010-12-13 11:24   ` Thomas Gleixner
@ 2010-12-13 12:46     ` tip-bot for Thomas Gleixner
  2011-07-28 15:31       ` [stable-longterm] " Konstantin Khlebnikov
  0 siblings, 1 reply; 6+ messages in thread
From: tip-bot for Thomas Gleixner @ 2010-12-13 12:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, Andreas.Herrmann3, johnstul, bp,
	markus, tglx, sim

Commit-ID:  f1c18071ad70e2a78ab31fc26a18fcfa954a05c6
Gitweb:     http://git.kernel.org/tip/f1c18071ad70e2a78ab31fc26a18fcfa954a05c6
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 13 Dec 2010 12:43:23 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 13 Dec 2010 13:42:44 +0100

x86: HPET: Chose a paranoid safe value for the ETIME check

commit 995bd3bb5 (x86: Hpet: Avoid the comparator readback penalty)
chose 8 HPET cycles as a safe value for the ETIME check, as we had the
confirmation that the posted write to the comparator register is
delayed by two HPET clock cycles on Intel chipsets which showed
readback problems.

After that patch hit mainline we got reports from machines with newer
AMD chipsets which seem to have an even longer delay. See
http://thread.gmane.org/gmane.linux.kernel/1054283 and
http://thread.gmane.org/gmane.linux.kernel/1069458 for further
information.

Boris tried to come up with an ACPI based selection of the minimum
HPET cycles, but this failed on a couple of test machines. And of
course we did not get any useful information from the hardware folks.

For now our only option is to chose a paranoid high and safe value for
the minimum HPET cycles used by the ETIME check. Adjust the minimum ns
value for the HPET clockevent accordingly.

Reported-Bistected-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.1012131222420.2653@localhost6.localdomain6>
Cc: Simon Kirby <sim@hostway.ca>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com>
Cc: John Stultz <johnstul@us.ibm.com>
---
 arch/x86/kernel/hpet.c |   26 ++++++++++++++++----------
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ae03cab..4ff5968 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -27,6 +27,9 @@
 #define HPET_DEV_FSB_CAP		0x1000
 #define HPET_DEV_PERI_CAP		0x2000
 
+#define HPET_MIN_CYCLES			128
+#define HPET_MIN_PROG_DELTA		(HPET_MIN_CYCLES + (HPET_MIN_CYCLES >> 1))
+
 #define EVT_TO_HPET_DEV(evt) container_of(evt, struct hpet_dev, evt)
 
 /*
@@ -299,8 +302,9 @@ static void hpet_legacy_clockevent_register(void)
 	/* Calculate the min / max delta */
 	hpet_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFFFF,
 							   &hpet_clockevent);
-	/* 5 usec minimum reprogramming delta. */
-	hpet_clockevent.min_delta_ns = 5000;
+	/* Setup minimum reprogramming delta. */
+	hpet_clockevent.min_delta_ns = clockevent_delta2ns(HPET_MIN_PROG_DELTA,
+							   &hpet_clockevent);
 
 	/*
 	 * Start hpet with the boot cpu mask and make it
@@ -393,22 +397,24 @@ static int hpet_next_event(unsigned long delta,
 	 * the wraparound into account) nor a simple count down event
 	 * mode. Further the write to the comparator register is
 	 * delayed internally up to two HPET clock cycles in certain
-	 * chipsets (ATI, ICH9,10). We worked around that by reading
-	 * back the compare register, but that required another
-	 * workaround for ICH9,10 chips where the first readout after
-	 * write can return the old stale value. We already have a
-	 * minimum delta of 5us enforced, but a NMI or SMI hitting
+	 * chipsets (ATI, ICH9,10). Some newer AMD chipsets have even
+	 * longer delays. We worked around that by reading back the
+	 * compare register, but that required another workaround for
+	 * ICH9,10 chips where the first readout after write can
+	 * return the old stale value. We already had a minimum
+	 * programming delta of 5us enforced, but a NMI or SMI hitting
 	 * between the counter readout and the comparator write can
 	 * move us behind that point easily. Now instead of reading
 	 * the compare register back several times, we make the ETIME
 	 * decision based on the following: Return ETIME if the
-	 * counter value after the write is less than 8 HPET cycles
+	 * counter value after the write is less than HPET_MIN_CYCLES
 	 * away from the event or if the counter is already ahead of
-	 * the event.
+	 * the event. The minimum programming delta for the generic
+	 * clockevents code is set to 1.5 * HPET_MIN_CYCLES.
 	 */
 	res = (s32)(cnt - hpet_readl(HPET_COUNTER));
 
-	return res < 8 ? -ETIME : 0;
+	return res < HPET_MIN_CYCLES ? -ETIME : 0;
 }
 
 static void hpet_legacy_set_mode(enum clock_event_mode mode,

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [stable-longterm] x86: HPET: Chose a paranoid safe value for the ETIME check
  2010-12-13 12:46     ` [tip:x86/urgent] x86: HPET: Chose a paranoid safe value for the ETIME check tip-bot for Thomas Gleixner
@ 2011-07-28 15:31       ` Konstantin Khlebnikov
  2011-07-28 15:52         ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Konstantin Khlebnikov @ 2011-07-28 15:31 UTC (permalink / raw)
  To: linux-kernel, Thomas Gleixner, Greg KH

I think this v2.6.37-rc5-64-gf1c1807 and v2.6.36-rc4-167-g995bd3b should be merged into longterm/stable kernels.
We also found the same bug in RHEL6 kernel. (nohz kernel stuck and wait for HPET wraparound)
This HPET-bug really annoying, while fix is tiny.

tip-bot for Thomas Gleixner wrote:
> Commit-ID:  f1c18071ad70e2a78ab31fc26a18fcfa954a05c6
> Gitweb:     http://git.kernel.org/tip/f1c18071ad70e2a78ab31fc26a18fcfa954a05c6
> Author:     Thomas Gleixner<tglx@linutronix.de>
> AuthorDate: Mon, 13 Dec 2010 12:43:23 +0100
> Committer:  Thomas Gleixner<tglx@linutronix.de>
> CommitDate: Mon, 13 Dec 2010 13:42:44 +0100
>
> x86: HPET: Chose a paranoid safe value for the ETIME check
>
> commit 995bd3bb5 (x86: Hpet: Avoid the comparator readback penalty)
> chose 8 HPET cycles as a safe value for the ETIME check, as we had the
> confirmation that the posted write to the comparator register is
> delayed by two HPET clock cycles on Intel chipsets which showed
> readback problems.
>
> After that patch hit mainline we got reports from machines with newer
> AMD chipsets which seem to have an even longer delay. See
> http://thread.gmane.org/gmane.linux.kernel/1054283 and
> http://thread.gmane.org/gmane.linux.kernel/1069458 for further
> information.
>
> Boris tried to come up with an ACPI based selection of the minimum
> HPET cycles, but this failed on a couple of test machines. And of
> course we did not get any useful information from the hardware folks.
>
> For now our only option is to chose a paranoid high and safe value for
> the minimum HPET cycles used by the ETIME check. Adjust the minimum ns
> value for the HPET clockevent accordingly.
>
> Reported-Bistected-and-Tested-by: Markus Trippelsdorf<markus@trippelsdorf.de>
> Signed-off-by: Thomas Gleixner<tglx@linutronix.de>
> LKML-Reference:<alpine.LFD.2.00.1012131222420.2653@localhost6.localdomain6>
> Cc: Simon Kirby<sim@hostway.ca>
> Cc: Borislav Petkov<bp@alien8.de>
> Cc: Andreas Herrmann<Andreas.Herrmann3@amd.com>
> Cc: John Stultz<johnstul@us.ibm.com>
> ---
>   arch/x86/kernel/hpet.c |   26 ++++++++++++++++----------
>   1 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> index ae03cab..4ff5968 100644
> --- a/arch/x86/kernel/hpet.c
> +++ b/arch/x86/kernel/hpet.c
> @@ -27,6 +27,9 @@
>   #define HPET_DEV_FSB_CAP		0x1000
>   #define HPET_DEV_PERI_CAP		0x2000
>
> +#define HPET_MIN_CYCLES			128
> +#define HPET_MIN_PROG_DELTA		(HPET_MIN_CYCLES + (HPET_MIN_CYCLES>>  1))
> +
>   #define EVT_TO_HPET_DEV(evt) container_of(evt, struct hpet_dev, evt)
>
>   /*
> @@ -299,8 +302,9 @@ static void hpet_legacy_clockevent_register(void)
>   	/* Calculate the min / max delta */
>   	hpet_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFFFF,
>   							&hpet_clockevent);
> -	/* 5 usec minimum reprogramming delta. */
> -	hpet_clockevent.min_delta_ns = 5000;
> +	/* Setup minimum reprogramming delta. */
> +	hpet_clockevent.min_delta_ns = clockevent_delta2ns(HPET_MIN_PROG_DELTA,
> +							&hpet_clockevent);
>
>   	/*
>   	 * Start hpet with the boot cpu mask and make it
> @@ -393,22 +397,24 @@ static int hpet_next_event(unsigned long delta,
>   	 * the wraparound into account) nor a simple count down event
>   	 * mode. Further the write to the comparator register is
>   	 * delayed internally up to two HPET clock cycles in certain
> -	 * chipsets (ATI, ICH9,10). We worked around that by reading
> -	 * back the compare register, but that required another
> -	 * workaround for ICH9,10 chips where the first readout after
> -	 * write can return the old stale value. We already have a
> -	 * minimum delta of 5us enforced, but a NMI or SMI hitting
> +	 * chipsets (ATI, ICH9,10). Some newer AMD chipsets have even
> +	 * longer delays. We worked around that by reading back the
> +	 * compare register, but that required another workaround for
> +	 * ICH9,10 chips where the first readout after write can
> +	 * return the old stale value. We already had a minimum
> +	 * programming delta of 5us enforced, but a NMI or SMI hitting
>   	 * between the counter readout and the comparator write can
>   	 * move us behind that point easily. Now instead of reading
>   	 * the compare register back several times, we make the ETIME
>   	 * decision based on the following: Return ETIME if the
> -	 * counter value after the write is less than 8 HPET cycles
> +	 * counter value after the write is less than HPET_MIN_CYCLES
>   	 * away from the event or if the counter is already ahead of
> -	 * the event.
> +	 * the event. The minimum programming delta for the generic
> +	 * clockevents code is set to 1.5 * HPET_MIN_CYCLES.
>   	 */
>   	res = (s32)(cnt - hpet_readl(HPET_COUNTER));
>
> -	return res<  8 ? -ETIME : 0;
> +	return res<  HPET_MIN_CYCLES ? -ETIME : 0;
>   }
>
>   static void hpet_legacy_set_mode(enum clock_event_mode mode,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [stable-longterm] x86: HPET: Chose a paranoid safe value for the ETIME check
  2011-07-28 15:31       ` [stable-longterm] " Konstantin Khlebnikov
@ 2011-07-28 15:52         ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2011-07-28 15:52 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-kernel, Thomas Gleixner

On Thu, Jul 28, 2011 at 07:31:21PM +0400, Konstantin Khlebnikov wrote:
> I think this v2.6.37-rc5-64-gf1c1807 and v2.6.36-rc4-167-g995bd3b should be merged into longterm/stable kernels.
> We also found the same bug in RHEL6 kernel. (nohz kernel stuck and wait for HPET wraparound)
> This HPET-bug really annoying, while fix is tiny.

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-07-28 15:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-13 10:54 [2.6.37-rc5] Timer or ICE1724 issues, HZ=250, NO_HZ=y Simon Kirby
2010-12-13 11:19 ` Markus Trippelsdorf
2010-12-13 11:24   ` Thomas Gleixner
2010-12-13 12:46     ` [tip:x86/urgent] x86: HPET: Chose a paranoid safe value for the ETIME check tip-bot for Thomas Gleixner
2011-07-28 15:31       ` [stable-longterm] " Konstantin Khlebnikov
2011-07-28 15:52         ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.