* [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
@ 2007-03-16 10:30 Maxim Levitsky
  2007-03-16 23:19 ` Len Brown
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Maxim Levitsky @ 2007-03-16 10:30 UTC (permalink / raw)
  To: linux-kernel


Good day, 

I want to report regressions I see with the 2.6.21-rc3 kernel.
I use CONFIG_NO_HZ.

1) Both suspend to disk and suspend to RAM are completely broken:
On vanilla 2.6.20, suspend to disk works perfectly and suspend to RAM works _almost_ perfectly (more on that below).
On 2.6.21-rc1 and later, the system hangs before it actually suspends: suspend to disk hangs before the image is written, and on suspend to RAM
some devices are powered down (disk, power LEDs) while others are not (fans, power), and then the system hangs.

I did a git-bisect and found the commits that caused this:
	e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c (breaks  S3)
	ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c (breaks swsusp, I don't use it, but I tested it)
	259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c (breaks uswsusp, that I use)

I reverted those commits and now the system suspends correctly to disk, but suspend to RAM showed some more regressions.


2) After suspend to RAM I get this:

Mar 14 00:22:23 MAIN kernel: [    2.072875] caller is check_tsc_sync_source+0x1d/0x100
Mar 14 00:22:23 MAIN kernel: [    2.072878]  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
Mar 14 00:22:23 MAIN kernel: [    2.072881]  [show_trace+18/32] show_trace+0x12/0x20
Mar 14 00:22:23 MAIN kernel: [    2.072884]  [dump_stack+22/32] dump_stack+0x16/0x20
Mar 14 00:22:23 MAIN kernel: [    2.072887]  [debug_smp_processor_id+173/176] debug_smp_processor_id+0xad/0xb0
Mar 14 00:22:23 MAIN kernel: [    2.072891]  [check_tsc_sync_source+29/256] check_tsc_sync_source+0x1d/0x100
Mar 14 00:22:23 MAIN kernel: [    2.072894]  [__cpu_up+80/384] __cpu_up+0x50/0x180
Mar 14 00:22:23 MAIN kernel: [    2.072897]  [_cpu_up+98/208] _cpu_up+0x62/0xd0
Mar 14 00:22:23 MAIN kernel: [    2.072901]  [cpu_up+46/80] cpu_up+0x2e/0x50
Mar 14 00:22:23 MAIN kernel: [    2.072903]  [enable_nonboot_cpus+110/160] enable_nonboot_cpus+0x6e/0xa0
Mar 14 00:22:23 MAIN kernel: [    2.072906]  [enter_state+326/496] enter_state+0x146/0x1f0
Mar 14 00:22:23 MAIN kernel: [    2.072909]  [state_store+174/192] state_store+0xae/0xc0
Mar 14 00:22:23 MAIN kernel: [    2.072912]  [subsys_attr_store+43/64] subsys_attr_store+0x2b/0x40
Mar 14 00:22:23 MAIN kernel: [    2.072917]  [sysfs_write_file+186/272] sysfs_write_file+0xba/0x110
Mar 14 00:22:23 MAIN kernel: [    2.072920]  [vfs_write+150/352] vfs_write+0x96/0x160
Mar 14 00:22:23 MAIN kernel: [    2.072923]  [sys_write+61/112] sys_write+0x3d/0x70
Mar 14 00:22:23 MAIN kernel: [    2.072926]  [sysenter_past_esp+93/153] sysenter_past_esp+0x5d/0x99
Mar 14 00:22:23 MAIN kernel: [    2.072929]  =======================
Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off

It looks clear that preemption is enabled all the way through the second CPU's initialization (I think it should be disabled at least in check_tsc_sync_source,
shouldn't it?)

I then added preempt_disable() / preempt_enable() to this function, and I still got this:

Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off

It happens after the second CPU is brought back online.

Now I understand that this is a TSC sync problem, so I ran some tests:

I disabled and re-enabled the second CPU by hand a number of times:

echo "0" > /sys/devices/system/cpu/cpu1/online
echo "1" > /sys/devices/system/cpu/cpu1/online

and TSC sync was OK.

Then I disabled the 2nd CPU, suspended the system to RAM, resumed it, and then enabled the 2nd CPU, and got the same error message.
Then I disabled cpufreq, repeated the above tests, and got the same results.
I think this error may be a false positive: there is some difference between the TSC clocks, but the difference is constant and can be corrected.

3) Sometimes I get this (once in three boots or so)

[   36.217405] ENABLING IO-APIC IRQs
[   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[   36.433917] APIC timer disabled due to verification failure.

And NO_HZ is disabled due to that (I get 1000 timer interrupts per second)
I haven't investigated that yet.
It looks like another new test that my hardware fails to perform... 


And now about that _almost_ working suspend to RAM I had in 2.6.20:
To put it simply, sometimes the system wakes up on resume, and sometimes it does not (about 1 in 5 times).
When it does wake up, it works perfectly.

This is quite a common problem, but ironically my case is very different and harder to solve.

Using RTC tricks similar to PM_TRACE, I found that the system hangs in exactly three places (in one of them each time, of course).

Between instructions I put code like the following, which saves a position marker in the RTC alarm register, which is not cleared on reboot.
Note that this code clobbers ax, but every time I inserted it I checked that ax is free to use (e.g., it is loaded by the next instruction).

#define TRACE(val) 			 \
	movb	$0x01, %al		; \
	outb	%al, $0x70 		; \
	movb	$ ## val, %al 		; \
	outb	%al, $0x71
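
To read such a marker back after the reboot, something like the following userspace sketch could be used. It is only an illustration under several assumptions: the marker values in trace_points[] are hypothetical (the real ones are whatever was passed to TRACE()), the helper names are made up, and access goes through /dev/port, which needs root on an x86 box and can race with the kernel's own RTC accesses.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define CMOS_INDEX_PORT   0x70	/* port the TRACE macro selects with */
#define CMOS_DATA_PORT    0x71	/* port the TRACE macro stores to */
#define RTC_SECONDS_ALARM 0x01	/* CMOS register used by the TRACE macro */

/* Hypothetical marker values; use whatever was passed to TRACE(). */
struct trace_point {
	uint8_t val;
	const char *where;
};

static const struct trace_point trace_points[] = {
	{ 0x11, "wakeup_start: before ljmpl to wakeup_pmode_return" },
	{ 0x22, "do_suspend_lowlevel: before call restore_registers" },
	{ 0x33, "__restore_processor_state: before mtrr_ap_init()" },
};

/* Map a value read back from the RTC to the place that wrote it. */
const char *trace_lookup(uint8_t val)
{
	size_t i;

	for (i = 0; i < sizeof(trace_points) / sizeof(trace_points[0]); i++)
		if (trace_points[i].val == val)
			return trace_points[i].where;
	return "unknown marker";
}

/*
 * Read one CMOS register through /dev/port.
 * Returns the byte read (0..255), or -1 on any failure.
 */
int cmos_read(uint8_t index)
{
	uint8_t val;
	int ret = -1;
	int fd = open("/dev/port", O_RDWR);

	if (fd < 0)
		return -1;
	/* Select the register, then fetch its content from the data port. */
	if (lseek(fd, CMOS_INDEX_PORT, SEEK_SET) == CMOS_INDEX_PORT &&
	    write(fd, &index, 1) == 1 &&
	    lseek(fd, CMOS_DATA_PORT, SEEK_SET) == CMOS_DATA_PORT &&
	    read(fd, &val, 1) == 1)
		ret = val;
	close(fd);
	return ret;
}
```

After a failed resume, calling cmos_read(RTC_SECONDS_ALARM) and passing a non-negative result to trace_lookup() would name the last marker that was reached.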

It hangs very early in the asm code, in these places:

1) /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/kernel/acpi/wakeup.S:wakeup_start:
	ljmpl	$__KERNEL_CS,$wakeup_pmode_return
As far as I can see, this is the first time the low wakeup page addresses kernel memory, by jumping to it.

2)  /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/kernel/acpi/wakeup.S:do_suspend_lowlevel
	call	restore_registers 
It hangs exactly on that instruction; all I can see is that this is the first time the protected stack is accessed.

3) /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/power/cpu.c :__restore_processor_state(struct saved_context *ctxt)
	mtrr_ap_init();

It hangs somewhere inside this function; because it is long, I haven't found where exactly. It is easier to disable MTRR altogether.

Note that all three places show different external behavior:

In the first case, the system powers on -> off -> on -> hangs.
I put a test in the RTC to see whether the BIOS calls the kernel twice, and I know from this test for sure that it calls it only once.
That actually makes sense: the exception occurs before the IDT is loaded, so the system has no choice but to power down.

In the second case I see blinking LEDs -> almost surely an oops.

In the third case the system just hangs.

That's all; I will continue to dig into these problems.

Thanks for your attention,
	Maxim Levitsky


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 10:30 [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Maxim Levitsky
@ 2007-03-16 23:19 ` Len Brown
  2007-03-17 23:00   ` Maxim
  2007-03-16 23:39 ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Thomas Gleixner
  2007-03-16 23:44 ` Thomas Gleixner
  2 siblings, 1 reply; 37+ messages in thread
From: Len Brown @ 2007-03-16 23:19 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-kernel

On Friday 16 March 2007 06:30, Maxim Levitsky wrote:
> 
> Good day, 
> 
> I want to report regressions I have with 2.6.21-rc3 kernel.
> I use CONFIG_NO_HZ.

Do any of these issues go away with CONFIG_NO_HZ=n (or when booting with nohz=off),
or are they all independent of it?

thanks,
-Len


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 10:30 [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Maxim Levitsky
  2007-03-16 23:19 ` Len Brown
@ 2007-03-16 23:39 ` Thomas Gleixner
  2007-03-17 23:01   ` Maxim
  2007-03-16 23:44 ` Thomas Gleixner
  2 siblings, 1 reply; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-16 23:39 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-kernel

On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> Mar 14 00:22:23 MAIN kernel: [    2.072875] caller is check_tsc_sync_source+0x1d/0x100
> Mar 14 00:22:23 MAIN kernel: [    2.072878]  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
> Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> 
> It looks clear that preemption is enabled all the way through the second CPU's initialization (I think it should be disabled at least in check_tsc_sync_source,
> shouldn't it?)

This should be fixed by commit d04f41e35343f1d788551fd3f753f51794f4afcf

	tglx





* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 10:30 [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Maxim Levitsky
  2007-03-16 23:19 ` Len Brown
  2007-03-16 23:39 ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Thomas Gleixner
@ 2007-03-16 23:44 ` Thomas Gleixner
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
                     ` (2 more replies)
  2 siblings, 3 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-16 23:44 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk,
	Arjan van de Ven, Len Brown

Maxim,

On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> 3) Sometimes I get this (once in three boots or so)
> 
> [   36.217405] ENABLING IO-APIC IRQs
> [   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> [   36.433917] APIC timer disabled due to verification failure.
> 
> And NO_HZ is disabled due to that (I get 1000 timer interrupts per second)
> I haven't investigated that yet.
> It looks like another new test that my hardware fails to perform... 

Yes, this is probably caused by SMM code trying to emulate a PS/2
keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
have no way to disable this BIOS misfeature in the early boot process. 
Arjan, Len ?????

I built in this test to rule out bogus LAPIC timer calibration values,
which are sometimes off by a factor of 2-10.

But I also built in a calibration against the PM-Timer, which turned out
to be quite reliable, and I think the additional verification step is
only necessary for systems without a PM-Timer.

That was a bit overcautious on my side. I will send a patch to avoid this
when the PM-Timer is available in a separate mail.

	tglx




* [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
  2007-03-16 23:44 ` Thomas Gleixner
@ 2007-03-17  0:04   ` Thomas Gleixner
  2007-03-17  7:22     ` Ingo Molnar
                       ` (2 more replies)
  2007-03-17  1:32   ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
  2007-03-20  5:04   ` Lee Revell
  2 siblings, 3 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-17  0:04 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk,
	Arjan van de Ven, Len Brown

When PM-Timer is available for local APIC timer calibration we can skip
the verification of the calibrated time value. The resulting error is
quite small on a bunch of evaluated platforms and is less harmful than
the observed false positives.

We need to keep the verification on systems which have no PM-Timer to
avoid bogus local APIC timer calibrations in the range of factor 2-10,
which can be observed when switching off the PM-timer support in the
kernel configuration.

The wrong calibration values are probably caused by SMM code trying to
emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard.
This prohibits the accurate delivery of PIT interrupts, which are used
to calibrate the local APIC timer. Unfortunately we have no way to
disable this BIOS misfeature in the early boot process.

Also add the dropped cpu_relax() back to the wait loops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 2383bcf..92f4210 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -338,6 +338,7 @@ void __init setup_boot_APIC_clock(void)
 	void (*real_handler)(struct clock_event_device *dev);
 	unsigned long deltaj;
 	long delta, deltapm;
+	int pm_referenced = 0;
 
 	apic_printk(APIC_VERBOSE, "Using local APIC timer interrupts.\n"
 		    "calibrating APIC timer ...\n");
@@ -357,7 +358,8 @@ void __init setup_boot_APIC_clock(void)
 	/* Let the interrupts run */
 	local_irq_enable();
 
-	while(lapic_cal_loops <= LAPIC_CAL_LOOPS);
+	while(lapic_cal_loops <= LAPIC_CAL_LOOPS)
+		cpu_relax();
 
 	local_irq_disable();
 
@@ -394,6 +396,7 @@ void __init setup_boot_APIC_clock(void)
 			       "%lu (%ld)\n", (unsigned long) res, delta);
 			delta = (long) res;
 		}
+		pm_referenced = 1;
 	}
 
 	/* Calculate the scaled math multiplication factor */
@@ -423,68 +426,41 @@ void __init setup_boot_APIC_clock(void)
 		    calibration_result / (1000000 / HZ),
 		    calibration_result % (1000000 / HZ));
 
-
-	apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
-
-	/*
-	 * Setup the apic timer manually
-	 */
 	local_apic_timer_verify_ok = 1;
-	levt->event_handler = lapic_cal_handler;
-	lapic_timer_setup(CLOCK_EVT_MODE_PERIODIC, levt);
-	lapic_cal_loops = -1;
 
-	/* Let the interrupts run */
-	local_irq_enable();
+	/* We trust the pm timer based calibration */
+	if (!pm_referenced) {
+		apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
 
-	while(lapic_cal_loops <= LAPIC_CAL_LOOPS);
+		/*
+		 * Setup the apic timer manually
+		 */
+		levt->event_handler = lapic_cal_handler;
+		lapic_timer_setup(CLOCK_EVT_MODE_PERIODIC, levt);
+		lapic_cal_loops = -1;
 
-	local_irq_disable();
+		/* Let the interrupts run */
+		local_irq_enable();
 
-	/* Stop the lapic timer */
-	lapic_timer_setup(CLOCK_EVT_MODE_SHUTDOWN, levt);
+		while(lapic_cal_loops <= LAPIC_CAL_LOOPS)
+			cpu_relax();
 
-	local_irq_enable();
+		local_irq_disable();
 
-	/* Jiffies delta */
-	deltaj = lapic_cal_j2 - lapic_cal_j1;
-	apic_printk(APIC_VERBOSE, "... jiffies delta = %lu\n", deltaj);
+		/* Stop the lapic timer */
+		lapic_timer_setup(CLOCK_EVT_MODE_SHUTDOWN, levt);
 
-	/* Check, if the PM timer is available */
-	deltapm = lapic_cal_pm2 - lapic_cal_pm1;
-	apic_printk(APIC_VERBOSE, "... PM timer delta = %ld\n", deltapm);
+		local_irq_enable();
 
-	local_apic_timer_verify_ok = 0;
+		/* Jiffies delta */
+		deltaj = lapic_cal_j2 - lapic_cal_j1;
+		apic_printk(APIC_VERBOSE, "... jiffies delta = %lu\n", deltaj);
 
-	if (deltapm) {
-		if (deltapm > (pm_100ms - pm_thresh) &&
-		    deltapm < (pm_100ms + pm_thresh)) {
-			apic_printk(APIC_VERBOSE, "... PM timer result ok\n");
-			/* Check, if the jiffies result is consistent */
-			if (deltaj < LAPIC_CAL_LOOPS-2 ||
-			    deltaj > LAPIC_CAL_LOOPS+2) {
-				/*
-				 * Not sure, what we can do about this one.
-				 * When high resultion timers are active
-				 * and the lapic timer does not stop in C3
-				 * we are fine. Otherwise more trouble might
-				 * be waiting. -- tglx
-				 */
-				printk(KERN_WARNING "Global event device %s "
-				       "has wrong frequency "
-				       "(%lu ticks instead of %d)\n",
-				       global_clock_event->name, deltaj,
-				       LAPIC_CAL_LOOPS);
-			}
-			local_apic_timer_verify_ok = 1;
-		}
-	} else {
 		/* Check, if the jiffies result is consistent */
-		if (deltaj >= LAPIC_CAL_LOOPS-2 &&
-		    deltaj <= LAPIC_CAL_LOOPS+2) {
+		if (deltaj >= LAPIC_CAL_LOOPS-2 && deltaj <= LAPIC_CAL_LOOPS+2)
 			apic_printk(APIC_VERBOSE, "... jiffies result ok\n");
-			local_apic_timer_verify_ok = 1;
-		}
+		else
+			local_apic_timer_verify_ok = 0;
 	}
 
 	if (!local_apic_timer_verify_ok) {




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 23:44 ` Thomas Gleixner
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
@ 2007-03-17  1:32   ` Len Brown
  2007-03-17  9:56     ` Thomas Gleixner
                       ` (2 more replies)
  2007-03-20  5:04   ` Lee Revell
  2 siblings, 3 replies; 37+ messages in thread
From: Len Brown @ 2007-03-17  1:32 UTC (permalink / raw)
  To: tglx
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven

On Friday 16 March 2007 19:44, Thomas Gleixner wrote:
> Maxim,
> 
> On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> > 3) Sometimes I get this (once in three boots or so)
> > 
> > [   36.217405] ENABLING IO-APIC IRQs
> > [   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> > [   36.433917] APIC timer disabled due to verification failure.
> > 
> > And NO_HZ is disabled due to that (I get 1000 timer interrupts per second)
> > I haven't investigated that yet.
> > It looks like another new test that my hardware fails to perform... 
> 
> Yes, this is probably caused by SMM code trying to emulate a PS/2
> keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> have no way to disable this BIOS misfeature in the early boot process. 
> Arjan, Len ?????

Nope.  By definition, SMM is invisible to the OS -- we don't even
get a bit that says it occurred (though we'd like one -- it would
be really helpful to diagnose issues like this one).

So go into BIOS SETUP and see if there is a USB Legacy Emulation
feature that you can disable.  Sometimes there is not, but disabling
onboard USB altogether may help at least prove the issue is in that area.

> I built in this test to rule out bogus LAPIC timer calibration values,
> which are sometimes off by a factor of 2-10.
> 
> But I also built in a calibration against the PM-Timer, which turned out
> to be quite reliable, and I think the additional verification step is
> only necessary for systems without a PM-Timer.
> 
> That was a bit overcautious on my side. I will send a patch to avoid this
> when the PM-Timer is available in a separate mail.

PM-Timer was invented to work around the issue that the TSC became unreliable
in the face of power management on laptops.  In particular, to be able
to time the duration of OS idle where the TSC stopped.

While it is not fine-grained, and it is not low-latency, it should
be very reliable.  My understanding is that it is implemented as
a simple divider right off the system 14MHz clock -- the signal
which most motherboard clocks are PLL multiplied up from --
including the 100MHz front-side bus which drives the LAPIC timer.
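
For readers who want numbers for that divider: the ACPI PM-Timer runs at 3.579545 MHz, which is the 14.31818 MHz reference divided by 4. The following is a minimal userspace sketch of working with raw PM-Timer reads. The function names are made up for illustration, and the 24-bit mask assumes the counter is treated as 24 bits wide, so it wraps roughly every 4.7 seconds.

```c
#include <stdint.h>

#define PMTMR_TICKS_PER_SEC 3579545	/* 14.31818 MHz / 4 */
#define PMTMR_MASK_24BIT    0x00FFFFFFu	/* assumed counter width: 24 bits */

/* Ticks elapsed between two raw reads, tolerating one counter wrap. */
uint32_t pmtmr_delta(uint32_t t1, uint32_t t2)
{
	return (t2 - t1) & PMTMR_MASK_24BIT;
}

/* Convert PM-Timer ticks to microseconds (rounds down). */
uint64_t pmtmr_ticks_to_usec(uint32_t ticks)
{
	return (uint64_t)ticks * 1000000 / PMTMR_TICKS_PER_SEC;
}
```

A read of 0xFFFFF0 followed by a read of 0x000010 gives a delta of 0x20 ticks, because the subtraction is done modulo the counter width.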

But that said, I don't understand why calibrating the LAPIC timer
using the PM-timer is going to be more reliable -- exactly how
and why did the previous calibration scheme fail?
Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
output for this box.

cheers,
-Len




* Re: [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
@ 2007-03-17  7:22     ` Ingo Molnar
  2007-03-17 13:24     ` Andi Kleen
  2007-03-18  8:12     ` Andrew Morton
  2 siblings, 0 replies; 37+ messages in thread
From: Ingo Molnar @ 2007-03-17  7:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Maxim Levitsky, linux-kernel, Andrew Morton, Adrian Bunk,
	Arjan van de Ven, Len Brown


* Thomas Gleixner <tglx@linutronix.de> wrote:

> When PM-Timer is available for local APIC timer calibration we can 
> skip the verification of the calibrated time value. The resulting 
> error is quite small on a bunch of evaluated platforms and is less 
> harmful than the observed false positives.
> 
> We need to keep the verification on systems which have no PM-Timer to 
> avoid bogus local APIC timer calibrations in the range of factor 2-10, 
> which can be observed when switching off the PM-timer support in the 
> kernel configuration.
> 
> The wrong calibration values are probably caused by SMM code trying to 
> emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard. 
> This prohibits the accurate delivery of PIT interrupts, which are used 
> to calibrate the local APIC timer. Unfortunately we have no way to 
> disable this BIOS misfeature in the early boot process.
> 
> Also add the dropped cpu_relax() back to the wait loops.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17  1:32   ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
@ 2007-03-17  9:56     ` Thomas Gleixner
  2007-03-17 11:05       ` Thomas Gleixner
  2007-03-17 16:52       ` Thomas Gleixner
  2007-03-17 10:32     ` Arjan van de Ven
  2007-03-17 22:45     ` Maxim
  2 siblings, 2 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-17  9:56 UTC (permalink / raw)
  To: Len Brown
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven

Len,

On Fri, 2007-03-16 at 21:32 -0400, Len Brown wrote:
> > > [   36.433917] APIC timer disabled due to verification failure.
> > > 
> > > And NO_HZ is disabled due to that (I get 1000 timer interrupts per second)
> > > I haven't investigated that yet.
> > > It looks like another new test that my hardware fails to perform... 
> > 
> > Yes, this is probably caused by SMM code trying to emulate a PS/2
> > keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> > have no way to disable this BIOS misfeature in the early boot process. 
> > Arjan, Len ?????
> 
> Nope.  By definition, SMM is invisible to the OS -- we don't even
> get a bit that says it occurred (though we'd like one -- it would
> be really helpful to diagnose issues like this one).

I know that it is invisible. Nevertheless I know that the BIOSes emulate
PS/2 keyboards from USB via SMM during the boot process until we call
the usb_handoff function.

> So go into BIOS SETUP and see if there is a USB Legacy Emulation
> feature that you can disable.  Sometimes there is not, but disabling
> onboard USB altogether may help at least prove the issue is in that area.

I have more than one box (even original Intel mainboards), where either
plugging a PS/2 keyboard or switching off USB makes this problem go
away.

> > I built in this test to rule out bogus LAPIC timer calibration values,
> > which are sometimes off by a factor of 2-10.
> > 
> > But I also built in a calibration against the PM-Timer, which turned out
> > to be quite reliable, and I think the additional verification step is
> > only necessary for systems without a PM-Timer.
> > 
> > That was a bit overcautious on my side. I will send a patch to avoid this
> > when the PM-Timer is available in a separate mail.
> 
> PM-Timer was invented to work around the issue that the TSC became unreliable
> in the face of power management on laptops.  In particular, to be able
> to time the duration of OS idle where the TSC stopped.
> 
> While it is not fine-grained, and it is not low-latency, it should
> be very reliable.  My understanding is that it is implemented as
> a simple divider right off the system 14MHz clock -- the signal
> which most motherboard clocks are PLL multiplied up from --
> including the 100MHz front-side bus which drives the LAPIC timer.
> 
> But that said, I don't understand why calibrating the LAPIC timer
> using the PM-timer is going to be more reliable -- exactly how
> and why did the previous calibration scheme fail?
> Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
> output for this box.

calibrating APIC timer ...
... lapic delta = 2426884
... PM timer delta = 833908
APIC calibration PIT not consistent with PM Timer: 232ms instead of 100ms
APIC delta adjusted to PM-Timer: 1041737 (2426884)
..... delta 1041737
..... mult: 44749065
..... calibration result: 166677
..... CPU clock speed is 4659.0624 MHz.
..... host bus clock speed is 166.0677 MHz.

This box is off by a factor of 2.3, and using the PM-Timer instead of the
PIT/jiffies values gives me a correct result.
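
The numbers in that log can be reproduced with plain integer arithmetic. The sketch below is a userspace illustration, not the kernel code itself: the function names are made up, and the constants (a PM-Timer rate of 3579545 Hz, an APIC divider of 16, and 100 calibration loops, i.e. HZ/10 with HZ=1000) are assumptions chosen because they match the values printed above.

```c
#include <stdint.h>

#define PMTMR_TICKS_PER_SEC 3579545
#define PM_100MS            (PMTMR_TICKS_PER_SEC / 10)	/* 357954 ticks */
#define APIC_DIVISOR        16	/* assumed LAPIC timer divider */
#define LAPIC_CAL_LOOPS     100	/* assumed: HZ/10 with HZ=1000 */

/* Real length of the calibration window in ms, according to the PM-Timer. */
long pm_window_ms(long deltapm)
{
	return (long)((int64_t)deltapm * 1000 / PMTMR_TICKS_PER_SEC);
}

/* Scale the raw LAPIC delta to what a true 100 ms window would have given. */
long adjust_lapic_delta(long delta, long deltapm)
{
	return (long)((int64_t)delta * PM_100MS / deltapm);
}

/* Bus clock cycles per jiffy, derived from the (adjusted) LAPIC delta. */
long calibration_result(long delta)
{
	return (long)((int64_t)delta * APIC_DIVISOR / LAPIC_CAL_LOOPS);
}
```

With the values from the log, pm_window_ms(833908) gives 232, adjust_lapic_delta(2426884, 833908) gives 1041737, and calibration_result(1041737) gives 166677, which are the figures printed above.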

Another one:
APIC calibration not consistent with PM Timer: 2020ms instead of 100ms
APIC delta adjusted to PM-Timer: 1254436 (25341111)

Off by a factor of 20!!

The original APIC timer calibration did:

	local_irq_disable();
	wait_until_pit_underflows();
	t1 = read_apic_counter();
	for (i = 0; i < HZ/10; i++)
		wait_until_pit_underflows();
	t2 = read_apic_counter();

and calculated the APIC timer frequency from the delta of t1 and t2 vs.
the 100ms time.

This had 2 problems:
1. It gave results, which are off by factor 2-10 on a couple of boxen.
2. Some systems stop there dead as the PIT readout is broken.

I changed it to do:

	local_irq_disable();
	original_pit_handler = pit->handler;
	pit->handler = lapic_calibration_handler;
	loops = 0;
	local_irq_enable();
	wait_until_handler_has_done_HZ/10_loops();

The handler does:

	if (!loops++) {
		t1_apic = read_apic_counter();
		t1_jiffies = jiffies;
		t1_pm = read_pm_timer();
	}

	if (loops == HZ/10) {
		t2_apic = read_apic_counter();
		t2_jiffies = jiffies;
		t2_pm = read_pm_timer();
		done = 1;
	}

If the pmtimer is available, then calculate the APIC timer frequency
from the t1_pm/t2_pm delta, otherwise use jiffies.
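
The PM-Timer correction described above can be sketched in C as follows.
This is a minimal sketch, not the code from apic.c: the function and macro
names are made up for illustration, and HZ=1000 and an APIC timer divisor
of 16 are assumptions, chosen because they reproduce the
"calibration result: 166677" line in the debug output above. The only
fixed fact used is the PM-Timer frequency of 3.579545 MHz.

```c
#include <assert.h>
#include <stdint.h>

/* The ACPI PM-Timer runs at a fixed 3.579545 MHz. */
#define PMTMR_TICKS_PER_SEC	3579545ULL

/* Assumptions for this sketch: HZ=1000 and an APIC timer divisor of 16. */
#define SKETCH_HZ		1000ULL
#define SKETCH_APIC_DIVISOR	16ULL
#define SKETCH_CAL_LOOPS	(SKETCH_HZ / 10)

/* Milliseconds that really elapsed, according to the PM-Timer delta. */
static uint64_t pm_elapsed_ms(uint64_t deltapm)
{
	return deltapm * 1000ULL / PMTMR_TICKS_PER_SEC;
}

/* Scale the APIC delta to what it would have been over a true 100ms. */
static uint64_t adjust_apic_delta(uint64_t apic_delta, uint64_t deltapm)
{
	uint64_t pm_100ms = PMTMR_TICKS_PER_SEC / 10;

	assert(deltapm > 0);
	return apic_delta * pm_100ms / deltapm;
}

/* Bus clocks per jiffy derived from the (adjusted) APIC delta. */
static uint64_t calibration_result(uint64_t apic_delta)
{
	return apic_delta * SKETCH_APIC_DIVISOR / SKETCH_CAL_LOOPS;
}
```

With the values from the first log (lapic delta 2426884, PM timer delta
833908) this gives 232 ms of real elapsed time, an adjusted delta of
1041737 and a calibration result of 166677, which agrees with the output
above.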

When pm_timer is there, we can trust the calculated value, if not we do
a verify run of the periodic apic timer and the pit timer. If this fails
- and it fails often due to the SMM crap - then I use the PIT and IPIs.

In the first version I did a verification run even when pm_timer was
there, but this produced false positives as well, because the lapic
timer interrupt is in the same way delayed as the PIT interrupt. I
removed this to avoid unnecessary switching to IPIs after I verified,
that it always produced false positives when the calibration was done
against the PM-Timer.

	tglx




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17  1:32   ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
  2007-03-17  9:56     ` Thomas Gleixner
@ 2007-03-17 10:32     ` Arjan van de Ven
  2007-03-17 13:26       ` Andi Kleen
  2007-03-17 22:45     ` Maxim
  2 siblings, 1 reply; 37+ messages in thread
From: Arjan van de Ven @ 2007-03-17 10:32 UTC (permalink / raw)
  To: Len Brown
  Cc: tglx, Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk

On Fri, 2007-03-16 at 21:32 -0400, Len Brown wrote:
> On Friday 16 March 2007 19:44, Thomas Gleixner wrote:
> > Maxim,
> > 
> > On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> > > 3) Sometimes I get this (once in three boots or so)
> > > 
> > > [   36.217405] ENABLING IO-APIC IRQs
> > > [   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> > > [   36.433917] APIC timer disabled due to verification failure.
> > > 
> > > And NO_HZ is disabled due to that (I get 1000/s timer's interrupts)
> > > I haven't investigated that yet.
> > > It looks like another new test that my hardware fails to perform... 
> > 
> > Yes, this is probably caused by SMM code trying to emulate a PS/2
> > keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> > have no way to disable this BIOS misfeature in the early boot process. 
> > Arjan, Len ?????
> 
> Nope.  By definition, SMM is invisible to the OS -- we don't even
> get a bit that said it occurred (though we'd like one -- it would
> be really helpful to diagnose issues like this one)


well we can do the handshake to take ownership like we do much later in
boot, but that requires PCI to be there and fully discovered, which we
don't have this early.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17  9:56     ` Thomas Gleixner
@ 2007-03-17 11:05       ` Thomas Gleixner
  2007-03-17 16:52       ` Thomas Gleixner
  1 sibling, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-17 11:05 UTC (permalink / raw)
  To: Len Brown
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven

On Sat, 2007-03-17 at 10:56 +0100, Thomas Gleixner wrote:
> calibrating APIC timer ...
> ... lapic delta = 2426884
> ... PM timer delta = 833908
> APIC calibration PIT not consistent with PM Timer: 232ms instead of 100ms
> APIC delta adjusted to PM-Timer: 1041737 (2426884)
> ..... delta 1041737
> ..... mult: 44749065
> ..... calibration result: 166677
> ..... CPU clock speed is 4659.0624 MHz.
> ..... host bus clock speed is 166.0677 MHz.
> 
> This box is off by factor 2.3 and using the PM-Timer instead of the
> PIT/jiffies values gives me a correct result.
> 
> Another one:
> APIC calibration not consistent with PM Timer: 2020ms instead of 100ms
> APIC delta adjusted to PM-Timer: 1254436 (25341111)
> 
> Off by factor 20 !!

This weird behaviour also can be seen with the BogoMIPS calibration:

Calibrating delay using timer specific routine.. 6428.32 BogoMIPS 
(lpj=12856647)
....
Initializing CPU#1
Calibrating delay using timer specific routine.. 103837.25 BogoMIPS 
(lpj=207674508)

Note, that I never observed that on CPU#0. It always affects CPU#1.

	tglx




* Re: [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
  2007-03-17  7:22     ` Ingo Molnar
@ 2007-03-17 13:24     ` Andi Kleen
  2007-03-18  8:12     ` Andrew Morton
  2 siblings, 0 replies; 37+ messages in thread
From: Andi Kleen @ 2007-03-17 13:24 UTC (permalink / raw)
  To: tglx
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven, Len Brown

Thomas Gleixner <tglx@linutronix.de> writes:

> When PM-Timer is available for local APIC timer calibration we can skip
> the verification of the calibrated time value. The resulting error is
> quite small on a bunch of evaluated platforms and is less harming than
> the observed false positives.

Looks good to me.

-Andi


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17 10:32     ` Arjan van de Ven
@ 2007-03-17 13:26       ` Andi Kleen
  2007-03-20  4:27         ` Greg KH
  0 siblings, 1 reply; 37+ messages in thread
From: Andi Kleen @ 2007-03-17 13:26 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Len Brown, tglx, Maxim Levitsky, linux-kernel, Ingo Molnar,
	Andrew Morton, Adrian Bunk

Arjan van de Ven <arjan@infradead.org> writes:
> 
> well we can do the handshake to take ownership like we do much later in
> boot, but that requires PCI to be there and fully discovered, which we
> don't have this early.

That's not true - we do early PCI discovery. Doing the USB handoff
there would be quite possible.

-Andi



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17  9:56     ` Thomas Gleixner
  2007-03-17 11:05       ` Thomas Gleixner
@ 2007-03-17 16:52       ` Thomas Gleixner
  1 sibling, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-17 16:52 UTC (permalink / raw)
  To: Len Brown
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven

On Sat, 2007-03-17 at 10:56 +0100, Thomas Gleixner wrote:
> > Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
> > output for this box.
> 
> calibrating APIC timer ...
> ... lapic delta = 2426884
> ... PM timer delta = 833908
> APIC calibration PIT not consistent with PM Timer: 232ms instead of 100ms
> APIC delta adjusted to PM-Timer: 1041737 (2426884)
> ..... delta 1041737
> ..... mult: 44749065
> ..... calibration result: 166677
> ..... CPU clock speed is 4659.0624 MHz.
> ..... host bus clock speed is 166.0677 MHz.
> 
> This box is off by factor 2.3 and using the PM-Timer instead of the
> PIT/jiffies values gives me a correct result.

I instrumented the lapic calibration on this box:

I1:     999 us total:    999 us
I2:     999 us total:   1998 us
...
I28:    999 us total:  27980 us
I29: 135097 us total: 163077 us  <--------------------
I30:    881 us total: 163958 us
...
I98:   1000 us total: 231918 us
I99:    999 us total: 232917 us

So it vanishes away for 132 ms, which is exactly the error above. This
happens in random places and sometimes I'm lucky that it does not happen
at all.
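
A check like the one behind the instrumentation above could be sketched as
follows. This is only an illustration under stated assumptions: the helper
names and the idea of a fixed per-interval limit are hypothetical and are
not taken from the actual instrumentation patch.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all measured intervals, in microseconds. */
static unsigned long total_us(const unsigned long *interval_us, size_t n)
{
	unsigned long sum = 0;
	size_t i;

	assert(interval_us != NULL);
	for (i = 0; i < n; i++)
		sum += interval_us[i];
	return sum;
}

/*
 * Index of the first interval longer than limit_us, or -1 if every
 * interval is within the limit.
 */
static long first_stalled_interval(const unsigned long *interval_us, size_t n,
				   unsigned long limit_us)
{
	size_t i;

	assert(interval_us != NULL);
	for (i = 0; i < n; i++)
		if (interval_us[i] > limit_us)
			return (long)i;
	return -1;
}
```

Fed with the three intervals I28..I30 from the listing (999, 135097 and
881 us) and a limit of, say, 2000 us, the second entry is the one that
gets flagged.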

	tglx




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17  1:32   ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
  2007-03-17  9:56     ` Thomas Gleixner
  2007-03-17 10:32     ` Arjan van de Ven
@ 2007-03-17 22:45     ` Maxim
  2 siblings, 0 replies; 37+ messages in thread
From: Maxim @ 2007-03-17 22:45 UTC (permalink / raw)
  To: Len Brown
  Cc: tglx, linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk,
	Arjan van de Ven

On Saturday 17 March 2007 03:32:53 Len Brown wrote:
> On Friday 16 March 2007 19:44, Thomas Gleixner wrote:
> > Maxim,
> > 
> > On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> > > 3) Sometimes I get this (once in three boots or so)
> > > 
> > > [   36.217405] ENABLING IO-APIC IRQs
> > > [   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> > > [   36.433917] APIC timer disabled due to verification failure.
> > > 
> > > And NO_HZ is disabled due to that (I get 1000/s timer's interrupts)
> > > I haven't investigated that yet.
> > > It looks like another new test that my hardware fails to perform... 
> > 
> > Yes, this is probably caused by SMM code trying to emulate a PS/2
> > keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> > have no way to disable this BIOS misfeature in the early boot process. 
> > Arjan, Len ?????
> 
> Nope.  By definition, SMM is invisible to the OS -- we don't even
> get a bit that said it occurred (though we'd like one -- it would
> be really helpful to diagnose issues like this one)
> 
> So go into BIOS SETUP and see if there is a USB Legacy Emulation
> feature that you can disable.  Sometimes there is not, but disabling
> onboard USB altogether may help at least prove the issue is in that area.
> 
> > I built in this test to rule out bogus LAPIC timer calibration values
> > which are sometimes off by factor 2-10.
> > 
> > But I also built in a calibration against the PM-Timer, which turned out
> > to be quite reliable and I think the additional verification step is
> > only necessary for systems without PM-Timer.
> > 
> > That was a bit over cautious from my side. I send a patch to avoid this
> > when PM-Timer is available in a separate mail.
> 
> PM-Timer was invented to work-around the issue that the TSC became unreliable
> in the face of power management on laptops.  In particular, to be able
> to time duration of OS idle where TSC stopped.
> 
> While it is not fine-grained, and it is not low-latency, it should
> be very reliable.  My understanding is that it is implemented as
> a simple divider right off the system 14MHz clock -- the signal
> which most motherboard clocks are PLL multiplied up from --
> including the 100MHz front-side bus which drives the LAPIC timer.
> 
> But that said, I don't understand why calibrating the LAPIC timer
> using the PM-timer is going to be more reliable -- exactly how
> and why did the previous calibration scheme fail?
> Maybe I could follow the new logic in apic.c if I saw the "apic=debug"
> output for this box.
> 
> cheers,
> -Len
> 
> 
> 

Hi,

	Yes, USB emulation is enabled, but I need it. I will test without USB emulation, but since the problem shows up only sometimes,
	I don't know yet whether USB legacy emulation affects it.

	Regards, 
		Maxim Levitsky


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 23:19 ` Len Brown
@ 2007-03-17 23:00   ` Maxim
  2007-03-17 23:32     ` Thomas Gleixner
  2007-03-20 11:54     ` sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far) Pavel Machek
  0 siblings, 2 replies; 37+ messages in thread
From: Maxim @ 2007-03-17 23:00 UTC (permalink / raw)
  To: Len Brown; +Cc: linux-kernel

On Saturday 17 March 2007 01:19:44 Len Brown wrote:
> On Friday 16 March 2007 06:30, Maxim Levitsky wrote:
> > 
> > Good day, 
> > 
> > I want to report regressions I have with 2.6.21-rc3 kernel.
> > I use CONFIG_NO_HZ.
> 
> Do any of these issues go away with CONFIG_NO_HZ=n (or boot with nohz=n)
> or are they all independent of it?
> 
> thanks,
> -Len
> 
> > 1) Both suspend to disk and suspend to RAM are completely broken:
> > On vanilla 2.6.20 suspend to disk works perfectly and suspend to ram works _almost_ perfectly (I will tell about that later).
> > On 2.6.21-rc1 and later system hangs even before suspend begins (suspend to disk hangs before image write, and after suspend to ram,
> > some devices are powered down (disk, power leds), and some are not (fans, power), and system hangs).
> > 
> > I did a git-bisect and I found which commit caused that:
> > 	e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c (breaks  S3)
> > 	ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c (breaks swsusp, I don't use it, but I tested it)
> >         259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c (breaks uswsusp, that I use)
> > 
> > I reverted those commits and now system suspends correctly to disk, but suspend to ram showed some more regressions.
> > 
> > 
> > 2) After suspend to ram I get this
> > 
> > Mar 14 00:22:23 MAIN kernel: [    2.072875] caller is check_tsc_sync_source+0x1d/0x100
> > Mar 14 00:22:23 MAIN kernel: [    2.072878]  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
> > Mar 14 00:22:23 MAIN kernel: [    2.072881]  [show_trace+18/32] show_trace+0x12/0x20
> > Mar 14 00:22:23 MAIN kernel: [    2.072884]  [dump_stack+22/32] dump_stack+0x16/0x20
> > Mar 14 00:22:23 MAIN kernel: [    2.072887]  [debug_smp_processor_id+173/176] debug_smp_processor_id+0xad/0xb0
> > Mar 14 00:22:23 MAIN kernel: [    2.072891]  [check_tsc_sync_source+29/256] check_tsc_sync_source+0x1d/0x100
> > Mar 14 00:22:23 MAIN kernel: [    2.072894]  [__cpu_up+80/384] __cpu_up+0x50/0x180
> > Mar 14 00:22:23 MAIN kernel: [    2.072897]  [_cpu_up+98/208] _cpu_up+0x62/0xd0
> > Mar 14 00:22:23 MAIN kernel: [    2.072901]  [cpu_up+46/80] cpu_up+0x2e/0x50
> > Mar 14 00:22:23 MAIN kernel: [    2.072903]  [enable_nonboot_cpus+110/160] enable_nonboot_cpus+0x6e/0xa0
> > Mar 14 00:22:23 MAIN kernel: [    2.072906]  [enter_state+326/496] enter_state+0x146/0x1f0
> > Mar 14 00:22:23 MAIN kernel: [    2.072909]  [state_store+174/192] state_store+0xae/0xc0
> > Mar 14 00:22:23 MAIN kernel: [    2.072912]  [subsys_attr_store+43/64] subsys_attr_store+0x2b/0x40
> > Mar 14 00:22:23 MAIN kernel: [    2.072917]  [sysfs_write_file+186/272] sysfs_write_file+0xba/0x110
> > Mar 14 00:22:23 MAIN kernel: [    2.072920]  [vfs_write+150/352] vfs_write+0x96/0x160
> > Mar 14 00:22:23 MAIN kernel: [    2.072923]  [sys_write+61/112] sys_write+0x3d/0x70
> > Mar 14 00:22:23 MAIN kernel: [    2.072926]  [sysenter_past_esp+93/153] sysenter_past_esp+0x5d/0x99
> > Mar 14 00:22:23 MAIN kernel: [    2.072929]  =======================
> > Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> > Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> > 
> > It looks clear that preempt is enabled all the way in second cpu initialization, ( I think that at least in check_tsc_sync_source, it should be disabled,
> > shouldn't it ? )
> > 
> > Then I did add preempt_disable() / preempt_enable()  to this function , and  I still got this:
> > 
> > Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> > Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> > 
> > It happens after second CPU is brought back on-line.
> > 
> > Now I understand that this is TSC sync problem and I tried to do some tests:
> > 
> >  I tried to disable/enable second CPU by hand, eg I did number of times,
> > 
> > echo "0" > /sys/devices/system/cpu/cpu1/online
> > echo "1" > /sys/devices/system/cpu/cpu1/online
> > 
> > and TSC sync was ok.
> > 
> > Then I disabled 2nd CPU, suspended system to RAM, resumed it, and then enabled 2nd CPU and got same error message.
> > Then I disabled cpufreq, and did above tests, and got same results.
> > I think that maybe this error is false, that there is some difference in TSC clock, but this difference is constant, and can be fixed
> > 
> > 3) Sometimes I get this (once in three boots or so)
> > 
> > [   36.217405] ENABLING IO-APIC IRQs
> > [   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> > [   36.433917] APIC timer disabled due to verification failure.
> > 
> > And NO_HZ is disabled due to that (I get 1000/s timer's interrupts)
> > I haven't investigated that yet.
> > It looks like another new test that my hardware fails to perform... 
> > 
> > 
> > And now I want to tell you about that _almost_ working suspend to ram I got in 2.6.20:
> > To put it simply sometimes system wakes from resume, and sometimes not (about 1 in 5 times)
> > When it does it works perfectly. 
> > 
> > This is quite common problem but ironically my case is very different and harder to solve.
> > 
> > The fact is that I found thanks to RTC tricks , similar to PM_TRACE that system hangs in exactly three places (in one of them of course)
> > 
> > I put between instructions, code like that to save a position in RTC alarm which is not cleared on reboot
> > Note that this code uses ax, but I checked every time I put it that ax can be used (eg, it is loaded in next instruction)
> > 
> > #define TRACE(val) 			 \
> > 	movb	$0x01, %al		; \
> > 	outb	%al, $0x70 		; \
> > 	movb	$ ## val, %al 		; \
> > 	outb	%al, $0x71
> > 
> > It hangs very early in asm code, and those are places:
> > 
> > 1) /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/kernel/acpi/wakeup.S:wakeup_start:
> > 	ljmpl	$__KERNEL_CS,$wakeup_pmode_return
> > As I see that is first time wakeup low page is addressing kernel memory by jumping to it.
> > 
> > 2)  /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/kernel/acpi/wakeup.S:do_suspend_lowlevel
> > 	call	restore_registers 
> > It hangs exactly on that instruction, I can only see that this is first time protected stack is accessed
> > 
> > 3) /home/maxim/software/kernel/linux-2.6.20-mod/arch/i386/power/cpu.c :__restore_processor_state(struct saved_context *ctxt)
> > 	mtrr_ap_init();
> > 
> > It hangs somewhere inside this function, and because it is long I haven't found where exactly, it is easier to disable MTRR altogether.
> > 
> > Note that all three places have different external behavior:
> > 
> > in first case, system powers on -> off -> on -> hangs
> > I did put test in RTC to see whether BIOS calls kernel twice, but I know from this test for sure that it calls it only once.
> > It actually makes sense, because the exception occurs before IDT is loaded, so system has no choice but to power down
> > 
> > in second case I see blinking leds -> almost sure a oops
> > 
> > in third case system just hangs
> > 
> > That's all, I will continue to dig those problems out
> > 
> > Thanks for attention,
> > 	Maxim Levitsky
> > 
> 

The issues caused by these commits

        e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c (breaks  S3)
        ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c (breaks swsusp, I don't use it, but I tested it)
        259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c (breaks uswsusp, that I use)

do not go away with NO_HZ disabled. Even worse, I tested Linus's latest git tree and got the same hang in both suspend to disk and suspend to RAM (before suspending).



>Mar 14 00:22:23 MAIN kernel: [    2.072875] caller is check_tsc_sync_source+0x1d/0x100
>Mar 14 00:22:23 MAIN kernel: [    2.072878]  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
>Mar 14 00:22:23 MAIN kernel: [    2.072881]  [show_trace+18/32] show_trace+0x12/0x20
>Mar 14 00:22:23 MAIN kernel: [    2.072884]  [dump_stack+22/32] dump_stack+0x16/0x20
>Mar 14 00:22:23 MAIN kernel: [    2.072887]  [debug_smp_processor_id+173/176] debug_smp_processor_id+0xad/0x


^ This thing is fixed (sorry for bothering you)


>Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
>Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off

^ I don't think this one is related to NO_HZ; maybe it is a hardware problem, but it exists without NO_HZ


>[   36.217405] ENABLING IO-APIC IRQs
>[   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
>[   36.433917] APIC timer disabled due to verification failure.

This one is being discussed now; I will look at it, and it is not related to NO_HZ

And the S3 hangs, as I said, occur even in 2.6.20


And I forgot to mention another problem (with high-resolution timers, as I now know):
before suspend to RAM the APIC timer is used and HPET is not:

root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
lapic                F:0007 M:3(periodic) C: 1
hpet                 F:0003 M:1(shutdown) C: 0
lapic                F:0007 M:3(periodic) C: 0
root@MAIN:/home/maxim#   


But after suspend to ram HPET is 'woken'

root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
lapic                F:0007 M:3(one shoot) C: 1
hpet                 F:0003 M:3(one shoot) C: 0
lapic                F:0007 M:3(one shoot) C: 0

Note that I added those (one shoot), (periodic) descriptions, would be nice to have them in kernel, can I send a patch ?  ;-)

and I see average of 18 IRQs/sec on IRQ 0


Best regards,
	Maxim Levitsky



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 23:39 ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Thomas Gleixner
@ 2007-03-17 23:01   ` Maxim
  0 siblings, 0 replies; 37+ messages in thread
From: Maxim @ 2007-03-17 23:01 UTC (permalink / raw)
  To: tglx; +Cc: linux-kernel

On Saturday 17 March 2007 01:39:01 Thomas Gleixner wrote:
> On Fri, 2007-03-16 at 12:30 +0200, Maxim Levitsky wrote:
> > Mar 14 00:22:23 MAIN kernel: [    2.072875] caller is check_tsc_sync_source+0x1d/0x100
> > Mar 14 00:22:23 MAIN kernel: [    2.072878]  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
> > Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> > Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> > 
> > It looks clear that preempt is enabled all the way in second cpu initialization, ( I think that at least in check_tsc_sync_source, it should be disabled,
> > shouldn't it ? )
> 
> This should be fixed by commit d04f41e35343f1d788551fd3f753f51794f4afcf
> 
> 	tglx
> 
> 
> 
> 

Hi,

Yes, it is fixed, thanks

Regards,
	Maxim Levitsky


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17 23:00   ` Maxim
@ 2007-03-17 23:32     ` Thomas Gleixner
  2007-12-30  7:50       ` Mike Galbraith
  2007-03-20 11:54     ` sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far) Pavel Machek
  1 sibling, 1 reply; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-17 23:32 UTC (permalink / raw)
  To: Maxim; +Cc: Len Brown, linux-kernel

Maxim,

On Sun, 2007-03-18 at 01:00 +0200, Maxim wrote:
> >Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> >Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> 
> ^ I don't think this one is related to NO_HZ; maybe it is a hardware
> problem, but it exists without NO_HZ

The TSC is checked for synchronization between the CPUs. It's nothing to
worry about. We switch off the TSC and use a different clocksource.
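
The idea of the warp check can be sketched like this. It is a simplified,
single-threaded model, assuming the TSC readings of both CPUs have already
been collected in the order they were taken; the struct and function names
are hypothetical and this is not the kernel's implementation, which takes
the readings live on both CPUs under a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One TSC reading, tagged with the CPU that took it (illustrative type). */
struct tsc_sample {
	int cpu;
	uint64_t tsc;
};

/*
 * Walk the readings in the order they were taken and return the largest
 * amount by which a reading went backwards relative to the previous one.
 * A result of 0 means time never appeared to run backwards.
 */
static uint64_t max_tsc_warp(const struct tsc_sample *s, size_t n)
{
	uint64_t prev = 0, max_warp = 0;
	size_t i;

	assert(s != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (s[i].tsc < prev && prev - s[i].tsc > max_warp)
			max_warp = prev - s[i].tsc;
		prev = s[i].tsc;
	}
	return max_warp;
}
```

If any reading is smaller than the one taken just before it, the two TSCs
cannot be used as one monotonic clock, and falling back to another
clocksource is the safe choice.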

Is this after resume ? If yes, then something (probably BIOS) is
fiddling with the TSC of one CPU when the resume happens.

> >[   36.217405] ENABLING IO-APIC IRQs
> >[   36.217587] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> >[   36.433917] APIC timer disabled due to verification failure.
> 
> This one is now discussed, I will look at it and it is not related to NO_HZ

I sent a patch for this yesterday:

http://marc.info/?l=linux-kernel&m=117408952322631&w=2

> And I forgot to mention another problem (with high-resolution timers, as I now know):
> before suspend to RAM the APIC timer is used and HPET is not:
> 
> root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> lapic                F:0007 M:3(periodic) C: 1
> hpet                 F:0003 M:1(shutdown) C: 0
> lapic                F:0007 M:3(periodic) C: 0
> root@MAIN:/home/maxim#   
> 
> But after suspend to ram HPET is 'woken'
> 
> root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> lapic                F:0007 M:3(one shoot) C: 1
> hpet                 F:0003 M:3(one shoot) C: 0
> lapic                F:0007 M:3(one shoot) C: 0

This is unrelated to suspend / resume. The local apic timers stop
(hardware madness), when the CPU enters C3 power state. In this case we
switch to HPET (or PIT when HPET is not available) and broadcast the
events via Inter Processor Interrupts. This is nothing to worry about. 

I'm a bit surprised though, that your system was in periodic mode before
suspend and switched to one shot mode on resume.

Is this reproducible ? If yes, can you please provide the dmesg output
from boot to resume ?

> Note that I added those (one shoot), (periodic) descriptions, would be
> nice to have them in kernel, can I send a patch ?  ;-)

Sure, just s/shoot/shot/ :)

> and I see average of 18 IRQs/sec on IRQ 0

So dynticks are working as expected.

	tglx




* Re: [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
  2007-03-17  7:22     ` Ingo Molnar
  2007-03-17 13:24     ` Andi Kleen
@ 2007-03-18  8:12     ` Andrew Morton
  2007-03-18  8:45       ` Thomas Gleixner
  2 siblings, 1 reply; 37+ messages in thread
From: Andrew Morton @ 2007-03-18  8:12 UTC (permalink / raw)
  To: tglx
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Adrian Bunk,
	Arjan van de Ven, Len Brown

On Sat, 17 Mar 2007 01:04:56 +0100 Thomas Gleixner <tglx@linutronix.de> wrote:

> When PM-Timer is available for local APIC timer calibration we can skip
> the verification of the calibrated time value. The resulting error is
> quite small on a bunch of evaluated platforms and is less harming than
> the observed false positives.
> 
> We need to keep the verification on systems, which have no PM-Timer to
> avoid bogus local APIC timer calibrations in the range of factor 2-10,
> which can be observed when switching off the PM-timer support in the
> kernel configuration.
> 
> The wrong calibration values are probably caused by SMM code trying to
> emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard.
> This prohibits the accurate delivery of PIT interrupts, which are used
> to calibrate the local APIC timer. Unfortunately we have no way to
> disable this BIOS misfeature in the early boot process.
> 
> Add also the dropped cpu_relax() back to the wait loops.

Is this a for-2.6.21 thing?


* Re: [PATCH] i386: trust the PM-Timer calibration of the local APIC timer
  2007-03-18  8:12     ` Andrew Morton
@ 2007-03-18  8:45       ` Thomas Gleixner
  0 siblings, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-18  8:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Adrian Bunk,
	Arjan van de Ven, Len Brown

On Sun, 2007-03-18 at 00:12 -0800, Andrew Morton wrote:
> On Sat, 17 Mar 2007 01:04:56 +0100 Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > When PM-Timer is available for local APIC timer calibration we can skip
> > the verification of the calibrated time value. The resulting error is
> > quite small on a bunch of evaluated platforms and is less harming than
> > the observed false positives.
> > 
> > We need to keep the verification on systems, which have no PM-Timer to
> > avoid bogus local APIC timer calibrations in the range of factor 2-10,
> > which can be observed when switching off the PM-timer support in the
> > kernel configuration.
> > 
> > The wrong calibration values are probably caused by SMM code trying to
> > emulate a PS/2 keyboard from a (maybe connected or not) USB keyboard.
> > This prohibits the accurate delivery of PIT interrupts, which are used
> > to calibrate the local APIC timer. Unfortunately we have no way to
> > disable this BIOS misfeature in the early boot process.
> > 
> > Add also the dropped cpu_relax() back to the wait loops.
> 
> Is this a for-2.6.21 thing?

Yes please. The false positives of the original calibration are
annoying.

	tglx




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17 13:26       ` Andi Kleen
@ 2007-03-20  4:27         ` Greg KH
  2007-03-20  6:30           ` Thomas Gleixner
                             ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Greg KH @ 2007-03-20  4:27 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Arjan van de Ven, Len Brown, tglx, Maxim Levitsky, linux-kernel,
	Ingo Molnar, Andrew Morton, Adrian Bunk

On Sat, Mar 17, 2007 at 02:26:57PM +0100, Andi Kleen wrote:
> Arjan van de Ven <arjan@infradead.org> writes:
> > 
> > well we can do the handshake to take ownership like we do much later in
> > boot, but that requires PCI to be there and fully discovered, which we
> > don't have this early.
> 
> That's not true - we do early PCI discovery. Doing the USB handoff
> there would be quite possible.

What, we don't do USB "handoff" early enough in the boot process?  It's
happening at PCI quirk time now, which I think should be early enough
for everyone (and too early for some who rely on USB keyboards and
initramfs shells...)

thanks,

greg k-h


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-16 23:44 ` Thomas Gleixner
  2007-03-17  0:04   ` [PATCH] i386: trust the PM-Timer calibration of the local APIC timer Thomas Gleixner
  2007-03-17  1:32   ` [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far Len Brown
@ 2007-03-20  5:04   ` Lee Revell
  2007-03-20  5:36     ` Eric St-Laurent
  2 siblings, 1 reply; 37+ messages in thread
From: Lee Revell @ 2007-03-20  5:04 UTC (permalink / raw)
  To: tglx
  Cc: Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven, Len Brown

On 3/16/07, Thomas Gleixner <tglx@linutronix.de> wrote:
> Yes, this is probably caused by SMM code trying to emulate a PS/2
> keyboard from a (maybe connected or not) USB keyboard. Unfortunately we
> have no way to disable this BIOS misfeature in the early boot process.

https://mail.rtai.org/pipermail/rtai/2003-March/002949.html

http://www.embeddedrelated.com/usenet/embedded/show/50333-1.php

I think CONFIG_TRY_TO_DISABLE_SMI would be excellent for debugging,
not to mention people trying to spec out hardware for RT
applications...

Lee


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  5:04   ` Lee Revell
@ 2007-03-20  5:36     ` Eric St-Laurent
  2007-03-20  9:15       ` Arjan van de Ven
  0 siblings, 1 reply; 37+ messages in thread
From: Eric St-Laurent @ 2007-03-20  5:36 UTC (permalink / raw)
  To: Lee Revell
  Cc: tglx, Maxim Levitsky, linux-kernel, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Arjan van de Ven, Len Brown

On Tue, 2007-20-03 at 01:04 -0400, Lee Revell wrote:

> I think CONFIG_TRY_TO_DISABLE_SMI would be excellent for debugging,
> not to mention people trying to spec out hardware for RT
> applications...

There is an SMI-disabling module in RTAI; see smi-module.c in this tarball:

https://www.rtai.org/RTAI/rtai-3.5.tar.bz2

More info:

http://www.captain.at/rtai-smi-high-latency.php
http://www.captain.at/xenomai-smi-high-latency.php

It might make sense to merge this code, at least in the -rt tree.


- Eric




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  4:27         ` Greg KH
@ 2007-03-20  6:30           ` Thomas Gleixner
  2007-03-20  9:14           ` Arjan van de Ven
  2007-03-20 11:36           ` Andi Kleen
  2 siblings, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-20  6:30 UTC (permalink / raw)
  To: Greg KH
  Cc: Andi Kleen, Arjan van de Ven, Len Brown, Maxim Levitsky,
	linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk

On Mon, 2007-03-19 at 21:27 -0700, Greg KH wrote:
> On Sat, Mar 17, 2007 at 02:26:57PM +0100, Andi Kleen wrote:
> > Arjan van de Ven <arjan@infradead.org> writes:
> > > 
> > > well we can do the handshake to take ownership like we do much later in
> > > boot, but that requires PCI to be there and fully discovered, which we
> > > don't have this early.
> > 
> > That's not true - we do early pci discovery. Doing USB handsoff
> > there would be quite possible.
> 
> What, we don't do USB "handoff" early enough in the boot process?  It's
> happening at PCI quirk time now, which I think should be early enough
> for everyone (and too early for some who rely on USB keyboards and
> initramfs shells...)

It happens way after the CPUs are brought up. At this point both the
delay loop calibration and the local APIC calibration are already done.

	tglx
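
A toy model may help show why an SMI landing inside a calibration window
matters. The sketch below assumes a calibration that counts busy-loop
iterations during one reference tick; the function names and all numbers are
made up for illustration and are not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of a "count loop iterations during one tick" calibration.
 * Hypothetical names, not kernel code.
 *
 * loops_per_us : iterations the CPU really executes per microsecond
 * tick_us      : length of the reference tick in microseconds
 * smi_us       : time inside that tick spent in SMM, where the loop
 *                makes no progress
 */
static uint64_t calibrate_loops_per_tick(uint64_t loops_per_us,
                                         uint64_t tick_us,
                                         uint64_t smi_us)
{
    assert(smi_us <= tick_us);
    /* The tick keeps its wall-clock length, but the loop only runs
     * while the CPU is outside of SMM. */
    return loops_per_us * (tick_us - smi_us);
}

/*
 * How long a requested delay of want_us really lasts when it is built
 * from a (possibly skewed) calibration result and then runs undisturbed.
 */
static uint64_t actual_delay_us(uint64_t calibrated_loops_per_tick,
                                uint64_t tick_us,
                                uint64_t loops_per_us,
                                uint64_t want_us)
{
    uint64_t loops = calibrated_loops_per_tick * want_us / tick_us;

    return loops / loops_per_us;
}
```

In this model, with 1000 loops per microsecond and a 4000 us tick, 1000 us
spent in SMM yields 3000000 loops per tick instead of 4000000, so a requested
100 us delay lasts only about 75 us afterwards.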




* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  4:27         ` Greg KH
  2007-03-20  6:30           ` Thomas Gleixner
@ 2007-03-20  9:14           ` Arjan van de Ven
  2007-03-20 11:36           ` Andi Kleen
  2 siblings, 0 replies; 37+ messages in thread
From: Arjan van de Ven @ 2007-03-20  9:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Andi Kleen, Len Brown, tglx, Maxim Levitsky, linux-kernel,
	Ingo Molnar, Andrew Morton, Adrian Bunk

On Mon, 2007-03-19 at 21:27 -0700, Greg KH wrote:
> On Sat, Mar 17, 2007 at 02:26:57PM +0100, Andi Kleen wrote:
> > Arjan van de Ven <arjan@infradead.org> writes:
> > > 
> > > well we can do the handshake to take ownership like we do much later in
> > > boot, but that requires PCI to be there and fully discovered, which we
> > > don't have this early.
> > 
> > That's not true - we do early pci discovery. Doing USB handsoff
> > there would be quite possible.
> 
> What, we don't do USB "handoff" early enough in the boot process?  It's
> happening at PCI quirk time now, which I think should be early enough
> for everyone (and too early for some who rely on USB keyboards and
> initramfs shells...)

It's not early enough for this bug, where the SMM code is ruining the
CPU calibrations :)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  5:36     ` Eric St-Laurent
@ 2007-03-20  9:15       ` Arjan van de Ven
  2007-03-20 18:04         ` Andy Lutomirski
  2007-03-20 22:58         ` Eric St-Laurent
  0 siblings, 2 replies; 37+ messages in thread
From: Arjan van de Ven @ 2007-03-20  9:15 UTC (permalink / raw)
  To: Eric St-Laurent
  Cc: Lee Revell, tglx, Maxim Levitsky, linux-kernel, Ingo Molnar,
	Andrew Morton, Adrian Bunk, Len Brown

On Tue, 2007-03-20 at 01:36 -0400, Eric St-Laurent wrote:
> On Tue, 2007-20-03 at 01:04 -0400, Lee Revell wrote:
> 
> > I think CONFIG_TRY_TO_DISABLE_SMI would be excellent for debugging,
> > not to mention people trying to spec out hardware for RT
> > applications...
> 
> There is a SMI disabling module in RTAI, check the smi-module.c in this:
> 
> https://www.rtai.org/RTAI/rtai-3.5.tar.bz2
> 
> More infos:
> 
> http://www.captain.at/rtai-smi-high-latency.php
> http://www.captain.at/xenomai-smi-high-latency.php
> 
> It might make sense to merge this code, at least in the -rt tree.

it NEVER makes sense to disable SMM.

SMM is there to ensure that your hardware doesn't get physically
damaged.

disabling that is a BAD idea. I'm no fan of SMM myself, but it's there,
and we have to live with it. Disabling it without knowing what it does
on your system is madness.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  4:27         ` Greg KH
  2007-03-20  6:30           ` Thomas Gleixner
  2007-03-20  9:14           ` Arjan van de Ven
@ 2007-03-20 11:36           ` Andi Kleen
  2007-03-20 11:41             ` Oliver Neukum
  2 siblings, 1 reply; 37+ messages in thread
From: Andi Kleen @ 2007-03-20 11:36 UTC (permalink / raw)
  To: Greg KH
  Cc: Andi Kleen, Arjan van de Ven, Len Brown, tglx, Maxim Levitsky,
	linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk

On Mon, Mar 19, 2007 at 09:27:34PM -0700, Greg KH wrote:
> On Sat, Mar 17, 2007 at 02:26:57PM +0100, Andi Kleen wrote:
> > Arjan van de Ven <arjan@infradead.org> writes:
> > > 
> > > well we can do the handshake to take ownership like we do much later in
> > > boot, but that requires PCI to be there and fully discovered, which we
> > > don't have this early.
> > 
> > That's not true - we do early pci discovery. Doing USB handsoff
> > there would be quite possible.
> 
> What, we don't do USB "handoff" early enough in the boot process?  It's
> happening at PCI quirk time now, which I think should be early enough
> for everyone (and too early for some who rely on USB keyboards and

Early for drivers, but quite late for architecture initialization.

> initramfs shells...)

It's long after timer calibration, which is what it interfered with here.

To handle that, it would need to be moved to the x86 early quirks and
use boot_ioremap etc. It would probably be somewhat messy, but doable.

-Andi



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20 11:36           ` Andi Kleen
@ 2007-03-20 11:41             ` Oliver Neukum
  0 siblings, 0 replies; 37+ messages in thread
From: Oliver Neukum @ 2007-03-20 11:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Greg KH, Arjan van de Ven, Len Brown, tglx, Maxim Levitsky,
	linux-kernel, Ingo Molnar, Andrew Morton, Adrian Bunk

On Tuesday, 20 March 2007 12:36, Andi Kleen wrote:
> It's long after timer calibration, which is what it interfered with here.
> 
> To handle that it would need to be moved to the x86 early quirks and
> use boot_ioremap etc. It would be probably somewhat messy, but doable.

USB is not specific to x86. And not necessarily the only user of SMM.
Is this really necessary?

	Regards
		Oliver


* sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far)
  2007-03-17 23:00   ` Maxim
  2007-03-17 23:32     ` Thomas Gleixner
@ 2007-03-20 11:54     ` Pavel Machek
  2007-03-22 15:28       ` Greg KH
  1 sibling, 1 reply; 37+ messages in thread
From: Pavel Machek @ 2007-03-20 11:54 UTC (permalink / raw)
  To: Maxim, Greg KH, Andrew Morton; +Cc: Len Brown, linux-kernel

Hi!

> root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> lapic                F:0007 M:3(periodic) C: 1
> hpet                 F:0003 M:1(shutdown) C: 0
> lapic                F:0007 M:3(periodic) C: 0
> root@MAIN:/home/maxim#   

Now... this file needs to die before 2.6.21 is released. It tries to
bring a /proc-like parsing nightmare to sysfs. Kill it before it becomes
part of the stable ABI!
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  9:15       ` Arjan van de Ven
@ 2007-03-20 18:04         ` Andy Lutomirski
  2007-03-20 22:58         ` Eric St-Laurent
  1 sibling, 0 replies; 37+ messages in thread
From: Andy Lutomirski @ 2007-03-20 18:04 UTC (permalink / raw)
  To: Arjan van de Ven, linux-kernel
  Cc: Eric St-Laurent, Lee Revell, tglx, Maxim Levitsky, linux-kernel,
	Ingo Molnar, Andrew Morton, Adrian Bunk, Len Brown

Arjan van de Ven wrote:
> On Tue, 2007-03-20 at 01:36 -0400, Eric St-Laurent wrote:
>> On Tue, 2007-20-03 at 01:04 -0400, Lee Revell wrote:
>>
>>> I think CONFIG_TRY_TO_DISABLE_SMI would be excellent for debugging,
>>> not to mention people trying to spec out hardware for RT
>>> applications...
>> There is a SMI disabling module in RTAI, check the smi-module.c in this:
>>
>> https://www.rtai.org/RTAI/rtai-3.5.tar.bz2
>>
>> More infos:
>>
>> http://www.captain.at/rtai-smi-high-latency.php
>> http://www.captain.at/xenomai-smi-high-latency.php
>>
>> It might make sense to merge this code, at least in the -rt tree.
> 
> it NEVER makes sense to disable SMM.
> 
> SMM is there to ensure that your hardware doesn't get physically
> damaged.
> 
> disabling that is a BAD idea. I'm no fan of SMM myself, but it's there,
> and we have to live with it. Disabling it without knowing what it does
> on your system is madness.
> 

How about disabling it long enough to calibrate the timers and then 
turning it back on?

--Andy

(apologies if anyone gets duplicates of this.  i'm encountering 
nightly-thunderbird-build bugs.)


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-20  9:15       ` Arjan van de Ven
  2007-03-20 18:04         ` Andy Lutomirski
@ 2007-03-20 22:58         ` Eric St-Laurent
  1 sibling, 0 replies; 37+ messages in thread
From: Eric St-Laurent @ 2007-03-20 22:58 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Lee Revell, tglx, Maxim Levitsky, linux-kernel, Ingo Molnar,
	Andrew Morton, Adrian Bunk, Len Brown

On Tue, 2007-20-03 at 10:15 +0100, Arjan van de Ven wrote:

> disabling that is a BAD idea. I'm no fan of SMM myself, but it's there,
> and we have to live with it. Disabling it without knowing what it does
> on your system is madness.
> 

Like Lee said, for "debugging", mainly trying to resolve unexplained
long latencies.

I once had a laptop that caused latency spikes when the CPU fan was turned
on. I tried disabling SMI to diagnose the problem, with no success.

My current system has a BIOS feature to control fan speed according to
temperature. I presume this must use an SMI to work, right?  In that case it
should be possible to find and disable the related SMI and replace the
fan control with user-space software.

Of course it's not wise to blindly disable SMIs as we don't precisely
know what they do. 


- Eric
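
One way to look for such unexplained latencies without touching SMM at all is
to read a clock back to back and look for gaps. The sketch below is a
user-space illustration only; the helper names are hypothetical, and a large
gap can just as well come from ordinary preemption or interrupts as from an
SMI, so it is a hint rather than proof.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read CLOCK_MONOTONIC as nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Fill stamps[] with n back-to-back clock readings. */
static void sample_clock(uint64_t *stamps, size_t n)
{
    for (size_t i = 0; i < n; i++)
        stamps[i] = now_ns();
}

/*
 * Return the largest gap between consecutive readings and count how
 * many gaps exceed threshold_ns. Gaps far above the usual cost of a
 * clock read mean the CPU was busy elsewhere during that interval.
 */
static uint64_t max_gap_ns(const uint64_t *stamps, size_t n,
                           uint64_t threshold_ns, size_t *over_threshold)
{
    uint64_t max_gap = 0;

    assert(stamps != NULL && over_threshold != NULL);
    *over_threshold = 0;
    for (size_t i = 1; i < n; i++) {
        uint64_t gap = stamps[i] - stamps[i - 1];

        if (gap > max_gap)
            max_gap = gap;
        if (gap > threshold_ns)
            (*over_threshold)++;
    }
    return max_gap;
}
```

Running sample_clock() on a pinned, otherwise idle CPU and feeding the result
to max_gap_ns() with a threshold of a few tens of microseconds would be one
plausible way to use it; the right threshold is a guess that depends on the
machine.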




* Re: sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far)
  2007-03-20 11:54     ` sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far) Pavel Machek
@ 2007-03-22 15:28       ` Greg KH
  2007-03-22 15:41         ` Thomas Gleixner
  2007-03-23  1:24         ` sysfs q [was: sysfs ugly timer interface] Jan Engelhardt
  0 siblings, 2 replies; 37+ messages in thread
From: Greg KH @ 2007-03-22 15:28 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Maxim, Andrew Morton, Len Brown, linux-kernel

On Tue, Mar 20, 2007 at 11:54:03AM +0000, Pavel Machek wrote:
> Hi!
> 
> > root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> > lapic                F:0007 M:3(periodic) C: 1
> > hpet                 F:0003 M:1(shutdown) C: 0
> > lapic                F:0007 M:3(periodic) C: 0
> > root@MAIN:/home/maxim#   
> 
> Now... this file needs to die, before 2.6.21 is released. It tries to
> bring /proc-like parsing nightmare to sysfs. Kill it before it becomes
> part of stable ABI!

Eeek!

I agree, that needs to be fixed now.

Remember, 1 value per file in sysfs!  Shall I just submit a patch
ripping it out for now?

thanks,

greg k-h
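
To make the objection concrete, here is a rough user-space sketch of what a
consumer has to do with each line of the "registered" file as quoted above,
next to what it would do with one value per file. The struct, the function
names and the reading of the F:/M:/C: columns are assumptions made for
illustration; nothing here is taken from the kernel source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One row of the multi-value file. Field meanings are guesses. */
struct clockevent_row {
    char name[16];
    unsigned int f;    /* "F:" column, printed in hex */
    int m;             /* "M:" numeric value */
    char m_name[16];   /* text inside the parentheses */
    int c;             /* "C:" column */
};

/* Multi-value style: every consumer must know the exact line layout. */
static int parse_registered_line(const char *line, struct clockevent_row *row)
{
    int n;

    assert(line != NULL && row != NULL);
    n = sscanf(line, "%15s F:%x M:%d(%15[^)]) C: %d",
               row->name, &row->f, &row->m, row->m_name, &row->c);
    return n == 5 ? 0 : -1;
}

/* One-value-per-file style: the file content is just a number. */
static int parse_single_value(const char *text, long *out)
{
    char *end;
    long v;

    assert(text != NULL && out != NULL);
    v = strtol(text, &end, 10);
    if (end == text)
        return -1;
    *out = v;
    return 0;
}
```

The first parser breaks as soon as a column is added or reordered, while the
second keeps working for any attribute that stays a single number.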


* Re: sysfs ugly timer interface (was Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far)
  2007-03-22 15:28       ` Greg KH
@ 2007-03-22 15:41         ` Thomas Gleixner
  2007-03-23  1:24         ` sysfs q [was: sysfs ugly timer interface] Jan Engelhardt
  1 sibling, 0 replies; 37+ messages in thread
From: Thomas Gleixner @ 2007-03-22 15:41 UTC (permalink / raw)
  To: Greg KH; +Cc: Pavel Machek, Maxim, Andrew Morton, Len Brown, linux-kernel

On Thu, 2007-03-22 at 08:28 -0700, Greg KH wrote:
> On Tue, Mar 20, 2007 at 11:54:03AM +0000, Pavel Machek wrote:
> > Hi!
> > 
> > > root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> > > lapic                F:0007 M:3(periodic) C: 1
> > > hpet                 F:0003 M:1(shutdown) C: 0
> > > lapic                F:0007 M:3(periodic) C: 0
> > > root@MAIN:/home/maxim#   
> > 
> > Now... this file needs to die, before 2.6.21 is released. It tries to
> > bring /proc-like parsing nightmare to sysfs. Kill it before it becomes
> > part of stable ABI!
> 
> Eeek!
> 
> I agree, that needs to be fixed now.
> 
> Remember, 1 value per file in sysfs!  Shall I just submit a patch
> ripping it out for now?

I'll fix it.

	tglx




* Re: sysfs q [was: sysfs ugly timer interface]
  2007-03-22 15:28       ` Greg KH
  2007-03-22 15:41         ` Thomas Gleixner
@ 2007-03-23  1:24         ` Jan Engelhardt
  2007-03-23  4:48           ` Greg KH
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Engelhardt @ 2007-03-23  1:24 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux Kernel Mailing List


On Mar 22 2007 08:28, Greg KH wrote:
>> 
>> > root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
>> > lapic                F:0007 M:3(periodic) C: 1
>> > hpet                 F:0003 M:1(shutdown) C: 0
>> > lapic                F:0007 M:3(periodic) C: 0
>> > root@MAIN:/home/maxim#   
>> 
>> Now... this file needs to die, before 2.6.21 is released. It tries to
>> bring /proc-like parsing nightmare to sysfs. Kill it before it becomes
>> part of stable ABI!
>
>Eeek!

Question regarding sysfs files: How would you do something like
/proc/net/nf_conntrack with sysfs? Have directories named like 0000, 
0001, 0002, ..?


Jan
-- 


* Re: sysfs q [was: sysfs ugly timer interface]
  2007-03-23  1:24         ` sysfs q [was: sysfs ugly timer interface] Jan Engelhardt
@ 2007-03-23  4:48           ` Greg KH
  2007-03-23  6:05             ` Jan Engelhardt
  0 siblings, 1 reply; 37+ messages in thread
From: Greg KH @ 2007-03-23  4:48 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Linux Kernel Mailing List

On Fri, Mar 23, 2007 at 02:24:46AM +0100, Jan Engelhardt wrote:
> 
> On Mar 22 2007 08:28, Greg KH wrote:
> >> 
> >> > root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered
> >> > lapic                F:0007 M:3(periodic) C: 1
> >> > hpet                 F:0003 M:1(shutdown) C: 0
> >> > lapic                F:0007 M:3(periodic) C: 0
> >> > root@MAIN:/home/maxim#   
> >> 
> >> Now... this file needs to die, before 2.6.21 is released. It tries to
> >> bring /proc-like parsing nightmare to sysfs. Kill it before it becomes
> >> part of stable ABI!
> >
> >Eeek!
> 
> Question regarding sysfs files: How would you do something like
> /proc/net/nf_conntrack with sysfs? Have directories named like 0000, 
> 0001, 0002, ..?

I don't know, I've never said that _all_ proc files can move to sysfs.
For some things, like possibly the netfilter stuff, proc files make more
sense.

Were you thinking of moving this file to sysfs?  What does the
information in it represent?

thanks,

greg k-h


* Re: sysfs q [was: sysfs ugly timer interface]
  2007-03-23  4:48           ` Greg KH
@ 2007-03-23  6:05             ` Jan Engelhardt
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Engelhardt @ 2007-03-23  6:05 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux Kernel Mailing List


On Mar 22 2007 21:48, Greg KH wrote:
>On Fri, Mar 23, 2007 at 02:24:46AM +0100, Jan Engelhardt wrote:
>> On Mar 22 2007 08:28, Greg KH wrote:
>> 
>> Question regarding sysfs files: How would you do something like 
>> /proc/net/nf_conntrack with sysfs? Have directories named like 0000, 
>> 0001, 0002, ..?
>
>I don't know, I've never said that _all_ proc files can move to sysfs. 
>For some things, like possibly the netfilter stuff, proc files make 
>more sense.

But proc is for procs. (At least that is what its name indicates.)

>Were you thinking of moving this file to sysfs?

No, not that one. But new modules. Everyone says "please no new /proc 
files"[some examples, 1,2]. On the other hand,

[1] http://lkml.org/lkml/2007/1/21/34
[2] http://lkml.org/lkml/2005/2/3/285

>>>> root@MAIN:/home/maxim# cat /sys/devices/system/clockevents/clockevents0/registered                         
>>>> lapic                F:0007 M:3(periodic) C: 1
>>>> hpet                 F:0003 M:1(shutdown) C: 0
>>>> lapic                F:0007 M:3(periodic) C: 0
>>>> root@MAIN:/home/maxim#
>>>                                                                            
>>> Now... this file needs to die, before 2.6.21 is released. It tries to
>>> bring /proc-like parsing nightmare to sysfs. Kill it before it becomes
>>> part of stable ABI!

when there's a proc-style multi-line file like that clockevents thing in 
sysfs, people raise objections too (see above), which leads me to the 
question: if neither procfs nor sysfs are appropriate for such files, 
what is?


>What does the information in it represent?

A list of the currently tracked connections.



Jan
-- 


* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-03-17 23:32     ` Thomas Gleixner
@ 2007-12-30  7:50       ` Mike Galbraith
  2007-12-30 14:53         ` Ingo Molnar
  0 siblings, 1 reply; 37+ messages in thread
From: Mike Galbraith @ 2007-12-30  7:50 UTC (permalink / raw)
  To: tglx; +Cc: Maxim, Len Brown, linux-kernel

(hm, google says i'm not the only one seeing this, so...)

On Sun, 2007-03-18 at 00:32 +0100, Thomas Gleixner wrote:
> Maxim,
> 
> On Sun, 2007-03-18 at 01:00 +0200, Maxim wrote:
> > >Mar 14 00:22:23 MAIN kernel: [    2.072931] checking TSC synchronization [CPU#0 -> CPU#1]:
> > >Mar 14 00:22:23 MAIN kernel: [    2.092922] Measured 72051818872 cycles TSC warp between CPUs, turning off
> > 
> > ^ This one I don't think is related to NO_HZ, maybe it is hardware
> > problem, but it exist without NO_HZ
> 
> The TSC is checked for synchronization between the CPUs. It's nothing to
> worry about. We switch off the TSC and use a different clocksource.
> 
> Is this after resume ? If yes, then something (probably BIOS) is
> fiddling with the TSC of one CPU when the resume happens.

My P4 box has the same "problem", which is remedied by..

diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 9125efe..7b74969 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -46,7 +46,7 @@ static __cpuinit void check_tsc_warp(void)
 	cycles_t start, now, prev, end;
 	int i;
 
-	start = get_cycles_sync();
+	start = last_tsc = get_cycles_sync();
 	/*
 	 * The measurement runs for 20 msecs:
 	 */

..whacking the ancient last_tsc before entering the test loop.  Question is,
is there a good reason to disable the TSC once it's been stepped on by the
BIOS?  Are there any ill effects to be expected from ignoring this BIOS
artifact?  All seems just fine here.

	-Mike
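
For readers following along, the effect of a stale last_tsc can be reproduced
in a single-threaded toy model. This is only a sketch: the real check runs on
two CPUs concurrently, the helper name and the reset_first flag are
hypothetical, and the readings are invented to resemble the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared "last value seen" slot, standing in for the kernel's last_tsc. */
static uint64_t last_tsc;

/*
 * Walk a sequence of counter readings and return the largest backwards
 * jump ("warp") relative to the previously recorded value.
 * If reset_first is nonzero, last_tsc is seeded from the first reading
 * before the loop, which mirrors the idea of the one-line patch above.
 */
static uint64_t check_warp(const uint64_t *readings, size_t n, int reset_first)
{
    uint64_t max_warp = 0;

    assert(readings != NULL && n > 0);
    if (reset_first)
        last_tsc = readings[0];

    for (size_t i = 0; i < n; i++) {
        uint64_t prev = last_tsc;
        uint64_t now = readings[i];

        last_tsc = now;
        /* A reading below the previous one counts as a warp. */
        if (now < prev && prev - now > max_warp)
            max_warp = prev - now;
    }
    return max_warp;
}
```

If a first pass leaves last_tsc at a large value and the counter then restarts
from a small one, a second pass without the reset reports a warp of nearly the
whole old counter value, even though the new readings are monotonic among
themselves. Seeding last_tsc first makes that second pass report zero.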



* Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far
  2007-12-30  7:50       ` Mike Galbraith
@ 2007-12-30 14:53         ` Ingo Molnar
  0 siblings, 0 replies; 37+ messages in thread
From: Ingo Molnar @ 2007-12-30 14:53 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: tglx, Maxim, Len Brown, linux-kernel, Andrew Morton


* Mike Galbraith <efault@gmx.de> wrote:

> > Is this after resume ? If yes, then something (probably BIOS) is 
> > fiddling with the TSC of one CPU when the resume happens.
> 
> My P4 box has the same "problem", which is remedied by..

> -	start = get_cycles_sync();
> +	start = last_tsc = get_cycles_sync();

this is slightly racy - your second patch that initializes things 
properly is the right solution IMO. I'm wondering, if others are seeing 
this too, should we make this a v2.6.24 item? It's a bit late for that, I 
think - although it shouldn't hurt.

	Ingo


