* 2.6.20-rc6-rt4 register_cpu_notification undefined
@ 2007-01-30 11:26 Rui Nuno Capela
  2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
  2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
  0 siblings, 2 replies; 30+ messages in thread
From: Rui Nuno Capela @ 2007-01-30 11:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-rt-users, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 229 bytes --]

Hi,

Just to let you know that this simple patch of mine solves
'register_cpu_notifier' being undefined when SMP is set but
HOTPLUG_CPU is not.

Dunno if it's the right thing, tho.

Cheers
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

[-- Attachment #2: linux-2.6.20-rc6-rt4-cpu.patch --]
[-- Type: application/octet-stream, Size: 357 bytes --]

--- linux/kernel/cpu.c.orig	2007-01-25 02:19:28.000000000 +0000
+++ linux/kernel/cpu.c	2007-01-30 09:43:22.000000000 +0000
@@ -75,10 +75,10 @@
 	return ret;
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
-
 EXPORT_SYMBOL(register_cpu_notifier);
 
+#ifdef CONFIG_HOTPLUG_CPU
+
 void unregister_cpu_notifier(struct notifier_block *nb)
 {
 	mutex_lock(&cpu_add_remove_lock);
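
Editorial aside, as a minimal user-space sketch rather than kernel code: the patch above moves the `#ifdef CONFIG_HOTPLUG_CPU` guard so that the export of `register_cpu_notifier` sits outside it. The toy below mimics that layout with a hypothetical export table. All `demo_`/`DEMO_` names are assumptions made for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* DEMO_CONFIG_HOTPLUG_CPU is a hypothetical stand-in for the kernel
 * option; it is deliberately left undefined here. */

typedef int (*demo_fn)(int);

/* Always built, like register_cpu_notifier() in the report. */
static int demo_register_notifier(int nb)
{
	return nb;
}

#ifdef DEMO_CONFIG_HOTPLUG_CPU
/* Only built when the hotplug option is set. */
static int demo_unregister_notifier(int nb)
{
	return -nb;
}
#endif

struct demo_export {
	const char *name;
	demo_fn fn;
};

/* Export table laid out as in the patched file: the register entry
 * sits outside the guard, the unregister entry inside it. */
static const struct demo_export demo_exports[] = {
	{ "demo_register_notifier", demo_register_notifier },
#ifdef DEMO_CONFIG_HOTPLUG_CPU
	{ "demo_unregister_notifier", demo_unregister_notifier },
#endif
};

/* Returns 1 if `name` appears in the export table, 0 otherwise. */
int demo_is_exported(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(demo_exports) / sizeof(demo_exports[0]); i++)
		if (strcmp(demo_exports[i].name, name) == 0)
			return 1;
	return 0;
}
```

With `DEMO_CONFIG_HOTPLUG_CPU` left undefined, the register entry is still present in the table while the unregister entry is compiled out, which is the situation the patch aims for.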

^ permalink raw reply	[flat|nested] 30+ messages in thread

* 2.6.20-rt5 Oops on boot
  2007-01-30 11:26 2.6.20-rc6-rt4 register_cpu_notification undefined Rui Nuno Capela
@ 2007-02-09 18:56 ` Rui Nuno Capela
  2007-02-16  0:46   ` 2.6.20-rt5 Oops on boot [-rt8 OK] Rui Nuno Capela
  2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
  1 sibling, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-02-09 18:56 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Rui Nuno Capela, linux-rt-users, linux-kernel

Hi,

I have terrible news: 2.6.20-rt5 does not boot at all on a couple of
machines I was brave enough to try -- a P4@3.3GHz SMP/HT desktop, and a
Core2 Duo T7200@2.0GHz laptop. For the first case I could capture the
following dump via serial console:

--BOF--
Linux version 2.6.20-rt5.1 (root@gamma-suse1) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP PREEMPT Fri Feb 9 18:30:22 WET 2007
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000e8000 size: 0000000000018000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 000000003fe30000 end: 000000003ff30000 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000003ff30000 size: 0000000000010000 end: 000000003ff40000 type: 3
copy_e820_map() start: 000000003ff40000 size: 00000000000b0000 end: 000000003fff0000 type: 4
copy_e820_map() start: 000000003fff0000 size: 0000000000010000 end: 0000000040000000 type: 2
copy_e820_map() start: 00000000ffb80000 size: 0000000000480000 end: 0000000100000000 type: 2
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003ff30000 (usable)
 BIOS-e820: 000000003ff30000 - 000000003ff40000 (ACPI data)
 BIOS-e820: 000000003ff40000 - 000000003fff0000 (ACPI NVS)
 BIOS-e820: 000000003fff0000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
Entering add_active_range(0, 0, 261936) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 ->   261936
early_node_map[1] active PFN ranges
    0:        0 ->   261936
On node 0 totalpages: 261936
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 254 pages used for memmap
  HighMem zone: 32306 pages, LIFO batch:7
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP (v002 ACPIAM                                ) @ 0x000f9e60
ACPI: XSDT (v001 A M I  OEMXSDT  0x08000320 MSFT 0x00000097) @ 0x3ff30100
ACPI: FADT (v003 A M I  OEMFACP  0x08000320 MSFT 0x00000097) @ 0x3ff30290
ACPI: MADT (v001 A M I  OEMAPIC  0x08000320 MSFT 0x00000097) @ 0x3ff30390
ACPI: OEMB (v001 A M I  OEMBIOS  0x08000320 MSFT 0x00000097) @ 0x3ff40040
ACPI: DSDT (v001  P4P81 P4P81086 0x00000086 INTL 0x02002026) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bfb80000)
Detected 3361.210 MHz processor.
Real-Time Preemption Support (C) 2004-2007 Ingo Molnar
Built 1 zonelists.  Total pages: 259890
Kernel command line: root=/dev/hda1 vga=0x31a resume=/dev/hda3 splash=silent console=tty0 console=ttyS0,115200n8 debug
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
WARNING: experimental RCU implementation.
CPU 0 irqstacks, hard=c03ed000 soft=c03eb000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1030292k/1047744k available (1725k kernel code, 16684k reserved, 1005k data, 220k init, 130240k highmem)
virtual kernel memory layout:
    fixmap  : 0xfff9c000 - 0xfffff000   ( 396 kB)
    pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
    vmalloc : 0xf8800000 - 0xff7fe000   ( 111 MB)
    lowmem  : 0xc0000000 - 0xf8000000   ( 896 MB)
      .init : 0xc03af000 - 0xc03e6000   ( 220 kB)
      .data : 0xc02af5dd - 0xc03aacf4   (1005 kB)
      .text : 0xc0100000 - 0xc02af5dd   (1725 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6723.97 BogoMIPS (lpj=3361989)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 00003080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 9k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c03ee000 soft=c03ec000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6720.94 BogoMIPS (lpj=3360473)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 00003080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
Total of 2 processors activated (13444.92 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000001c
 printing eip:
c011783d
*pde = 00000000
stopped custom tracer.
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU:    1
EIP:    0060:[<c011783d>]    Not tainted VLI
EFLAGS: 00010286   (2.6.20-rt5.1 #1)
EIP is at try_to_wake_up+0x11/0x395
eax: 00000000   ebx: 00000000   ecx: 00000000   edx: dfc87ccc
esi: 00000000   edi: c18a4700   ebp: dfc87cdc   esp: dfc87c90
ds: 007b   es: 007b   ss: 0068   preempt: 00000001
Process swapper (pid: 1, ti=dfc87000 task=dfca1670 task.ti=dfc87000)
Stack: 00000046 00000000 0000001f 00000008 c18a3d80 dfca1670 dfc87cc0 dfc87cb8
       00000046 dfca1670 00000001 dfc87cc4 c0136f34 dfc87cd0 c012dbc8 dfc87cf4
       00000000 c18a3d80 c18a4700 dfc87ce8 c0117c64 00000000 dfc87d3c c01180e6
Call Trace:
 [<c0117c64>] wake_up_process+0x19/0x1b
 [<c01180e6>] set_cpus_allowed+0x6c/0x92
 [<c0118140>] measure_one+0x34/0x165
 [<c0118953>] build_sched_domains+0x6e2/0xce4
 [<c0118f70>] arch_init_sched_domains+0x1b/0x1d
 [<c03c15dd>] sched_init_smp+0x10/0x47
 [<c010047d>] init+0xd0/0x335
 [<c0104b87>] kernel_thread_helper+0x7/0x10
 =======================
Code: 5d f0 89 4f 44 89 5f 48 8b 55 e8 89 f8 e8 99 e1 ff ff 83 c4 0c 5b 5e 5f 5d c3 55 89 e5 57 56 89 c6 53 83 ec 40 89 55 bc 8d 55 f0 <83> 78 1c 63 b8 00 00 00 00 0f 4f c1 89 45 b8 89 f0 e8 ca e1 ff
EIP: [<c011783d>] try_to_wake_up+0x11/0x395 SS:ESP 0068:dfc87c90
 <0>Kernel panic - not syncing: Attempted to kill init!
 [<c0104f7e>] dump_trace+0x63/0x1e5
 [<c010511a>] show_trace_log_lvl+0x1a/0x2f
 [<c010572a>] show_trace+0x12/0x14
 [<c01057bd>] dump_stack+0x16/0x18
 [<c011c791>] panic+0x50/0xf3
 [<c011f25b>] do_exit+0x9b/0x771
 [<c01056c3>] die+0x211/0x237
 [<c02add07>] do_page_fault+0x3f3/0x4bf
 [<c02ac39c>] error_code+0x7c/0x84
 [<c011783d>] try_to_wake_up+0x11/0x395
 [<c0117c64>] wake_up_process+0x19/0x1b
 [<c01180e6>] set_cpus_allowed+0x6c/0x92
 [<c0118140>] measure_one+0x34/0x165
 [<c0118953>] build_sched_domains+0x6e2/0xce4
 [<c0118f70>] arch_init_sched_domains+0x1b/0x1d
 [<c03c15dd>] sched_init_smp+0x10/0x47
 [<c010047d>] init+0xd0/0x335
 [<c0104b87>] kernel_thread_helper+0x7/0x10
 =======================
--EOF--

Hope it helps.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org
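
Editorial aside on reading the oops above: the dump shows eax at 00000000, a fault at virtual address 0000001c, and the marked code bytes `<83> 78 1c 63`, which on i386 decode as a compare of the 32-bit value at 0x1c(%eax) against 0x63. That is consistent with a field at offset 0x1c being read through a NULL pointer inside try_to_wake_up(). The sketch below only illustrates the arithmetic; `struct demo_task` is a hypothetical layout, not the real `struct task_struct`, and which field actually sits at 0x1c in that kernel is not stated in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only: seven 32-bit words of
 * padding place the next member at byte offset 7 * 4 = 0x1c. */
struct demo_task {
	uint32_t earlier_fields[7];
	int32_t  prio;
};

/* Address touched when a member `field_offset` bytes into an object
 * located at `base` is read. */
uintptr_t demo_fault_address(uintptr_t base, size_t field_offset)
{
	return base + field_offset;
}

/* Offset of the member the faulting compare would read in this toy. */
size_t demo_prio_offset(void)
{
	assert(sizeof(uint32_t) == 4);
	return offsetof(struct demo_task, prio);
}
```

Calling `demo_fault_address(0, demo_prio_offset())` models a NULL base pointer plus the member offset, which is how a small non-zero fault address such as the one in the dump typically arises.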


* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
  2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
@ 2007-02-16  0:46   ` Rui Nuno Capela
  2007-02-16  8:25     ` Ingo Molnar
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-02-16  0:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-rt-users, linux-kernel

Rui Nuno Capela (me) wrote:
> 
> I have terrible news: 2.6.20-rt5 does not boot at all on a couple of
> machines I was brave enough to try -- a P4@3.3GHz SMP/HT desktop, and a
> Core2 Duo T7200@2.0GHz laptop. For the first case I could capture the
> following dump via serial console:
> ...

The news is that 2.6.20-rt8 got it all back to business :)

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org


* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
  2007-02-16  0:46   ` 2.6.20-rt5 Oops on boot [-rt8 OK] Rui Nuno Capela
@ 2007-02-16  8:25     ` Ingo Molnar
  2007-02-19 12:38       ` Sergio Monteiro Basto
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2007-02-16  8:25 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: linux-rt-users, linux-kernel


* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Rui Nuno Capela (me) wrote:
> > 
> > I have terrible news: 2.6.20-rt5 does not boot at all on a couple of
> > machines I was brave enough to try -- a P4@3.3GHz SMP/HT desktop, and a
> > Core2 Duo T7200@2.0GHz laptop. For the first case I could capture the
> > following dump via serial console:
> > ...
> 
> The news is that 2.6.20-rt8 got it all back to business :)

great! The fix is from Michal/Clark/Steve.

	Ingo


* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
  2007-02-16  8:25     ` Ingo Molnar
@ 2007-02-19 12:38       ` Sergio Monteiro Basto
  0 siblings, 0 replies; 30+ messages in thread
From: Sergio Monteiro Basto @ 2007-02-19 12:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Rui Nuno Capela, linux-rt-users, linux-kernel

On Fri, 2007-02-16 at 09:25 +0100, Ingo Molnar wrote:
> > 
> > The news is that 2.6.20-rt8 got it all back to business :)

Yes! My Pentium D is back in business too.

--
Sérgio M. B.



* 2.6.21-rc5-rt6 make errors
  2007-01-30 11:26 2.6.20-rc6-rt4 register_cpu_notification undefined Rui Nuno Capela
  2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
@ 2007-04-01 17:12 ` Rui Nuno Capela
  2007-04-01 18:39   ` Ingo Molnar
  1 sibling, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-04-01 17:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-rt-users

Hi,

Just tried to build 2.6.21-rc5-rt6 and it is failing at build time.

...
kernel/sched.c: In function ‘__schedule’:
kernel/sched.c:3830: error: ‘print_functions’ undeclared (first use in
this function)
kernel/sched.c:3830: error: (Each undeclared identifier is reported only
once
kernel/sched.c:3830: error: for each function it appears in.)
...
arch/i386/kernel/apic.c: In function ‘smp_apic_timer_interrupt’:
arch/i386/kernel/apic.c:589: error: invalid lvalue in assignment
arch/i386/kernel/apic.c:608: error: invalid lvalue in assignment
...

Just to let you know that -rt5 was doing fine with the very same .config.

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org




* Re: 2.6.21-rc5-rt6 make errors
  2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
@ 2007-04-01 18:39   ` Ingo Molnar
  2007-04-03 23:49     ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela
  2007-05-25 20:58     ` 2.6.21-rt2..8 troubles Rui Nuno Capela
  0 siblings, 2 replies; 30+ messages in thread
From: Ingo Molnar @ 2007-04-01 18:39 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users


* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Hi,
> 
> Just tried to build 2.6.21-rc5-rt6 and it is failing at build time.

> Just to let you know that -rt5 was doing fine with the very same .config.

oops - indeed! I've uploaded -rc5-rt7 with the fix. (it includes a few 
other fixes as well)

note that for Fedora-ish distros there's an easy yum test-kernel 
available from the rt-testing repo:

cat > /etc/yum.repos.d/rt-testing.repo
[rt-testing]
name=Ingo's Real-Time (-rt) test-kernel for FC6
baseurl=http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/
enabled=1
gpgcheck=0
<Ctrl-D>

this includes Linus-git-bleeding-edge as well as the base kernel. (rt7 
is based on upstream HEAD 755948cfca16c7)

[the rt-testing repo currently includes the rt6 rpm, i've just started 
the rt7 build, it should be available in half an hour or so]

	Ingo


* 2.6.21-rc5-rt10 troubles
  2007-04-01 18:39   ` Ingo Molnar
@ 2007-04-03 23:49     ` Rui Nuno Capela
  2007-04-04  8:49       ` Ingo Molnar
  2007-05-25 20:58     ` 2.6.21-rt2..8 troubles Rui Nuno Capela
  1 sibling, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-04-03 23:49 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, linux-rt-users

Ingo et al.

I'm afraid I have no good news (once again). After building
2.6.21-rc5-rt8 and, more recently, -rt10, I've found some trouble running
on a Core2 T7200 laptop (SMP). Somehow, especially after starting jackd,
the whole system starts crawling to death. It just slows down to some
kind of Big Freeze, with no evidence on the console whatsoever, so that
I'm ultimately left with a brick on my hands.

This behavior is consistent and occurs every time after jackd is
started. It does not seem to occur on -rt5 and earlier.

I wish I could give you more details, but the fact is I don't know where
to look. The machine just freezes silently.

Bye now.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org



* Re: 2.6.21-rc5-rt10 troubles
  2007-04-03 23:49     ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela
@ 2007-04-04  8:49       ` Ingo Molnar
  2007-04-04  9:42         ` Ingo Molnar
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2007-04-04  8:49 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users


* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Ingo et al.
> 
> I'm afraid I have no good news (once again). After building
> 2.6.21-rc5-rt8 and, more recently, -rt10, I've found some trouble
> running on a Core2 T7200 laptop (SMP). Somehow, especially after
> starting jackd, the whole system starts crawling to death. It just
> slows down to some kind of Big Freeze, with no evidence on the console
> whatsoever, so that I'm ultimately left with a brick on my hands.
> 
> This behavior is consistent and occurs every time after jackd is
> started. It does not seem to occur on -rt5 and earlier.
> 
> I wish I could give you more details, but the fact is I don't know
> where to look. The machine just freezes silently.

could you try rt11 (which fixes two bad bugs in rt10)? If rt11 freezes 
too then could you try to unapply the attached patch? This patch is the 
main delta between rt5 and rt11. (plus upstream changes but those 
shouldn't matter for this problem)

	Ingo

----------------------->
Subject: [patch] softirq preemption: optimization
From: Ingo Molnar <mingo@elte.hu>

optimize softirq preemption by allowing a hardirq context to pick up
softirq processing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/interrupt.h |    1 
 kernel/irq/manage.c       |   24 +++----
 kernel/softirq.c          |  139 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 123 insertions(+), 41 deletions(-)

Index: linux/include/linux/interrupt.h
===================================================================
--- linux.orig/include/linux/interrupt.h
+++ linux/include/linux/interrupt.h
@@ -266,6 +266,7 @@ struct softirq_action
 asmlinkage void do_softirq(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action*), void *data);
 extern void softirq_init(void);
+extern void do_softirq_from_hardirq(void);
 
 #ifdef CONFIG_PREEMPT_HARDIRQS
 # define __raise_softirq_irqoff(nr) raise_softirq_irqoff(nr)
Index: linux/kernel/irq/manage.c
===================================================================
--- linux.orig/kernel/irq/manage.c
+++ linux/kernel/irq/manage.c
@@ -628,14 +628,17 @@ static void thread_simple_irq(irq_desc_t
 	unsigned int irq = desc - irq_desc;
 	irqreturn_t action_ret;
 
+repeat:
 	if (action && !desc->depth) {
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
 	}
+	if (desc->status & IRQ_PENDING)
+		goto repeat;
+
 	desc->status &= ~IRQ_INPROGRESS;
 }
 
@@ -692,7 +695,6 @@ static void thread_edge_irq(irq_desc_t *
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -721,7 +723,6 @@ static void thread_do_irq(irq_desc_t *de
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -757,8 +758,6 @@ static void do_hardirq(struct irq_desc *
 		wake_up(&desc->wait_for_handler);
 }
 
-extern asmlinkage void __do_softirq(void);
-
 static int do_irqd(void * __desc)
 {
 	struct sched_param param = { 0, };
@@ -781,16 +780,13 @@ static int do_irqd(void * __desc)
 
 	while (!kthread_should_stop()) {
 		local_irq_disable_nort();
-		set_current_state(TASK_INTERRUPTIBLE);
-#ifndef CONFIG_PREEMPT_RT
-		irq_enter();
-#endif
-		do_hardirq(desc);
-#ifndef CONFIG_PREEMPT_RT
-		irq_exit();
-#endif
+		do {
+			set_current_state(TASK_INTERRUPTIBLE);
+			do_hardirq(desc);
+			do_softirq_from_hardirq();
+		} while (current->state == TASK_RUNNING);
+
 		local_irq_enable_nort();
-		cond_resched();
 #ifdef CONFIG_SMP
 		/*
 		 * Did IRQ affinities change?
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -100,8 +100,26 @@ static void wakeup_softirqd(int softirq)
 	/* Interrupts are disabled: no need to stop preemption */
 	struct task_struct *tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
 
-	if (tsk && tsk->state != TASK_RUNNING)
-		wake_up_process(tsk);
+	if (unlikely(!tsk))
+		return;
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+	/*
+	 * Optimization: if we are in a hardirq thread context, and
+	 * if the priority of the softirq thread is the same as the
+	 * priority of the hardirq thread, then 'merge' softirq
+	 * processing into the hardirq context. (it will later on
+	 * execute softirqs via do_softirq_from_hardirq()).
+	 * So here we can skip the wakeup and can rely on the hardirq
+	 * context processing it later on.
+	 */
+	if ((current->flags & PF_HARDIRQ) && !hardirq_count() &&
+			(tsk->normal_prio == current->normal_prio))
+		return;
+#endif
+	/*
+	 * Wake up the softirq task:
+	 */
+	wake_up_process(tsk);
 }
 
 /*
@@ -250,50 +268,100 @@ EXPORT_SYMBOL(local_bh_enable_ip);
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_RESTART 10
+#define MAX_SOFTIRQ_RESTART 20
+
+static DEFINE_PER_CPU(u32, softirq_running);
 
-asmlinkage void ___do_softirq(void)
+static void ___do_softirq(const int same_prio_only)
 {
+	int max_restart = MAX_SOFTIRQ_RESTART, max_loops = MAX_SOFTIRQ_RESTART;
+	__u32 pending, available_mask, same_prio_skipped;
 	struct softirq_action *h;
-	__u32 pending;
-	int max_restart = MAX_SOFTIRQ_RESTART;
-	int cpu;
+	struct task_struct *tsk;
+	int cpu, softirq;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
 
 	cpu = smp_processor_id();
 restart:
+	available_mask = -1;
+	softirq = 0;
+	same_prio_skipped = 0;
 	/* Reset the pending bitmask before enabling irqs */
 	set_softirq_pending(0);
 
-	local_irq_enable();
-
 	h = softirq_vec;
 
 	do {
+		u32 softirq_mask = 1 << softirq;
+
 		if (pending & 1) {
-			{
-				u32 preempt_count = preempt_count();
-				h->action(h);
-				if (preempt_count != preempt_count()) {
-					print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
-					printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
-					preempt_count() = preempt_count;
+			u32 preempt_count = preempt_count();
+
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+			/*
+			 * If executed by a same-prio hardirq thread
+			 * then skip pending softirqs that belong
+			 * to softirq threads with different priority:
+			 */
+			if (same_prio_only) {
+				tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
+				if (tsk && tsk->normal_prio !=
+						current->normal_prio) {
+					same_prio_skipped |= softirq_mask;
+					available_mask &= ~softirq_mask;
+					goto next;
 				}
 			}
+#endif
+			/*
+			 * Is this softirq already being processed?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				available_mask &= ~softirq_mask;
+				goto next;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
+			local_irq_enable();
+
+			h->action(h);
+			if (preempt_count != preempt_count()) {
+				print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
+				printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
+				preempt_count() = preempt_count;
+			}
 			rcu_bh_qsctr_inc(cpu);
 			cond_resched_softirq_context();
+			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 		}
+next:
 		h++;
+		softirq++;
 		pending >>= 1;
 	} while (pending);
 
-	local_irq_disable();
-
+	or_softirq_pending(same_prio_skipped);
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
-		goto restart;
+	if (pending & available_mask) {
+		if (--max_restart)
+			goto restart;
+		/*
+		 * With softirq threading there's no reason not to
+		 * finish the workload we have:
+		 */
+#ifdef CONFIG_PREEMPT_SOFTIRQS
+		if (--max_loops) {
+			if (printk_ratelimit())
+				printk("INFO: softirq overload: %08x\n", pending);
+			max_restart = MAX_SOFTIRQ_RESTART;
+			goto restart;
+		}
+		if (printk_ratelimit())
+			printk("BUG: softirq loop! %08x\n", pending);
+#endif
+	}
 
 	if (pending)
 		trigger_softirqs();
@@ -321,7 +389,7 @@ asmlinkage void __do_softirq(void)
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
 
-	___do_softirq();
+	___do_softirq(0);
 
 	trace_softirq_exit();
 
@@ -350,8 +418,9 @@ void do_softirq_from_hardirq(void)
 	__local_bh_disable((unsigned long)__builtin_return_address(0));
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
+	current->flags |= PF_SOFTIRQ;
 
-	___do_softirq();
+	___do_softirq(1);
 
 	trace_softirq_exit();
 
@@ -359,6 +428,9 @@ void do_softirq_from_hardirq(void)
 	_local_bh_enable();
 
 	current->flags |= p_flags;
+	current->flags &= ~PF_SOFTIRQ;
+
+	local_irq_enable();
 }
 
 #ifndef __ARCH_HAS_DO_SOFTIRQ
@@ -669,8 +741,9 @@ static int ksoftirqd(void * __data)
 {
 	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2 };
 	struct softirqdata *data = __data;
-	u32 mask = (1 << data->nr);
+	u32 softirq_mask = (1 << data->nr);
 	struct softirq_action *h;
+	int cpu = data->cpu;
 
 	sys_sched_setscheduler(current->pid, SCHED_FIFO, &param);
 //	set_user_nice(current, -10);
@@ -684,7 +757,8 @@ static int ksoftirqd(void * __data)
 
 	while (!kthread_should_stop()) {
 		preempt_disable();
-		if (!(local_softirq_pending() & mask)) {
+		if (!(local_softirq_pending() & softirq_mask)) {
+sleep_more:
 			__preempt_enable_no_resched();
 			schedule();
 			preempt_disable();
@@ -694,16 +768,26 @@ static int ksoftirqd(void * __data)
 #endif
 		__set_current_state(TASK_RUNNING);
 
-		while (local_softirq_pending() & mask) {
+		while (local_softirq_pending() & softirq_mask) {
 			/* Preempt disable stops cpu going offline.
 			   If already offline, we'll be on wrong CPU:
 			   don't process */
-			if (cpu_is_offline(data->cpu))
+			if (cpu_is_offline(cpu))
 				goto wait_to_die;
 
 			local_irq_disable();
+			/*
+			 * Is the softirq already being executed by
+			 * a hardirq context?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				local_irq_enable();
+				set_current_state(TASK_INTERRUPTIBLE);
+				goto sleep_more;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
 			__preempt_enable_no_resched();
-			set_softirq_pending(local_softirq_pending() & ~mask);
+			set_softirq_pending(local_softirq_pending() & ~softirq_mask);
 			local_bh_disable();
 			local_irq_enable();
 
@@ -713,6 +797,7 @@ static int ksoftirqd(void * __data)
 			rcu_bh_qsctr_inc(data->cpu);
 
 			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 			_local_bh_enable();
 			local_irq_enable();
 


* Re: 2.6.21-rc5-rt10 troubles
  2007-04-04  8:49       ` Ingo Molnar
@ 2007-04-04  9:42         ` Ingo Molnar
  0 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2007-04-04  9:42 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users


* Ingo Molnar <mingo@elte.hu> wrote:

> could you try rt11 (which fixes two bad bugs in rt10)? If rt11 freezes 
> too then could you try to unapply the attached patch? This patch is 
> the main delta between rt5 and rt11. (plus upstream changes but those 
> shouldn't matter for this problem)

FYI, i've released -rt12 meanwhile - and the patch to unapply from -rt12 
is below.

	Ingo

------------------->
Subject: [patch] softirq preemption: optimization
From: Ingo Molnar <mingo@elte.hu>

optimize softirq preemption by allowing a hardirq context to pick up
softirq processing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/interrupt.h |    1 
 kernel/irq/manage.c       |   24 +++----
 kernel/softirq.c          |  150 ++++++++++++++++++++++++++++++++++++----------
 3 files changed, 131 insertions(+), 44 deletions(-)

Index: linux/include/linux/interrupt.h
===================================================================
--- linux.orig/include/linux/interrupt.h
+++ linux/include/linux/interrupt.h
@@ -266,6 +266,7 @@ struct softirq_action
 asmlinkage void do_softirq(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action*), void *data);
 extern void softirq_init(void);
+extern void do_softirq_from_hardirq(void);
 
 #ifdef CONFIG_PREEMPT_HARDIRQS
 # define __raise_softirq_irqoff(nr) raise_softirq_irqoff(nr)
Index: linux/kernel/irq/manage.c
===================================================================
--- linux.orig/kernel/irq/manage.c
+++ linux/kernel/irq/manage.c
@@ -628,14 +628,17 @@ static void thread_simple_irq(irq_desc_t
 	unsigned int irq = desc - irq_desc;
 	irqreturn_t action_ret;
 
+repeat:
 	if (action && !desc->depth) {
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
 	}
+	if (desc->status & IRQ_PENDING)
+		goto repeat;
+
 	desc->status &= ~IRQ_INPROGRESS;
 }
 
@@ -692,7 +695,6 @@ static void thread_edge_irq(irq_desc_t *
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -721,7 +723,6 @@ static void thread_do_irq(irq_desc_t *de
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -757,8 +758,6 @@ static void do_hardirq(struct irq_desc *
 		wake_up(&desc->wait_for_handler);
 }
 
-extern asmlinkage void __do_softirq(void);
-
 static int do_irqd(void * __desc)
 {
 	struct sched_param param = { 0, };
@@ -781,16 +780,13 @@ static int do_irqd(void * __desc)
 
 	while (!kthread_should_stop()) {
 		local_irq_disable_nort();
-		set_current_state(TASK_INTERRUPTIBLE);
-#ifndef CONFIG_PREEMPT_RT
-		irq_enter();
-#endif
-		do_hardirq(desc);
-#ifndef CONFIG_PREEMPT_RT
-		irq_exit();
-#endif
+		do {
+			set_current_state(TASK_INTERRUPTIBLE);
+			do_hardirq(desc);
+			do_softirq_from_hardirq();
+		} while (current->state == TASK_RUNNING);
+
 		local_irq_enable_nort();
-		cond_resched();
 #ifdef CONFIG_SMP
 		/*
 		 * Did IRQ affinities change?
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -100,8 +100,26 @@ static void wakeup_softirqd(int softirq)
 	/* Interrupts are disabled: no need to stop preemption */
 	struct task_struct *tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
 
-	if (tsk && tsk->state != TASK_RUNNING)
-		wake_up_process(tsk);
+	if (unlikely(!tsk))
+		return;
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+	/*
+	 * Optimization: if we are in a hardirq thread context, and
+	 * if the priority of the softirq thread is the same as the
+	 * priority of the hardirq thread, then 'merge' softirq
+	 * processing into the hardirq context. (it will later on
+	 * execute softirqs via do_softirq_from_hardirq()).
+	 * So here we can skip the wakeup and can rely on the hardirq
+	 * context processing it later on.
+	 */
+	if ((current->flags & PF_HARDIRQ) && !hardirq_count() &&
+			(tsk->normal_prio == current->normal_prio))
+		return;
+#endif
+	/*
+	 * Wake up the softirq task:
+	 */
+	wake_up_process(tsk);
 }
 
 /*
@@ -250,50 +268,100 @@ EXPORT_SYMBOL(local_bh_enable_ip);
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_RESTART 10
+#define MAX_SOFTIRQ_RESTART 20
+
+static DEFINE_PER_CPU(u32, softirq_running);
 
-asmlinkage void ___do_softirq(void)
+static void ___do_softirq(const int same_prio_only)
 {
+	int max_restart = MAX_SOFTIRQ_RESTART, max_loops = MAX_SOFTIRQ_RESTART;
+	__u32 pending, available_mask, same_prio_skipped;
 	struct softirq_action *h;
-	__u32 pending;
-	int max_restart = MAX_SOFTIRQ_RESTART;
-	int cpu;
+	struct task_struct *tsk;
+	int cpu, softirq;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
 
 	cpu = smp_processor_id();
 restart:
+	available_mask = -1;
+	softirq = 0;
+	same_prio_skipped = 0;
 	/* Reset the pending bitmask before enabling irqs */
 	set_softirq_pending(0);
 
-	local_irq_enable();
-
 	h = softirq_vec;
 
 	do {
+		u32 softirq_mask = 1 << softirq;
+
 		if (pending & 1) {
-			{
-				u32 preempt_count = preempt_count();
-				h->action(h);
-				if (preempt_count != preempt_count()) {
-					print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
-					printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
-					preempt_count() = preempt_count;
+			u32 preempt_count = preempt_count();
+
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+			/*
+			 * If executed by a same-prio hardirq thread
+			 * then skip pending softirqs that belong
+			 * to softirq threads with different priority:
+			 */
+			if (same_prio_only) {
+				tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
+				if (tsk && tsk->normal_prio !=
+						current->normal_prio) {
+					same_prio_skipped |= softirq_mask;
+					available_mask &= ~softirq_mask;
+					goto next;
 				}
 			}
+#endif
+			/*
+			 * Is this softirq already being processed?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				available_mask &= ~softirq_mask;
+				goto next;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
+			local_irq_enable();
+
+			h->action(h);
+			if (preempt_count != preempt_count()) {
+				print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
+				printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
+				preempt_count() = preempt_count;
+			}
 			rcu_bh_qsctr_inc(cpu);
 			cond_resched_softirq_context();
+			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 		}
+next:
 		h++;
+		softirq++;
 		pending >>= 1;
 	} while (pending);
 
-	local_irq_disable();
-
+	or_softirq_pending(same_prio_skipped);
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
-		goto restart;
+	if (pending & available_mask) {
+		if (--max_restart)
+			goto restart;
+		/*
+		 * With softirq threading there's no reason not to
+		 * finish the workload we have:
+		 */
+#ifdef CONFIG_PREEMPT_SOFTIRQS
+		if (--max_loops) {
+			if (printk_ratelimit())
+				printk("INFO: softirq overload: %08x\n", pending);
+			max_restart = MAX_SOFTIRQ_RESTART;
+			goto restart;
+		}
+		if (printk_ratelimit())
+			printk("BUG: softirq loop! %08x\n", pending);
+#endif
+	}
 
 	if (pending)
 		trigger_softirqs();
@@ -321,7 +389,7 @@ asmlinkage void __do_softirq(void)
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
 
-	___do_softirq();
+	___do_softirq(0);
 
 	trace_softirq_exit();
 
@@ -345,20 +413,29 @@ void do_softirq_from_hardirq(void)
 	if (!local_softirq_pending())
 		return;
 	/*
-	 * 'immediate' softirq execution:
+	 * 'immediate' softirq execution, from hardirq context:
 	 */
+	local_irq_disable();
 	__local_bh_disable((unsigned long)__builtin_return_address(0));
+#ifndef CONFIG_PREEMPT_SOFTIRQS
+	trace_softirq_enter();
+#endif
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
+	current->flags |= PF_SOFTIRQ;
 
-	___do_softirq();
+	___do_softirq(1);
 
+#ifndef CONFIG_PREEMPT_SOFTIRQS
 	trace_softirq_exit();
-
+#endif
 	account_system_vtime(current);
-	_local_bh_enable();
 
 	current->flags |= p_flags;
+	current->flags &= ~PF_SOFTIRQ;
+
+	_local_bh_enable();
+	local_irq_enable();
 }
 
 #ifndef __ARCH_HAS_DO_SOFTIRQ
@@ -669,8 +746,9 @@ static int ksoftirqd(void * __data)
 {
 	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2 };
 	struct softirqdata *data = __data;
-	u32 mask = (1 << data->nr);
+	u32 softirq_mask = (1 << data->nr);
 	struct softirq_action *h;
+	int cpu = data->cpu;
 
 	sys_sched_setscheduler(current->pid, SCHED_FIFO, &param);
 //	set_user_nice(current, -10);
@@ -684,7 +762,8 @@ static int ksoftirqd(void * __data)
 
 	while (!kthread_should_stop()) {
 		preempt_disable();
-		if (!(local_softirq_pending() & mask)) {
+		if (!(local_softirq_pending() & softirq_mask)) {
+sleep_more:
 			__preempt_enable_no_resched();
 			schedule();
 			preempt_disable();
@@ -694,16 +773,26 @@ static int ksoftirqd(void * __data)
 #endif
 		__set_current_state(TASK_RUNNING);
 
-		while (local_softirq_pending() & mask) {
+		while (local_softirq_pending() & softirq_mask) {
 			/* Preempt disable stops cpu going offline.
 			   If already offline, we'll be on wrong CPU:
 			   don't process */
-			if (cpu_is_offline(data->cpu))
+			if (cpu_is_offline(cpu))
 				goto wait_to_die;
 
 			local_irq_disable();
+			/*
+			 * Is the softirq already being executed by
+			 * a hardirq context?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				local_irq_enable();
+				set_current_state(TASK_INTERRUPTIBLE);
+				goto sleep_more;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
 			__preempt_enable_no_resched();
-			set_softirq_pending(local_softirq_pending() & ~mask);
+			set_softirq_pending(local_softirq_pending() & ~softirq_mask);
 			local_bh_disable();
 			local_irq_enable();
 
@@ -713,6 +802,7 @@ static int ksoftirqd(void * __data)
 			rcu_bh_qsctr_inc(data->cpu);
 
 			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 			_local_bh_enable();
 			local_irq_enable();
 


^ permalink raw reply	[flat|nested] 30+ messages in thread
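
As a side note on the patch above: the softirq_running word works as a
per-CPU claim bitmask, so that a softirq number is never processed by
the hardirq thread and by its ksoftirqd thread at the same time. The
following is a minimal user-space sketch of that claim/skip/release
pattern, not the kernel code. Every name in it (running_mask,
handled_mask, claim_softirq, release_softirq, run_pending,
record_action) is hypothetical, and it leaves out the interrupt
disabling and per-CPU storage that the real code relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's per-CPU softirq_running word. */
uint32_t running_mask;

/* Hypothetical demo state: which softirq numbers the demo handler saw. */
uint32_t handled_mask;

/* Try to claim softirq number nr; fails if it is already being run. */
bool claim_softirq(unsigned int nr)
{
	uint32_t bit = UINT32_C(1) << nr;

	if (running_mask & bit)
		return false;
	running_mask |= bit;
	return true;
}

/* Release a softirq number previously claimed with claim_softirq(). */
void release_softirq(unsigned int nr)
{
	uint32_t bit = UINT32_C(1) << nr;

	assert(running_mask & bit);	/* must have been claimed */
	running_mask &= ~bit;
}

/* Demo handler: only records that softirq number nr was processed. */
void record_action(unsigned int nr)
{
	handled_mask |= UINT32_C(1) << nr;
}

/*
 * Walk a pending bitmask from bit 0 upwards, run every pending entry
 * that can be claimed, and return the bits that were skipped because
 * they were already claimed elsewhere.
 */
uint32_t run_pending(uint32_t pending, void (*action)(unsigned int nr))
{
	uint32_t skipped = 0;
	unsigned int nr;

	for (nr = 0; pending; nr++, pending >>= 1) {
		if (!(pending & 1))
			continue;
		if (!claim_softirq(nr)) {
			skipped |= UINT32_C(1) << nr;
			continue;
		}
		action(nr);
		release_softirq(nr);
	}
	return skipped;
}
```

In this sketch the caller decides what to do with the skipped bits; it
only illustrates why a bit that is already claimed is left alone rather
than run a second time.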

* 2.6.21-rt2..8 troubles
  2007-04-01 18:39   ` Ingo Molnar
  2007-04-03 23:49     ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela
@ 2007-05-25 20:58     ` Rui Nuno Capela
  2007-05-26 16:08       ` Thomas Gleixner
  2007-05-31 15:56       ` Steven Rostedt
  1 sibling, 2 replies; 30+ messages in thread
From: Rui Nuno Capela @ 2007-05-25 20:58 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, linux-rt-users

Hi Ingo et al.

It's been quite a while since I last complained about the -rt
kernel patch series. This time I'm afraid I have a nasty one that I've
been trying to figure out and isolate, but with no definitive results.

Fact is, since 2.6.21-rt2 and still on the latest -rt8, I've been facing
troubled behavior while running on a Core2 T7200 laptop (SMP). Somehow,
sooner or later, the whole system starts crawling to death. It just slows
down to some kind of Big Freeze, with no evidence on the console
whatsoever, so that I'm ultimately left with a brick on my hands.

This behavior is consistent and occurs every time after a while. It
surely does not occur on 2.6.21-rt1 and earlier. Even stranger, it does
not occur on another, older P4@3.3Ghz desktop (HT/SMP) where a nearly
identical system image is deployed (openSUSE 10.2 i386, gcc 4.1.2, KDE
3.5.7).

I wish I could give you more details, but the fact is I don't know where
to look. The machine just freezes silently, again and again, with all
kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at
least to my knowledge. The only symptom that I can come up with is that,
from some moment on, the system cannot start any new process anymore,
or otherwise takes forever to launch any newly started process or
thread.

A sample dmesg output:
   http://www.rncbc.org/datahub/dmesg-2.6.21-rt5.0
The corresponding .config:
   http://www.rncbc.org/datahub/config-2.6.21-rt5.0

Again, there's no logged evidence of the problem, which is as nasty as
it is repeatable after each boot. Unfortunately, the moment it turns
into an unresponsive brick is not deterministically reproducible ;)
It's just a matter of time, or so I think. That's why I have no clues.

Is there anything better I can do to help figure out this
issue? As this is a modern laptop, things like a serial console are
unavailable, but perhaps it would be possible to track things over
netconsole?

I just need some bright and nice directions now ;) Hope someone finds
this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :)

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-05-25 20:58     ` 2.6.21-rt2..8 troubles Rui Nuno Capela
@ 2007-05-26 16:08       ` Thomas Gleixner
  2007-05-26 21:21         ` Rui Nuno Capela
  2007-05-31 15:56       ` Steven Rostedt
  1 sibling, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-05-26 16:08 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote:
> Is there anything better I can do to help figure out this
> issue? As this is a modern laptop, things like a serial console are
> unavailable, but perhaps it would be possible to track things over
> netconsole?
> 
> I just need some bright and nice directions now ;) Hope someone finds
> this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :)

Can you boot with "hpet=disable" on the command line ?

If that does not help, please provide the output of /proc/timer_list.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-05-26 16:08       ` Thomas Gleixner
@ 2007-05-26 21:21         ` Rui Nuno Capela
  2007-06-06  0:44           ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-05-26 21:21 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

Thomas Gleixner wrote:
> On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote:
>> Is there anything better I can do to help figure out this
>> issue? As this is a modern laptop, things like a serial console are
>> unavailable, but perhaps it would be possible to track things over
>> netconsole?
>>
>> I just need some bright and nice directions now ;) Hope someone finds
>> this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :)
> 
> Can you boot with "hpet=disable" on the command line ?
> 

Nope. It doesn't seem to have any significant effect. Same time-bomb
behavior: after an indeterminate period of uptime, the system stops
responding and cannot spawn new processes (currently running ones still
live on, strange).

> If that does not help, please provide the output of /proc/timer_list.
> 

This is with my latest iteration:
  http://www.rncbc.org/datahub/config-2.6.21.1-rt8.0

Normal boot on which it behaves as badly as reported:
  http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0

# cat /proc/timer_list
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 131736771907 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1180213690448299114 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <ed7c4ef4>, tick_sched_timer, S:01
 # expires at 131737000000 nsecs [in 228093 nsecs]
 #1: <ed7c4ef4>, it_real_fn, S:01
 # expires at 131751277843 nsecs [in 14505936 nsecs]
 #2: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 131802703679 nsecs [in 65931772 nsecs]
 #3: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 131802705006 nsecs [in 65933099 nsecs]
 #4: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 132412838830 nsecs [in 676066923 nsecs]
 #5: <ed7c4ef4>, it_real_fn, S:01
 # expires at 137026607454 nsecs [in 5289835547 nsecs]
 #6: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 141381493725 nsecs [in 9644721818 nsecs]
 #7: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 170796028701 nsecs [in 39059256794 nsecs]
  .expires_next   : 131737000000 nsecs
  .hres_active    : 1
  .nr_events      : 40634
  .nohz_mode      : 2
  .idle_tick      : 131724000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294799020
  .idle_calls     : 178848
  .idle_sleeps    : 133212
  .idle_entrytime : 131736069830 nsecs
  .idle_sleeptime : 100895567465 nsecs
  .last_jiffies   : 4294799033
  .next_jiffies   : 4294799039
  .idle_expires   : 131736000000 nsecs
jiffies: 4294799033

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1180213690448299114 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 131737067173 nsecs [in 295266 nsecs]
 #1: <ed7c4ef4>, tick_sched_timer, S:01
 # expires at 131737250000 nsecs [in 478093 nsecs]
 #2: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 139151071745 nsecs [in 7414299838 nsecs]
 #3: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 139151133755 nsecs [in 7414361848 nsecs]
 #4: <ed7c4ef4>, hrtimer_wakeup, S:01
 # expires at 139151154005 nsecs [in 7414382098 nsecs]
  .expires_next   : 131737067173 nsecs
  .hres_active    : 1
  .nr_events      : 31510
  .nohz_mode      : 2
  .idle_tick      : 131734250000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294799030
  .idle_calls     : 151213
  .idle_sleeps    : 107018
  .idle_entrytime : 131735193036 nsecs
  .idle_sleeptime : 108256832194 nsecs
  .last_jiffies   : 4294799032
  .next_jiffies   : 4294799040
  .idle_expires   : 131743000000 nsecs
jiffies: 4294799033


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     131737000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   806914928
 min_delta_ns:   1442
 mult:           44650051
 shift:          32
 mode:           1
 next_event:     131737000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   806914928
 min_delta_ns:   1442
 mult:           44650051
 shift:          32
 mode:           3
 next_event:     131737067173 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt
--


Alternate boot with hpet=disable as suggested, but no better results:
  http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0-hpet_disabled

# cat /proc/timer_list
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 269529706096 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1180214106093436428 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <ed2a2ef4>, tick_sched_timer, S:01
 # expires at 269530000000 nsecs [in 293904 nsecs]
 #1: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 269554568320 nsecs [in 24862224 nsecs]
 #2: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 269585566924 nsecs [in 55860828 nsecs]
 #3: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 269822782823 nsecs [in 293076727 nsecs]
 #4: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 272726158017 nsecs [in 3196451921 nsecs]
 #5: <ed2a2ef4>, it_real_fn, S:01
 # expires at 278007767018 nsecs [in 8478060922 nsecs]
 #6: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 283716431029 nsecs [in 14186724933 nsecs]
 #7: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 283716456168 nsecs [in 14186750072 nsecs]
 #8: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 295789281627 nsecs [in 26259575531 nsecs]
  .expires_next   : 269530000000 nsecs
  .hres_active    : 1
  .nr_events      : 63228
  .nohz_mode      : 2
  .idle_tick      : 269527000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294936823
  .idle_calls     : 217590
  .idle_sleeps    : 168323
  .idle_entrytime : 269528785728 nsecs
  .idle_sleeptime : 230915526366 nsecs
  .last_jiffies   : 4294936825
  .next_jiffies   : 4294936840
  .idle_expires   : 269543000000 nsecs
jiffies: 4294936826

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1180214106093436428 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <ed2a2ef4>, tick_sched_timer, S:01
 # expires at 269530250000 nsecs [in 543904 nsecs]
 #1: <ed2a2ef4>, it_real_fn, S:01
 # expires at 269546379364 nsecs [in 16673268 nsecs]
 #2: <ed2a2ef4>, hrtimer_wakeup, S:01
 # expires at 283723356553 nsecs [in 14193650457 nsecs]
  .expires_next   : 269530250000 nsecs
  .hres_active    : 1
  .nr_events      : 64947
  .nohz_mode      : 2
  .idle_tick      : 269527250000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294936824
  .idle_calls     : 172684
  .idle_sleeps    : 111081
  .idle_entrytime : 269529298565 nsecs
  .idle_sleeptime : 234502295072 nsecs
  .last_jiffies   : 4294936826
  .next_jiffies   : 4294936833
  .idle_expires   : 269536000000 nsecs
jiffies: 4294936826


Tick Device: mode:     1
Clock Event Device: pit
 max_delta_ns:   27461866
 min_delta_ns:   12571
 mult:           5124677
 shift:          32
 mode:           3
 next_event:     269530250000 nsecs
 set_next_event: pit_next_event
 set_mode:       init_pit_timer
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000002


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   807031401
 min_delta_ns:   1443
 mult:           44643607
 shift:          32
 mode:           3
 next_event:     269530000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   807031401
 min_delta_ns:   1443
 mult:           44643607
 shift:          32
 mode:           1
 next_event:     269530250000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt
--

Thanks for the hints.

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread
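
For readers decoding the /proc/timer_list output above: the mult and
shift fields of each clock event device are the scaled-math factors
used to convert a nanosecond delta into device ticks, as
ticks = (ns * mult) >> shift. The sketch below uses two hypothetical
helper names, ns_to_ticks and implied_hz, to recover the approximate
device frequency from the values in the dump. It is an illustration of
the arithmetic only, not kernel code.

```c
#include <stdint.h>

/* Convert a nanosecond delta into device ticks: (ns * mult) >> shift. */
uint64_t ns_to_ticks(uint64_t ns, uint32_t mult, uint32_t shift)
{
	return (ns * mult) >> shift;
}

/* Device frequency in Hz implied by mult/shift: ticks in 10^9 ns. */
uint64_t implied_hz(uint32_t mult, uint32_t shift)
{
	return ns_to_ticks(UINT64_C(1000000000), mult, shift);
}
```

With the values from the dumps, implied_hz(44650051, 32) comes out at
roughly 10.4 MHz for the lapic, implied_hz(61496110, 32) at roughly
14.3 MHz for the hpet, and implied_hz(5124677, 32) at roughly 1.19 MHz
for the pit, which is in line with the nominal HPET and PIT rates.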

* Re: 2.6.21-rt2..8 troubles
  2007-05-25 20:58     ` 2.6.21-rt2..8 troubles Rui Nuno Capela
  2007-05-26 16:08       ` Thomas Gleixner
@ 2007-05-31 15:56       ` Steven Rostedt
  2007-05-31 16:26         ` Luis Claudio R. Goncalves
  1 sibling, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2007-05-31 15:56 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users


On Fri, May 25, 2007 at 09:58:12PM +0100, Rui Nuno Capela wrote:
>
> I wish I could give you more details, but the fact is I don't know where
> to look. The machine just freezes silently, again and again, with all
> kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at
> least to my knowledge. The only symptom that I can come up with is that,
> from some moment on, the system cannot start any new process anymore,
> or otherwise takes forever to launch any newly started process or
> thread.
>

I have a box that looks like it's doing the same thing. Unfortunately
for now it's being used to test other things.

But I did do a show-task and see a bunch of D processes. I'll post that
output when I get that box free again.

-- Steve


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-05-31 15:56       ` Steven Rostedt
@ 2007-05-31 16:26         ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 30+ messages in thread
From: Luis Claudio R. Goncalves @ 2007-05-31 16:26 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-rt-users

On Thu, May 31, 2007 at 11:56:11AM -0400, Steven Rostedt wrote:
| 
| On Fri, May 25, 2007 at 09:58:12PM +0100, Rui Nuno Capela wrote:
| >
| > I wish I could give you more details, but the fact is I don't know where
| > to look. The machine just freezes silently, again and again, with all
| > kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at
| > least to my knowledge. The only symptom that I can come up with is that,
| > from some moment on, the system cannot start any new process anymore,
| > or otherwise takes forever to launch any newly started process or
| > thread.

I have also experienced some of these freezes and system hangs, some of them
reproducible, as when I start java+azureus or when xscreensaver kicks in. But
after I unset CONFIG_DEBUG_RT_MUTEXES they are all gone. Not that
DEBUG_RT_MUTEXES is the culprit, but judging by the number of stack
traces it spits out every few seconds, this workload may aggravate some
inner problem in the system.

I am currently running rt8 on an FC6 box. I plan to investigate this
matter further.

Luis
 
| I have a box that looks like it's doing the same thing. Unfortunately
| for now it's being used to test other things.
| 
| But I did do a show-task and see a bunch of D processes. I'll post that
| output when I get that box free again.
| 
| -- Steve
| 
---end quoted text---

-- 
[ Luis Claudio R. Goncalves                   lclaudio at uudg dot org ]
[ Fingerprint:   4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8    ]
[ Linux-HA Developer - LateNite Programmer - Gospel User - Bass Player ]
[ Fault Tolerance - Real-Time - Distributed Systems - IECLB - Is 40:31 ]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-05-26 21:21         ` Rui Nuno Capela
@ 2007-06-06  0:44           ` Rui Nuno Capela
  2007-06-08 15:47             ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-06  0:44 UTC (permalink / raw)
  To: Rui Nuno Capela
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-rt-users

Rui Nuno Capela wrote:
> Thomas Gleixner wrote:
>> On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote:
>>> Is there anything better I can do to help figure out this
>>> issue? As this is a modern laptop, things like a serial console are
>>> unavailable, but perhaps it would be possible to track things over
>>> netconsole?
>>>
>>> I just need some bright and nice directions now ;) Hope someone finds
>>> this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :)
>> Can you boot with "hpet=disable" on the command line ?
>>
> 
> Nope. It doesn't seem to have any significant effect. Same time-bomb
> behavior: after an indeterminate period of uptime, the system stops
> responding and cannot spawn new processes (currently running ones still
> live on, strange).
> 
>> If that does not help, please provide the output of /proc/timer_list.
>>
> 
> This is with my latest iteration:
>   http://www.rncbc.org/datahub/config-2.6.21.1-rt8.0
> 
> Normal boot on which it behaves as badly as reported:
>   http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0
> 
> [...]
> 
> 
> Alternate boot with hpet=disable as suggested, but no better results:
>   http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0-hpet_disabled
> 
> [...]
> 

Just for the heads-up, I'm still suffering from this same illness, and
it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9.

There's no way around. On one box it works flawlessly (desktop,
P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks
silently.

Shrugs:)
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-06  0:44           ` Rui Nuno Capela
@ 2007-06-08 15:47             ` Thomas Gleixner
  2007-06-08 18:21               ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-08 15:47 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Wed, 2007-06-06 at 01:44 +0100, Rui Nuno Capela wrote:
> Just for the heads-up, I'm still suffering from this same illness, and
> it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9.
> 
> There's no way around. On one box it works flawlessly (desktop,
> P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks
> silently.

Sorry for responding late. To have some idea where the breakage comes
from, can you please try

http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch

whether it has the same behaviour.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-08 15:47             ` Thomas Gleixner
@ 2007-06-08 18:21               ` Rui Nuno Capela
  2007-06-08 18:50                 ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-08 18:21 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

Hi Thomas,

On Fri, June 8, 2007 16:47, Thomas Gleixner wrote:
> On Wed, 2007-06-06 at 01:44 +0100, Rui Nuno Capela wrote:
>
>> Just for the heads-up, I'm still suffering from this same illness, and
>> it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9.
>>
>> There's no way around. On one box it works flawlessly (desktop,
>> P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks
>> silently.
>
> Sorry for responding late. To have some idea where the breakage comes
> from, can you please try
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch
>
> whether it has the same behaviour.
>

Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5.
All's working apparently nicely on this offending machine (laptop, intel
core2 T7200). In fact, I'm writing this very reply under it and through
ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;)

Nevertheless, this is not preempt-realtime (-rt), is it? And I never
complained about vanilla.

Is this good news though?
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-08 18:21               ` Rui Nuno Capela
@ 2007-06-08 18:50                 ` Thomas Gleixner
  2007-06-11 19:36                   ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-08 18:50 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Fri, 2007-06-08 at 19:21 +0100, Rui Nuno Capela wrote:
> >> There's no way around. On one box it works flawlessly (desktop,
> >> P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks
> >> silently.
> >
> > Sorry for responding late. To have some idea where the breakage comes
> > from, can you please try
> >
> > http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch
> >
> > whether it has the same behaviour.
> >
> 
> Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5.
> All's working apparently nicely on this offending machine (laptop, intel
> core2 T7200). In fact, I'm writing this very reply under it and through
> ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;)
> 
> Nevertheless, this is not preempt-realtime (-rt), is it? And I never
> complained about vanilla.
> 
> Is this good news though?

Well, the patch carries the same high resolution timer fixes as -rt, so
I just wanted to exclude those. Thanks for testing.

I'm spinning -rt10 with a couple of fixes. Should be out sometime
tomorrow. If the problem persists, we need to dig deeper.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-08 18:50                 ` Thomas Gleixner
@ 2007-06-11 19:36                   ` Rui Nuno Capela
  2007-06-11 19:45                     ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-11 19:36 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

Thomas Gleixner wrote:
> On Fri, 2007-06-08 at 19:21 +0100, Rui Nuno Capela wrote:
>> Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5.
>> All's working apparently nicely on this offending machine (laptop, intel
>> core2 T7200). In fact, I'm writing this very reply under it and through
>> ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;)
>>
>> Nevertheless, this is not preempt-realtime (-rt), is it? And I never
>> complained about vanilla.
>>
>> Is this good news though?
> 
> Well, the patch carries the same high resolution timer fixes as -rt, so
> I just wanted to exclude those. Thanks for testing.
> 
> I'm spinning -rt10 with a couple of fixes. Should be out sometime
> tomorrow. If the problem persists, we need to dig deeper.
> 

Uhoh. I'm sorry to tell, but the problem is still creeping on
2.6.21.4-rt11 and -rt12 :(

So sorry.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 19:36                   ` Rui Nuno Capela
@ 2007-06-11 19:45                     ` Thomas Gleixner
  2007-06-11 19:55                       ` Daniel Walker
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-11 19:45 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote:
> > I'm spinning -rt10 with a couple of fixes. Should be out sometime
> > tomorrow. If the problem persists, we need to dig deeper.
> > 
> 
> Uhoh. I'm sorry to tell, but the problem is still creeping on
> 2.6.21.4-rt11 and -rt12 :(
> 
> So sorry.

Hmm. Does it happen, when you boot with maxcpus=1 on the kernel
commandline ?

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 19:45                     ` Thomas Gleixner
@ 2007-06-11 19:55                       ` Daniel Walker
  2007-06-11 20:50                         ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Walker @ 2007-06-11 19:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 21:45 +0200, Thomas Gleixner wrote:
> On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote:
> > > I'm spinning -rt10 with a couple of fixes. Should be out sometime
> > > tomorrow. If the problem persists, we need to dig deeper.
> > > 
> > 
> > Uhoh. I'm sorry to tell, but the problem is still creeping on
> > 2.6.21.4-rt11 and -rt12 :(
> > 
> > So sorry.
> 
> Hmm. Does it happen, when you boot with maxcpus=1 on the kernel
> commandline ?

I think 2.6.21-rt2 had some apic updates also, (along with hpet updates)
so testing with "noapic" on the command line might be helpful too .. 
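
To double-check that a test boot really picked up these options, the
running kernel's command line can be read back from /proc/cmdline. A
minimal sketch in Python, assuming a Linux system with /proc mounted; the
helper names parse_cmdline and read_cmdline are made up for illustration
and are not part of any kernel tooling, and the plain whitespace split
ignores quoted values that contain spaces:

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags such as "noapic" map to None. This is a simplification:
    quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    params = parse_cmdline(read_cmdline())
    print("maxcpus:", params.get("maxcpus", "not set"))
    print("noapic:", "set" if "noapic" in params else "not set")
```

Running it after each test boot gives a quick confirmation that
"maxcpus=1" or "noapic" was in effect before drawing conclusions from
the result.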

Daniel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 19:55                       ` Daniel Walker
@ 2007-06-11 20:50                         ` Rui Nuno Capela
  2007-06-11 21:14                           ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-11 20:50 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-rt-users

Daniel Walker wrote:
> On Mon, 2007-06-11 at 21:45 +0200, Thomas Gleixner wrote:
>> On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote:
>>>> I'm spinning -rt10 with a couple of fixes. Should be out sometime
>>>> tomorrow. If the problem persists, we need to dig deeper.
>>>>
>>> Uhoh. I'm sorry to tell, but the problem is still creeping on
>>> 2.6.21.4-rt11 and -rt12 :(
>>>
>>> So sorry.
>> Hmm. Does it happen, when you boot with maxcpus=1 on the kernel
>> commandline ?
> 
> I think 2.6.21-rt2 had some apic updates also, (along with hpet updates)
> so testing with "noapic" on the command line might be helpful too .. 
>

Thomas,

Yes, "maxcpus=1" seems to keep it running, but then I render my Core2
just half-baked ;)

Daniel,

No, "noapic" does not seem to help any better.

HTH
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 20:50                         ` Rui Nuno Capela
@ 2007-06-11 21:14                           ` Thomas Gleixner
  2007-06-11 21:25                             ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-11 21:14 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 21:50 +0100, Rui Nuno Capela wrote:
> Thomas,
> 
> Yes, "maxcpus=1" seems to keep it running, but then I render my Core2
> just half-baked ;)

Yes, I know :(

/me goes into desperate mode

Is this a DELL laptop ?

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 21:14                           ` Thomas Gleixner
@ 2007-06-11 21:25                             ` Rui Nuno Capela
  2007-06-11 21:42                               ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-11 21:25 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users

Thomas Gleixner wrote:
> On Mon, 2007-06-11 at 21:50 +0100, Rui Nuno Capela wrote:
>> Thomas,
>>
>> Yes, "maxcpus=1" seems to keep it running, but then I render my Core2
>> just half-baked ;)
> 
> Yes, I know :(
> 
> /me goes into desperate mode
> 
> Is this a DELL laptop ?
> 

Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz.

Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already
know :)

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 21:25                             ` Rui Nuno Capela
@ 2007-06-11 21:42                               ` Thomas Gleixner
  2007-06-11 22:34                                 ` Daniel Walker
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-11 21:42 UTC (permalink / raw)
  To: Rui Nuno Capela; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote:
> Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz.

Yeah, there are Dell ones which have similar or worse symptoms.

> Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already
> know :)

Ok. I go back and figure out which differences we have between
2.6.21-rt>8 and the -hrt queue.

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 21:42                               ` Thomas Gleixner
@ 2007-06-11 22:34                                 ` Daniel Walker
  2007-06-11 23:08                                   ` Thomas Gleixner
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Walker @ 2007-06-11 22:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote:
> On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote:
> > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz.
> 
> Yeah, there are Dell ones which have similar or worse symptoms.
> 
> > Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already
> > know :)
> 
> Ok. I go back and figure out which differences we have between
> 2.6.21-rt>8 and the -hrt queue.

Are you sure it's strictly an HRT issue? I didn't see a
!CONFIG_HIGH_RES_TIMERS test ..

Daniel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 22:34                                 ` Daniel Walker
@ 2007-06-11 23:08                                   ` Thomas Gleixner
  2007-06-12 10:10                                     ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Gleixner @ 2007-06-11 23:08 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users

On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote:
> On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote:
> > On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote:
> > > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz.
> > 
> > Yeah, there are Dell ones which have similar or worse symptoms.
> > 
> > > Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already
> > > know :)
> > 
> > Ok. I go back and figure out which differences we have between
> > 2.6.21-rt>8 and the -hrt queue.
> 
> Are you sure it's strictly an HRT issue? I didn't see a
> !CONFIG_HIGH_RES_TIMERS test ..

The main difference between -rt1 and -rt2 was the update of -hrt, which
not only affects CONFIG_HIGH_RES_TIMERS. There are enough
CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as
well.

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-11 23:08                                   ` Thomas Gleixner
@ 2007-06-12 10:10                                     ` Rui Nuno Capela
  2007-07-06 14:16                                       ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2007-06-12 10:10 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users


On Tue, June 12, 2007 00:08, Thomas Gleixner wrote:
> On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote:
>
>> On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote:
>>
>>> On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote:
>>>
>>>> Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo
>>>> T7200@2.0Ghz.
>>>>
>>>
>>> Yeah, there are Dell ones which have similar or worse symptoms.
>>>
>>>
>>>> Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you
>>>> already know :)
>>>
>>> Ok. I go back and figure out which differences we have between
>>> 2.6.21-rt>8 and the -hrt queue.
>>>
>>
>> Are you sure it's strictly an HRT issue? I didn't see a
>> !CONFIG_HIGH_RES_TIMERS test ..
>>
>
> The main difference between -rt1 and -rt2 was the update of -hrt, which
> not only affects CONFIG_HIGH_RES_TIMERS. There are enough
> CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as
> well.
>

Indeed, FWIW and IIRC, I can confirm that the show-stopper problem was
still present when I tried with CONFIG_HIGH_RES_TIMERS not set (=N).

Bye now.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.21-rt2..8 troubles
  2007-06-12 10:10                                     ` Rui Nuno Capela
@ 2007-07-06 14:16                                       ` Rui Nuno Capela
  0 siblings, 0 replies; 30+ messages in thread
From: Rui Nuno Capela @ 2007-07-06 14:16 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users

Hi,

I'm back but with good news this time :)...

On Tue, June 12, 2007 11:10, Rui Nuno Capela wrote:
> On Tue, June 12, 2007 00:08, Thomas Gleixner wrote:
>> On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote:
>>> On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote:
>>>> On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote:
>>>>
>>>>> Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo
>>>>> T7200@2.0Ghz.
>>>>>
>>>> Yeah, there are Dell ones which have similar or worse symptoms.
>>>>
>>>>> Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you
>>>>> already know :)
>>>>
>>>> Ok. I go back and figure out which differences we have between
>>>> 2.6.21-rt>8 and the -hrt queue.
>>>
>>> Are you sure it's strictly an HRT issue? I didn't see a
>>> !CONFIG_HIGH_RES_TIMERS test ..
>>
>> The main difference between -rt1 and -rt2 was the update of -hrt, which
>>  not only affects CONFIG_HIGH_RES_TIMERS. There are enough
>> CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as
>>  well.
>
> Indeed, FWIW and IIRC, I can confirm that the show-stopper problem was
> still present when I tried with CONFIG_HIGH_RES_TIMERS not set (=N).
>

Although I'm still with my fingers crossed, I can tell that 2.6.21.5-rt19
(and -rt20) does behave far better now on the very same box.

I've more than 8 hours up and running now, without a single glimpse of the
bad symptoms, which used to show in a matter of minutes if not earlier
during init time.

Congratulations, -rt is usable again here and that just makes me happier :)

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2007-07-06 14:17 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
2007-01-30 11:26 2.6.20-rc6-rt4 register_cpu_notification undefined Rui Nuno Capela
2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
2007-02-16  0:46   ` 2.6.20-rt5 Oops on boot [-rt8 OK] Rui Nuno Capela
2007-02-16  8:25     ` Ingo Molnar
2007-02-19 12:38       ` Sergio Monteiro Basto
2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
2007-04-01 18:39   ` Ingo Molnar
2007-04-03 23:49     ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela
2007-04-04  8:49       ` Ingo Molnar
2007-04-04  9:42         ` Ingo Molnar
2007-05-25 20:58     ` 2.6.21-rt2..8 troubles Rui Nuno Capela
2007-05-26 16:08       ` Thomas Gleixner
2007-05-26 21:21         ` Rui Nuno Capela
2007-06-06  0:44           ` Rui Nuno Capela
2007-06-08 15:47             ` Thomas Gleixner
2007-06-08 18:21               ` Rui Nuno Capela
2007-06-08 18:50                 ` Thomas Gleixner
2007-06-11 19:36                   ` Rui Nuno Capela
2007-06-11 19:45                     ` Thomas Gleixner
2007-06-11 19:55                       ` Daniel Walker
2007-06-11 20:50                         ` Rui Nuno Capela
2007-06-11 21:14                           ` Thomas Gleixner
2007-06-11 21:25                             ` Rui Nuno Capela
2007-06-11 21:42                               ` Thomas Gleixner
2007-06-11 22:34                                 ` Daniel Walker
2007-06-11 23:08                                   ` Thomas Gleixner
2007-06-12 10:10                                     ` Rui Nuno Capela
2007-07-06 14:16                                       ` Rui Nuno Capela
2007-05-31 15:56       ` Steven Rostedt
2007-05-31 16:26         ` Luis Claudio R. Goncalves
