* 2.6.20-rc6-rt4 register_cpu_notification undefined
@ 2007-01-30 11:26 Rui Nuno Capela
  2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
  2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
  0 siblings, 2 replies; 30+ messages in thread
From: Rui Nuno Capela @ 2007-01-30 11:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-rt-users, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 229 bytes --]

Hi,

Just to let you know that this simple patch of mine solves the
'register_cpu_notifier' being undefined when SMP is set but
HOTPLUG_CPU is not.

Dunno if it's the right thing, tho.

Cheers
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org

[-- Attachment #2: linux-2.6.20-rc6-rt4-cpu.patch --]
[-- Type: application/octet-stream, Size: 357 bytes --]

--- linux/kernel/cpu.c.orig	2007-01-25 02:19:28.000000000 +0000
+++ linux/kernel/cpu.c	2007-01-30 09:43:22.000000000 +0000
@@ -75,10 +75,10 @@
 	return ret;
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
-
 EXPORT_SYMBOL(register_cpu_notifier);
 
+#ifdef CONFIG_HOTPLUG_CPU
+
 void unregister_cpu_notifier(struct notifier_block *nb)
 {
 	mutex_lock(&cpu_add_remove_lock);

^ permalink raw reply [flat|nested] 30+ messages in thread
* 2.6.20-rt5 Oops on boot
From: Rui Nuno Capela @ 2007-02-09 18:56 UTC
To: Ingo Molnar; +Cc: Rui Nuno Capela, linux-rt-users, linux-kernel

Hi,

I have terrible news: 2.6.20-rt5 does not boot at all on a couple
machines I was brave enough to try -- a P4@3.3Ghz SMP/HT desktop, and a
Core2 Duo T7200@2.0Ghz laptop. For the first case I could capture the
following dump via serial console:

--BOF--
Linux version 2.6.20-rt5.1 (root@gamma-suse1) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP PREEMPT Fri Feb 9 18:30:22 WET 2007
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000e8000 size: 0000000000018000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 000000003fe30000 end: 000000003ff30000 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000003ff30000 size: 0000000000010000 end: 000000003ff40000 type: 3
copy_e820_map() start: 000000003ff40000 size: 00000000000b0000 end: 000000003fff0000 type: 4
copy_e820_map() start: 000000003fff0000 size: 0000000000010000 end: 0000000040000000 type: 2
copy_e820_map() start: 00000000ffb80000 size: 0000000000480000 end: 0000000100000000 type: 2
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003ff30000 (usable)
 BIOS-e820: 000000003ff30000 - 000000003ff40000 (ACPI data)
 BIOS-e820: 000000003ff40000 - 000000003fff0000 (ACPI NVS)
 BIOS-e820: 000000003fff0000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
Entering add_active_range(0, 0, 261936) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal 4096 -> 229376
  HighMem 229376 -> 261936
early_node_map[1] active PFN ranges
    0: 0 -> 261936
On node 0 totalpages: 261936
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 254 pages used for memmap
  HighMem zone: 32306 pages, LIFO batch:7
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP (v002 ACPIAM ) @ 0x000f9e60
ACPI: XSDT (v001 A M I OEMXSDT 0x08000320 MSFT 0x00000097) @ 0x3ff30100
ACPI: FADT (v003 A M I OEMFACP 0x08000320 MSFT 0x00000097) @ 0x3ff30290
ACPI: MADT (v001 A M I OEMAPIC 0x08000320 MSFT 0x00000097) @ 0x3ff30390
ACPI: OEMB (v001 A M I OEMBIOS 0x08000320 MSFT 0x00000097) @ 0x3ff40040
ACPI: DSDT (v001 P4P81 P4P81086 0x00000086 INTL 0x02002026) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bfb80000)
Detected 3361.210 MHz processor.
Real-Time Preemption Support (C) 2004-2007 Ingo Molnar
Built 1 zonelists. Total pages: 259890
Kernel command line: root=/dev/hda1 vga=0x31a resume=/dev/hda3 splash=silent console=tty0 console=ttyS0,115200n8 debug
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
WARNING: experimental RCU implementation.
CPU 0 irqstacks, hard=c03ed000 soft=c03eb000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1030292k/1047744k available (1725k kernel code, 16684k reserved, 1005k data, 220k init, 130240k highmem)
virtual kernel memory layout:
    fixmap  : 0xfff9c000 - 0xfffff000 ( 396 kB)
    pkmap   : 0xff800000 - 0xffc00000 (4096 kB)
    vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
    lowmem  : 0xc0000000 - 0xf8000000 ( 896 MB)
    .init   : 0xc03af000 - 0xc03e6000 ( 220 kB)
    .data   : 0xc02af5dd - 0xc03aacf4 (1005 kB)
    .text   : 0xc0100000 - 0xc02af5dd (1725 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6723.97 BogoMIPS (lpj=3361989)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 00003080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 9k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c03ee000 soft=c03ec000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6720.94 BogoMIPS (lpj=3360473)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 00003080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
Total of 2 processors activated (13444.92 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000001c
 printing eip:
c011783d
*pde = 00000000
stopped custom tracer.
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c011783d>] Not tainted VLI
EFLAGS: 00010286 (2.6.20-rt5.1 #1)
EIP is at try_to_wake_up+0x11/0x395
eax: 00000000 ebx: 00000000 ecx: 00000000 edx: dfc87ccc
esi: 00000000 edi: c18a4700 ebp: dfc87cdc esp: dfc87c90
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process swapper (pid: 1, ti=dfc87000 task=dfca1670 task.ti=dfc87000)
Stack: 00000046 00000000 0000001f 00000008 c18a3d80 dfca1670 dfc87cc0 dfc87cb8
       00000046 dfca1670 00000001 dfc87cc4 c0136f34 dfc87cd0 c012dbc8 dfc87cf4
       00000000 c18a3d80 c18a4700 dfc87ce8 c0117c64 00000000 dfc87d3c c01180e6
Call Trace:
 [<c0117c64>] wake_up_process+0x19/0x1b
 [<c01180e6>] set_cpus_allowed+0x6c/0x92
 [<c0118140>] measure_one+0x34/0x165
 [<c0118953>] build_sched_domains+0x6e2/0xce4
 [<c0118f70>] arch_init_sched_domains+0x1b/0x1d
 [<c03c15dd>] sched_init_smp+0x10/0x47
 [<c010047d>] init+0xd0/0x335
 [<c0104b87>] kernel_thread_helper+0x7/0x10
 =======================
Code: 5d f0 89 4f 44 89 5f 48 8b 55 e8 89 f8 e8 99 e1 ff ff 83 c4 0c 5b 5e 5f 5d c3 55 89 e5 57 56 89 c6 53 83 ec 40 89 55 bc 8d 55 f0 <83> 78 1c 63 b8 00 00 00 00 0f 4f c1 89 45 b8 89 f0 e8 ca e1 ff
EIP: [<c011783d>] try_to_wake_up+0x11/0x395 SS:ESP 0068:dfc87c90
<0>Kernel panic - not syncing: Attempted to kill init!
 [<c0104f7e>] dump_trace+0x63/0x1e5
 [<c010511a>] show_trace_log_lvl+0x1a/0x2f
 [<c010572a>] show_trace+0x12/0x14
 [<c01057bd>] dump_stack+0x16/0x18
 [<c011c791>] panic+0x50/0xf3
 [<c011f25b>] do_exit+0x9b/0x771
 [<c01056c3>] die+0x211/0x237
 [<c02add07>] do_page_fault+0x3f3/0x4bf
 [<c02ac39c>] error_code+0x7c/0x84
 [<c011783d>] try_to_wake_up+0x11/0x395
 [<c0117c64>] wake_up_process+0x19/0x1b
 [<c01180e6>] set_cpus_allowed+0x6c/0x92
 [<c0118140>] measure_one+0x34/0x165
 [<c0118953>] build_sched_domains+0x6e2/0xce4
 [<c0118f70>] arch_init_sched_domains+0x1b/0x1d
 [<c03c15dd>] sched_init_smp+0x10/0x47
 [<c010047d>] init+0xd0/0x335
 [<c0104b87>] kernel_thread_helper+0x7/0x10
 =======================
--EOF--

Hope it helps.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org
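
A note on reading the Oops above: the faulting address (0000001c) together with eax: 00000000 is the usual signature of reading a structure member through a NULL pointer, because the CPU then touches an address equal to the member's offset. The marked opcode bytes `<83> 78 1c 63` decode on i386 to `cmpl $0x63,0x1c(%eax)`, which fits that reading. A minimal sketch of the arithmetic, assuming a made-up structure: `struct fake_task`, its fields, and `member_address()` are hypothetical and only arranged so that one member lands at offset 0x1c. This is not the kernel's `task_struct` layout, which the thread does not show.

```c
#include <assert.h>   /* only needed for the checks mentioned in the note below */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, NOT the real task_struct):
 * seven 4-byte fields precede 'prio', which puts it at offset 0x1c.
 */
struct fake_task {
	uint32_t pad[7];
	int32_t  prio;
};

/*
 * Address the CPU would touch when reading base->prio.
 * With base == 0 (a NULL pointer) this is just the member offset.
 */
uintptr_t member_address(uintptr_t base)
{
	return base + offsetof(struct fake_task, prio);
}
```

Calling `member_address(0)` yields 0x1c, the same shape as the "NULL pointer dereference at virtual address 0000001c" line: a zero base register plus a small constant displacement.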
* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
From: Rui Nuno Capela @ 2007-02-16 0:46 UTC
To: Ingo Molnar; +Cc: linux-rt-users, linux-kernel

Rui Nuno Capela (me) wrote:
>
> I have terrible news: 2.6.20-rt5 does not boot at all on a couple
> machines I was brave enough to try -- a P4@3.3Ghz SMP/HT desktop, and a
> Core2 Duo T7200@2.0Ghz laptop. For the first case I could capture the
> following dump via serial console:
> ...

The news is that 2.6.20-rt8 got it all back in business :)

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org
* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
From: Ingo Molnar @ 2007-02-16 8:25 UTC
To: Rui Nuno Capela; +Cc: linux-rt-users, linux-kernel

* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Rui Nuno Capela (me) wrote:
> >
> > I have terrible news: 2.6.20-rt5 does not boot at all on a couple
> > machines I was brave enough to try -- a P4@3.3Ghz SMP/HT desktop, and a
> > Core2 Duo T7200@2.0Ghz laptop. For the first case I could capture the
> > following dump via serial console:
> > ...
>
> The news is that 2.6.20-rt8 got it all back in business :)

great! The fix is from Michal/Clark/Steve.

	Ingo
* Re: 2.6.20-rt5 Oops on boot [-rt8 OK]
From: Sergio Monteiro Basto @ 2007-02-19 12:38 UTC
To: Ingo Molnar; +Cc: Rui Nuno Capela, linux-rt-users, linux-kernel

On Fri, 2007-02-16 at 09:25 +0100, Ingo Molnar wrote:
> >
> > The news is that 2.6.20-rt8 got it all back in business :)

Yes! My Pentium D is back in business too.
-- 
Sérgio M. B.
* 2.6.21-rc5-rt6 make errors
From: Rui Nuno Capela @ 2007-04-01 17:12 UTC
To: linux-kernel; +Cc: linux-rt-users

Hi,

Just tried to build 2.6.21-rc5-rt6 and it is failing at build time.

...
kernel/sched.c: In function ‘__schedule’:
kernel/sched.c:3830: error: ‘print_functions’ undeclared (first use in this function)
kernel/sched.c:3830: error: (Each undeclared identifier is reported only once
kernel/sched.c:3830: error: for each function it appears in.)
...
arch/i386/kernel/apic.c: In function ‘smp_apic_timer_interrupt’:
arch/i386/kernel/apic.c:589: error: invalid lvalue in assignment
arch/i386/kernel/apic.c:608: error: invalid lvalue in assignment
...

Just to let you know that -rt5 was doing fine with the very same .config.

Cheers.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org
* Re: 2.6.21-rc5-rt6 make errors
From: Ingo Molnar @ 2007-04-01 18:39 UTC
To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users

* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Hi,
>
> Just tried to build 2.6.21-rc5-rt6 and it is failing at build time.

> Just to let you know that -rt5 was doing fine with the very same .config.

oops - indeed! I've uploaded -rc5-rt7 with the fix. (it includes a few
other fixes as well)

note that for Fedora-ish distros there's an easy yum test-kernel
available from the rt-testing repo:

   cat > /etc/yum.repos.d/rt-testing.repo
   [rt-testing]
   name=Ingo's Real-Time (-rt) test-kernel for FC6
   baseurl=http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/
   enabled=1
   gpgcheck=0
   <Ctrl-D>

this includes Linus-git-bleeding-edge as well as the base kernel. (rt7
is based on upstream HEAD 755948cfca16c7)

[the rt-testing repo currently includes the rt6 rpm, i've just started
the rt7 build, it should be available in half an hour or so]

	Ingo
* 2.6.21-rc5-rt10 troubles
From: Rui Nuno Capela @ 2007-04-03 23:49 UTC
To: Ingo Molnar; +Cc: linux-kernel, linux-rt-users

Ingo et al.

I'm afraid I have no good news (once again). After building
2.6.21-rc5-rt8, and more recently -rt10, I've found some trouble running
on a Core2 T7200 laptop (SMP). Somehow, especially after starting
jackd, the whole system starts crawling to death. It just slows down
to some kind of Big Freeze, with no evidence over the console
whatsoever, so that I'm ultimately left with a brick on my hands.

This behavior is consistent and occurs every time after jackd is
started. It does not seem to occur on -rt5 and earlier.

I wish I could give you more details, but the fact is I don't know where
to look. The machine just freezes silently.

Bye now.
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org
* Re: 2.6.21-rc5-rt10 troubles
From: Ingo Molnar @ 2007-04-04 8:49 UTC
To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users

* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Ingo et al.
>
> I'm afraid I have no good news (once again). After building
> 2.6.21-rc5-rt8, and more recently -rt10, I've found some trouble running
> on a Core2 T7200 laptop (SMP). Somehow, especially after starting
> jackd, the whole system starts crawling to death. It just slows down
> to some kind of Big Freeze, with no evidence over the console
> whatsoever, so that I'm ultimately left with a brick on my hands.
>
> This behavior is consistent and occurs every time after jackd is
> started. It does not seem to occur on -rt5 and earlier.
>
> I wish I could give you more details, but the fact is I don't know where
> to look. The machine just freezes silently.

could you try rt11 (which fixes two bad bugs in rt10)? If rt11 freezes
too then could you try to unapply the attached patch? This patch is the
main delta between rt5 and rt11. (plus upstream changes but those
shouldn't matter for this problem)

	Ingo

----------------------->
Subject: [patch] softirq preemption: optimization
From: Ingo Molnar <mingo@elte.hu>

optimize softirq preemption by allowing a hardirq context to pick up
softirq processing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/interrupt.h |    1 
 kernel/irq/manage.c       |   24 +++----
 kernel/softirq.c          |  139 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 123 insertions(+), 41 deletions(-)

Index: linux/include/linux/interrupt.h
===================================================================
--- linux.orig/include/linux/interrupt.h
+++ linux/include/linux/interrupt.h
@@ -266,6 +266,7 @@ struct softirq_action
 asmlinkage void do_softirq(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action*), void *data);
 extern void softirq_init(void);
+extern void do_softirq_from_hardirq(void);
 
 #ifdef CONFIG_PREEMPT_HARDIRQS
 # define __raise_softirq_irqoff(nr) raise_softirq_irqoff(nr)
Index: linux/kernel/irq/manage.c
===================================================================
--- linux.orig/kernel/irq/manage.c
+++ linux/kernel/irq/manage.c
@@ -628,14 +628,17 @@ static void thread_simple_irq(irq_desc_t
 	unsigned int irq = desc - irq_desc;
 	irqreturn_t action_ret;
 
+repeat:
 	if (action && !desc->depth) {
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
 	}
+	if (desc->status & IRQ_PENDING)
+		goto repeat;
+
 	desc->status &= ~IRQ_INPROGRESS;
 }
 
@@ -692,7 +695,6 @@ static void thread_edge_irq(irq_desc_t *
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -721,7 +723,6 @@ static void thread_do_irq(irq_desc_t *de
 		desc->status &= ~IRQ_PENDING;
 		spin_unlock(&desc->lock);
 		action_ret = handle_IRQ_event(irq, action);
-		cond_resched_hardirq_context();
 		spin_lock_irq(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
@@ -757,8 +758,6 @@ static void do_hardirq(struct irq_desc *
 		wake_up(&desc->wait_for_handler);
 }
 
-extern asmlinkage void __do_softirq(void);
-
 static int do_irqd(void * __desc)
 {
 	struct sched_param param = { 0, };
@@ -781,16 +780,13 @@ static int do_irqd(void * __desc)
 
 	while (!kthread_should_stop()) {
 		local_irq_disable_nort();
-		set_current_state(TASK_INTERRUPTIBLE);
-#ifndef CONFIG_PREEMPT_RT
-		irq_enter();
-#endif
-		do_hardirq(desc);
-#ifndef CONFIG_PREEMPT_RT
-		irq_exit();
-#endif
+		do {
+			set_current_state(TASK_INTERRUPTIBLE);
+			do_hardirq(desc);
+			do_softirq_from_hardirq();
+		} while (current->state == TASK_RUNNING);
+
 		local_irq_enable_nort();
-		cond_resched();
 #ifdef CONFIG_SMP
 		/*
 		 * Did IRQ affinities change?
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -100,8 +100,26 @@ static void wakeup_softirqd(int softirq)
 	/* Interrupts are disabled: no need to stop preemption */
 	struct task_struct *tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
 
-	if (tsk && tsk->state != TASK_RUNNING)
-		wake_up_process(tsk);
+	if (unlikely(!tsk))
+		return;
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+	/*
+	 * Optimization: if we are in a hardirq thread context, and
+	 * if the priority of the softirq thread is the same as the
+	 * priority of the hardirq thread, then 'merge' softirq
+	 * processing into the hardirq context. (it will later on
+	 * execute softirqs via do_softirq_from_hardirq()).
+	 * So here we can skip the wakeup and can rely on the hardirq
+	 * context processing it later on.
+	 */
+	if ((current->flags & PF_HARDIRQ) && !hardirq_count() &&
+			(tsk->normal_prio == current->normal_prio))
+		return;
+#endif
+	/*
+	 * Wake up the softirq task:
+	 */
+	wake_up_process(tsk);
 }
 
 /*
@@ -250,50 +268,100 @@ EXPORT_SYMBOL(local_bh_enable_ip);
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_RESTART 10
+#define MAX_SOFTIRQ_RESTART 20
+
+static DEFINE_PER_CPU(u32, softirq_running);
 
-asmlinkage void ___do_softirq(void)
+static void ___do_softirq(const int same_prio_only)
 {
+	int max_restart = MAX_SOFTIRQ_RESTART, max_loops = MAX_SOFTIRQ_RESTART;
+	__u32 pending, available_mask, same_prio_skipped;
 	struct softirq_action *h;
-	__u32 pending;
-	int max_restart = MAX_SOFTIRQ_RESTART;
-	int cpu;
+	struct task_struct *tsk;
+	int cpu, softirq;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
 
 	cpu = smp_processor_id();
 restart:
+	available_mask = -1;
+	softirq = 0;
+	same_prio_skipped = 0;
 	/* Reset the pending bitmask before enabling irqs */
 	set_softirq_pending(0);
 
-	local_irq_enable();
-
 	h = softirq_vec;
 
 	do {
+		u32 softirq_mask = 1 << softirq;
+
 		if (pending & 1) {
-			{
-				u32 preempt_count = preempt_count();
-				h->action(h);
-				if (preempt_count != preempt_count()) {
-					print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
-					printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
-					preempt_count() = preempt_count;
+			u32 preempt_count = preempt_count();
+
+#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS)
+			/*
+			 * If executed by a same-prio hardirq thread
+			 * then skip pending softirqs that belong
+			 * to softirq threads with different priority:
+			 */
+			if (same_prio_only) {
+				tsk = __get_cpu_var(ksoftirqd)[softirq].tsk;
+				if (tsk && tsk->normal_prio !=
+						current->normal_prio) {
+					same_prio_skipped |= softirq_mask;
+					available_mask &= ~softirq_mask;
+					goto next;
 				}
 			}
+#endif
+			/*
+			 * Is this softirq already being processed?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				available_mask &= ~softirq_mask;
+				goto next;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
+			local_irq_enable();
+
+			h->action(h);
+			if (preempt_count != preempt_count()) {
+				print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action);
+				printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count());
+				preempt_count() = preempt_count;
+			}
 			rcu_bh_qsctr_inc(cpu);
 			cond_resched_softirq_context();
+			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 		}
+next:
 		h++;
+		softirq++;
 		pending >>= 1;
 	} while (pending);
 
-	local_irq_disable();
-
+	or_softirq_pending(same_prio_skipped);
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
-		goto restart;
+	if (pending & available_mask) {
+		if (--max_restart)
+			goto restart;
+		/*
+		 * With softirq threading there's no reason not to
+		 * finish the workload we have:
+		 */
+#ifdef CONFIG_PREEMPT_SOFTIRQS
+		if (--max_loops) {
+			if (printk_ratelimit())
+				printk("INFO: softirq overload: %08x\n", pending);
+			max_restart = MAX_SOFTIRQ_RESTART;
+			goto restart;
+		}
+		if (printk_ratelimit())
+			printk("BUG: softirq loop! %08x\n", pending);
+#endif
+	}
 
 	if (pending)
 		trigger_softirqs();
@@ -321,7 +389,7 @@ asmlinkage void __do_softirq(void)
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
 
-	___do_softirq();
+	___do_softirq(0);
 
 	trace_softirq_exit();
 
@@ -350,8 +418,9 @@ void do_softirq_from_hardirq(void)
 	__local_bh_disable((unsigned long)__builtin_return_address(0));
 	p_flags = current->flags & PF_HARDIRQ;
 	current->flags &= ~PF_HARDIRQ;
+	current->flags |= PF_SOFTIRQ;
 
-	___do_softirq();
+	___do_softirq(1);
 
 	trace_softirq_exit();
 
@@ -359,6 +428,9 @@ void do_softirq_from_hardirq(void)
 	_local_bh_enable();
 
 	current->flags |= p_flags;
+	current->flags &= ~PF_SOFTIRQ;
+
+	local_irq_enable();
 }
 
 #ifndef __ARCH_HAS_DO_SOFTIRQ
@@ -669,8 +741,9 @@ static int ksoftirqd(void * __data)
 {
 	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2 };
 	struct softirqdata *data = __data;
-	u32 mask = (1 << data->nr);
+	u32 softirq_mask = (1 << data->nr);
 	struct softirq_action *h;
+	int cpu = data->cpu;
 
 	sys_sched_setscheduler(current->pid, SCHED_FIFO, &param);
 //	set_user_nice(current, -10);
@@ -684,7 +757,8 @@ static int ksoftirqd(void * __data)
 
 	while (!kthread_should_stop()) {
 		preempt_disable();
-		if (!(local_softirq_pending() & mask)) {
+		if (!(local_softirq_pending() & softirq_mask)) {
+sleep_more:
 			__preempt_enable_no_resched();
 			schedule();
 			preempt_disable();
@@ -694,16 +768,26 @@ static int ksoftirqd(void * __data)
 #endif
 		__set_current_state(TASK_RUNNING);
 
-		while (local_softirq_pending() & mask) {
+		while (local_softirq_pending() & softirq_mask) {
 			/* Preempt disable stops cpu going offline.
 			   If already offline, we'll be on wrong CPU:
 			   don't process */
-			if (cpu_is_offline(data->cpu))
+			if (cpu_is_offline(cpu))
 				goto wait_to_die;
 
 			local_irq_disable();
+			/*
+			 * Is the softirq already being executed by
+			 * a hardirq context?
+			 */
+			if (per_cpu(softirq_running, cpu) & softirq_mask) {
+				local_irq_enable();
+				set_current_state(TASK_INTERRUPTIBLE);
+				goto sleep_more;
+			}
+			per_cpu(softirq_running, cpu) |= softirq_mask;
 			__preempt_enable_no_resched();
-			set_softirq_pending(local_softirq_pending() & ~mask);
+			set_softirq_pending(local_softirq_pending() & ~softirq_mask);
 			local_bh_disable();
 			local_irq_enable();
 
@@ -713,6 +797,7 @@ static int ksoftirqd(void * __data)
 			rcu_bh_qsctr_inc(data->cpu);
 
 			local_irq_disable();
+			per_cpu(softirq_running, cpu) &= ~softirq_mask;
 			_local_bh_enable();
 			local_irq_enable();
 
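
The core of the patch above is the per-CPU `softirq_running` bitmask: whichever context reaches a pending softirq first (the ksoftirqd thread, or a hardirq thread going through do_softirq_from_hardirq()) sets that softirq's bit, and the other side backs off while the bit is set. A minimal user-space sketch of that claim/release pattern, under stated assumptions: `softirq_try_claim()`, `softirq_release()`, and `SKETCH_NR_CPUS` are hypothetical names, a plain array stands in for the kernel's per-CPU variable, and no locking is shown because the patch does these updates with interrupts disabled.

```c
#include <assert.h>   /* only needed for the checks mentioned in the note below */
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 2	/* hypothetical; stands in for per-CPU storage */

/* One bit per softirq number, one word per CPU (mirrors softirq_running). */
static uint32_t softirq_running[SKETCH_NR_CPUS];

/*
 * Try to claim softirq 'nr' on 'cpu'.
 * Returns false if some other context is already running it, in which
 * case the caller should skip it (or go back to sleep) instead of
 * running the handler a second time.
 */
bool softirq_try_claim(int cpu, int nr)
{
	uint32_t mask = UINT32_C(1) << nr;

	if (softirq_running[cpu] & mask)
		return false;
	softirq_running[cpu] |= mask;
	return true;
}

/* Drop the claim once the handler has finished. */
void softirq_release(int cpu, int nr)
{
	softirq_running[cpu] &= ~(UINT32_C(1) << nr);
}
```

In this sketch a second `softirq_try_claim(0, 3)` fails until `softirq_release(0, 3)` is called, while the same softirq number on another CPU can still be claimed, which is the behaviour the `goto next` and `goto sleep_more` paths in the patch rely on.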
* Re: 2.6.21-rc5-rt10 troubles 2007-04-04 8:49 ` Ingo Molnar @ 2007-04-04 9:42 ` Ingo Molnar 0 siblings, 0 replies; 30+ messages in thread From: Ingo Molnar @ 2007-04-04 9:42 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: linux-kernel, linux-rt-users * Ingo Molnar <mingo@elte.hu> wrote: > could you try rt11 (which fixes two bad bugs in rt10)? If rt11 freezes > too then could you try to unapply the attached patch? This patch is > the main delta between rt5 and rt11. (plus upstream changes but those > shouldnt matter for this problem) FYI, i've released -rt12 meanwhile - and the patch to unapply from -rt12 is below. Ingo -------------------> Subject: [patch] softirq preemption: optimization From: Ingo Molnar <mingo@elte.hu> optimize softirq preemption by allowing a hardirq context to pick up softirq processing. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/linux/interrupt.h | 1 kernel/irq/manage.c | 24 +++---- kernel/softirq.c | 150 ++++++++++++++++++++++++++++++++++++---------- 3 files changed, 131 insertions(+), 44 deletions(-) Index: linux/include/linux/interrupt.h =================================================================== --- linux.orig/include/linux/interrupt.h +++ linux/include/linux/interrupt.h @@ -266,6 +266,7 @@ struct softirq_action asmlinkage void do_softirq(void); extern void open_softirq(int nr, void (*action)(struct softirq_action*), void *data); extern void softirq_init(void); +extern void do_softirq_from_hardirq(void); #ifdef CONFIG_PREEMPT_HARDIRQS # define __raise_softirq_irqoff(nr) raise_softirq_irqoff(nr) Index: linux/kernel/irq/manage.c =================================================================== --- linux.orig/kernel/irq/manage.c +++ linux/kernel/irq/manage.c @@ -628,14 +628,17 @@ static void thread_simple_irq(irq_desc_t unsigned int irq = desc - irq_desc; irqreturn_t action_ret; +repeat: if (action && !desc->depth) { spin_unlock(&desc->lock); action_ret = handle_IRQ_event(irq, action); - cond_resched_hardirq_context(); 
spin_lock_irq(&desc->lock); if (!noirqdebug) note_interrupt(irq, desc, action_ret); } + if (desc->status & IRQ_PENDING) + goto repeat; + desc->status &= ~IRQ_INPROGRESS; } @@ -692,7 +695,6 @@ static void thread_edge_irq(irq_desc_t * desc->status &= ~IRQ_PENDING; spin_unlock(&desc->lock); action_ret = handle_IRQ_event(irq, action); - cond_resched_hardirq_context(); spin_lock_irq(&desc->lock); if (!noirqdebug) note_interrupt(irq, desc, action_ret); @@ -721,7 +723,6 @@ static void thread_do_irq(irq_desc_t *de desc->status &= ~IRQ_PENDING; spin_unlock(&desc->lock); action_ret = handle_IRQ_event(irq, action); - cond_resched_hardirq_context(); spin_lock_irq(&desc->lock); if (!noirqdebug) note_interrupt(irq, desc, action_ret); @@ -757,8 +758,6 @@ static void do_hardirq(struct irq_desc * wake_up(&desc->wait_for_handler); } -extern asmlinkage void __do_softirq(void); - static int do_irqd(void * __desc) { struct sched_param param = { 0, }; @@ -781,16 +780,13 @@ static int do_irqd(void * __desc) while (!kthread_should_stop()) { local_irq_disable_nort(); - set_current_state(TASK_INTERRUPTIBLE); -#ifndef CONFIG_PREEMPT_RT - irq_enter(); -#endif - do_hardirq(desc); -#ifndef CONFIG_PREEMPT_RT - irq_exit(); -#endif + do { + set_current_state(TASK_INTERRUPTIBLE); + do_hardirq(desc); + do_softirq_from_hardirq(); + } while (current->state == TASK_RUNNING); + local_irq_enable_nort(); - cond_resched(); #ifdef CONFIG_SMP /* * Did IRQ affinities change? 
Index: linux/kernel/softirq.c =================================================================== --- linux.orig/kernel/softirq.c +++ linux/kernel/softirq.c @@ -100,8 +100,26 @@ static void wakeup_softirqd(int softirq) /* Interrupts are disabled: no need to stop preemption */ struct task_struct *tsk = __get_cpu_var(ksoftirqd)[softirq].tsk; - if (tsk && tsk->state != TASK_RUNNING) - wake_up_process(tsk); + if (unlikely(!tsk)) + return; +#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS) + /* + * Optimization: if we are in a hardirq thread context, and + * if the priority of the softirq thread is the same as the + * priority of the hardirq thread, then 'merge' softirq + * processing into the hardirq context. (it will later on + * execute softirqs via do_softirq_from_hardirq()). + * So here we can skip the wakeup and can rely on the hardirq + * context processing it later on. + */ + if ((current->flags & PF_HARDIRQ) && !hardirq_count() && + (tsk->normal_prio == current->normal_prio)) + return; +#endif + /* + * Wake up the softirq task: + */ + wake_up_process(tsk); } /* @@ -250,50 +268,100 @@ EXPORT_SYMBOL(local_bh_enable_ip); * we want to handle softirqs as soon as possible, but they * should not be able to lock up the box. 
*/ -#define MAX_SOFTIRQ_RESTART 10 +#define MAX_SOFTIRQ_RESTART 20 + +static DEFINE_PER_CPU(u32, softirq_running); -asmlinkage void ___do_softirq(void) +static void ___do_softirq(const int same_prio_only) { + int max_restart = MAX_SOFTIRQ_RESTART, max_loops = MAX_SOFTIRQ_RESTART; + __u32 pending, available_mask, same_prio_skipped; struct softirq_action *h; - __u32 pending; - int max_restart = MAX_SOFTIRQ_RESTART; - int cpu; + struct task_struct *tsk; + int cpu, softirq; pending = local_softirq_pending(); account_system_vtime(current); cpu = smp_processor_id(); restart: + available_mask = -1; + softirq = 0; + same_prio_skipped = 0; /* Reset the pending bitmask before enabling irqs */ set_softirq_pending(0); - local_irq_enable(); - h = softirq_vec; do { + u32 softirq_mask = 1 << softirq; + if (pending & 1) { - { - u32 preempt_count = preempt_count(); - h->action(h); - if (preempt_count != preempt_count()) { - print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action); - printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count()); - preempt_count() = preempt_count; + u32 preempt_count = preempt_count(); + +#if defined(CONFIG_PREEMPT_SOFTIRQS) && defined(CONFIG_PREEMPT_HARDIRQS) + /* + * If executed by a same-prio hardirq thread + * then skip pending softirqs that belong + * to softirq threads with different priority: + */ + if (same_prio_only) { + tsk = __get_cpu_var(ksoftirqd)[softirq].tsk; + if (tsk && tsk->normal_prio != + current->normal_prio) { + same_prio_skipped |= softirq_mask; + available_mask &= ~softirq_mask; + goto next; } } +#endif + /* + * Is this softirq already being processed? 
+ */ + if (per_cpu(softirq_running, cpu) & softirq_mask) { + available_mask &= ~softirq_mask; + goto next; + } + per_cpu(softirq_running, cpu) |= softirq_mask; + local_irq_enable(); + + h->action(h); + if (preempt_count != preempt_count()) { + print_symbol("BUG: softirq exited %s with wrong preemption count!\n", (unsigned long) h->action); + printk("entered with %08x, exited with %08x.\n", preempt_count, preempt_count()); + preempt_count() = preempt_count; + } rcu_bh_qsctr_inc(cpu); cond_resched_softirq_context(); + local_irq_disable(); + per_cpu(softirq_running, cpu) &= ~softirq_mask; } +next: h++; + softirq++; pending >>= 1; } while (pending); - local_irq_disable(); - + or_softirq_pending(same_prio_skipped); pending = local_softirq_pending(); - if (pending && --max_restart) - goto restart; + if (pending & available_mask) { + if (--max_restart) + goto restart; + /* + * With softirq threading there's no reason not to + * finish the workload we have: + */ +#ifdef CONFIG_PREEMPT_SOFTIRQS + if (--max_loops) { + if (printk_ratelimit()) + printk("INFO: softirq overload: %08x\n", pending); + max_restart = MAX_SOFTIRQ_RESTART; + goto restart; + } + if (printk_ratelimit()) + printk("BUG: softirq loop! 
%08x\n", pending); +#endif + } if (pending) trigger_softirqs(); @@ -321,7 +389,7 @@ asmlinkage void __do_softirq(void) p_flags = current->flags & PF_HARDIRQ; current->flags &= ~PF_HARDIRQ; - ___do_softirq(); + ___do_softirq(0); trace_softirq_exit(); @@ -345,20 +413,29 @@ void do_softirq_from_hardirq(void) if (!local_softirq_pending()) return; /* - * 'immediate' softirq execution: + * 'immediate' softirq execution, from hardirq context: */ + local_irq_disable(); __local_bh_disable((unsigned long)__builtin_return_address(0)); +#ifndef CONFIG_PREEMPT_SOFTIRQS + trace_softirq_enter(); +#endif p_flags = current->flags & PF_HARDIRQ; current->flags &= ~PF_HARDIRQ; + current->flags |= PF_SOFTIRQ; - ___do_softirq(); + ___do_softirq(1); +#ifndef CONFIG_PREEMPT_SOFTIRQS trace_softirq_exit(); - +#endif account_system_vtime(current); - _local_bh_enable(); current->flags |= p_flags; + current->flags &= ~PF_SOFTIRQ; + + _local_bh_enable(); + local_irq_enable(); } #ifndef __ARCH_HAS_DO_SOFTIRQ @@ -669,8 +746,9 @@ static int ksoftirqd(void * __data) { struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2 }; struct softirqdata *data = __data; - u32 mask = (1 << data->nr); + u32 softirq_mask = (1 << data->nr); struct softirq_action *h; + int cpu = data->cpu; sys_sched_setscheduler(current->pid, SCHED_FIFO, &param); // set_user_nice(current, -10); @@ -684,7 +762,8 @@ static int ksoftirqd(void * __data) while (!kthread_should_stop()) { preempt_disable(); - if (!(local_softirq_pending() & mask)) { + if (!(local_softirq_pending() & softirq_mask)) { +sleep_more: __preempt_enable_no_resched(); schedule(); preempt_disable(); @@ -694,16 +773,26 @@ static int ksoftirqd(void * __data) #endif __set_current_state(TASK_RUNNING); - while (local_softirq_pending() & mask) { + while (local_softirq_pending() & softirq_mask) { /* Preempt disable stops cpu going offline. 
If already offline, we'll be on wrong CPU: don't process */ - if (cpu_is_offline(data->cpu)) + if (cpu_is_offline(cpu)) goto wait_to_die; local_irq_disable(); + /* + * Is the softirq already being executed by + * a hardirq context? + */ + if (per_cpu(softirq_running, cpu) & softirq_mask) { + local_irq_enable(); + set_current_state(TASK_INTERRUPTIBLE); + goto sleep_more; + } + per_cpu(softirq_running, cpu) |= softirq_mask; __preempt_enable_no_resched(); - set_softirq_pending(local_softirq_pending() & ~mask); + set_softirq_pending(local_softirq_pending() & ~softirq_mask); local_bh_disable(); local_irq_enable(); @@ -713,6 +802,7 @@ static int ksoftirqd(void * __data) rcu_bh_qsctr_inc(data->cpu); local_irq_disable(); + per_cpu(softirq_running, cpu) &= ~softirq_mask; _local_bh_enable(); local_irq_enable(); ^ permalink raw reply [flat|nested] 30+ messages in thread
* 2.6.21-rt2..8 troubles 2007-04-01 18:39 ` Ingo Molnar 2007-04-03 23:49 ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela @ 2007-05-25 20:58 ` Rui Nuno Capela 2007-05-26 16:08 ` Thomas Gleixner 2007-05-31 15:56 ` Steven Rostedt 1 sibling, 2 replies; 30+ messages in thread From: Rui Nuno Capela @ 2007-05-25 20:58 UTC (permalink / raw) To: Ingo Molnar; +Cc: linux-kernel, linux-rt-users Hi Ingo et al. It's been quite a while since I last complained about the -rt kernel patch series. This time I'm afraid I have a nasty one that I've been trying to figure out and isolate, with no definitive results. The fact is that since 2.6.21-rt2, and still on the latest -rt8, I've been seeing troubled behavior while running on a Core2 T7200 laptop (SMP). Somehow, sooner or later, the whole system starts crawling to death. It just slows down to some kind of Big Freeze, with no evidence over the console whatsoever, so that I'm ultimately left with a brick on my hands. This behavior is consistent and occurs every time after a while. It surely does not occur on 2.6.21-rt1 and earlier. Even stranger, it does not occur on another, older P4@3.3Ghz desktop (HT/SMP) where a nearly identical system image is deployed (openSUSE 10.2 i386, gcc 4.1.2, KDE 3.5.7). I wish I could give you more details, but the fact is I don't know where to look. The machine just freezes silently, again and again, with all kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at least to my knowledge. The only symptom that I can come up with is that, from some moment on, the system cannot start any new process anymore, or otherwise takes forever to launch any newly started process or thread. A sample dmesg output: http://www.rncbc.org/datahub/dmesg-2.6.21-rt5.0 The corresponding .config: http://www.rncbc.org/datahub/config-2.6.21-rt5.0 Again, there's no logged evidence of the problem, which is as nasty as it is repeatable after each boot. 
Unfortunately, it's not quite deterministically reproducible, this behavior of turning into an unresponsive brick ;) It's just a matter of time, or so I think. That's why I have no clues. Is there anything I can do better to help myself figuring out this issue? As this is a modern laptop such things like a serial console are unavailable, but it would be nice to track things up over netconsole perhaps? I just need some bright and nice directions now ;) Hope someone finds this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :) Cheers. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-05-25 20:58 ` 2.6.21-rt2..8 troubles Rui Nuno Capela @ 2007-05-26 16:08 ` Thomas Gleixner 2007-05-26 21:21 ` Rui Nuno Capela 2007-05-31 15:56 ` Steven Rostedt 1 sibling, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-05-26 16:08 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote: > Is there anything I can do better to help myself figuring out this > issue? As this is a modern laptop such things like a serial console are > unavailable, but it would be nice to track things up over netconsole > perhaps? > > I just need some bright and nice directions now ;) Hope someone finds > this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :) Can you boot with "hpet=disable" on the command line ? If that does not help, please provide the output of /proc/timer_list. Thanks, tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-05-26 16:08 ` Thomas Gleixner @ 2007-05-26 21:21 ` Rui Nuno Capela 2007-06-06 0:44 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-05-26 21:21 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users Thomas Gleixner wrote: > On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote: >> Is there anything I can do better to help myself figuring out this >> issue? As this is a modern laptop such things like a serial console are >> unavailable, but it would be nice to track things up over netconsole >> perhaps? >> >> I just need some bright and nice directions now ;) Hope someone finds >> this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :) > > Can you boot with "hpet=disable" on the command line ? > Nope. It doesn't seem to have significant effect. Same time-bomb behavior: after an indeterminate period of uptime, the system stops responding and cannot spawn new processes (currently running ones still live on, strange). > If that does not help, please provide the output of /proc/timer_list. 
> This is with my latest iteration: http://www.rncbc.org/datahub/config-2.6.21.1-rt8.0 Normal boot on which it behaves as badly as reported: http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0 # cat /proc/timer_list Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 131736771907 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1180213690448299114 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ed7c4ef4>, tick_sched_timer, S:01 # expires at 131737000000 nsecs [in 228093 nsecs] #1: <ed7c4ef4>, it_real_fn, S:01 # expires at 131751277843 nsecs [in 14505936 nsecs] #2: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 131802703679 nsecs [in 65931772 nsecs] #3: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 131802705006 nsecs [in 65933099 nsecs] #4: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 132412838830 nsecs [in 676066923 nsecs] #5: <ed7c4ef4>, it_real_fn, S:01 # expires at 137026607454 nsecs [in 5289835547 nsecs] #6: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 141381493725 nsecs [in 9644721818 nsecs] #7: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 170796028701 nsecs [in 39059256794 nsecs] .expires_next : 131737000000 nsecs .hres_active : 1 .nr_events : 40634 .nohz_mode : 2 .idle_tick : 131724000000 nsecs .tick_stopped : 0 .idle_jiffies : 4294799020 .idle_calls : 178848 .idle_sleeps : 133212 .idle_entrytime : 131736069830 nsecs .idle_sleeptime : 100895567465 nsecs .last_jiffies : 4294799033 .next_jiffies : 4294799039 .idle_expires : 131736000000 nsecs jiffies: 4294799033 cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1180213690448299114 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 131737067173 nsecs [in 295266 nsecs] #1: <ed7c4ef4>, tick_sched_timer, S:01 # expires at 131737250000 nsecs [in 478093 
nsecs] #2: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 139151071745 nsecs [in 7414299838 nsecs] #3: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 139151133755 nsecs [in 7414361848 nsecs] #4: <ed7c4ef4>, hrtimer_wakeup, S:01 # expires at 139151154005 nsecs [in 7414382098 nsecs] .expires_next : 131737067173 nsecs .hres_active : 1 .nr_events : 31510 .nohz_mode : 2 .idle_tick : 131734250000 nsecs .tick_stopped : 0 .idle_jiffies : 4294799030 .idle_calls : 151213 .idle_sleeps : 107018 .idle_entrytime : 131735193036 nsecs .idle_sleeptime : 108256832194 nsecs .last_jiffies : 4294799032 .next_jiffies : 4294799040 .idle_expires : 131743000000 nsecs jiffies: 4294799033 Tick Device: mode: 1 Clock Event Device: hpet max_delta_ns: 2147483647 min_delta_ns: 3352 mult: 61496110 shift: 32 mode: 3 next_event: 131737000000 nsecs set_next_event: hpet_legacy_next_event set_mode: hpet_legacy_set_mode event_handler: tick_handle_oneshot_broadcast tick_broadcast_mask: 00000003 tick_broadcast_oneshot_mask: 00000001 Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 806914928 min_delta_ns: 1442 mult: 44650051 shift: 32 mode: 1 next_event: 131737000000 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 806914928 min_delta_ns: 1442 mult: 44650051 shift: 32 mode: 3 next_event: 131737067173 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt -- Alternate boot with hpet=disabled as suggested, but no better results: http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0-hpet_disabled # cat /proc/timer_list Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 269529706096 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1180214106093436428 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ed2a2ef4>, tick_sched_timer, S:01 # 
expires at 269530000000 nsecs [in 293904 nsecs] #1: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 269554568320 nsecs [in 24862224 nsecs] #2: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 269585566924 nsecs [in 55860828 nsecs] #3: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 269822782823 nsecs [in 293076727 nsecs] #4: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 272726158017 nsecs [in 3196451921 nsecs] #5: <ed2a2ef4>, it_real_fn, S:01 # expires at 278007767018 nsecs [in 8478060922 nsecs] #6: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 283716431029 nsecs [in 14186724933 nsecs] #7: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 283716456168 nsecs [in 14186750072 nsecs] #8: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 295789281627 nsecs [in 26259575531 nsecs] .expires_next : 269530000000 nsecs .hres_active : 1 .nr_events : 63228 .nohz_mode : 2 .idle_tick : 269527000000 nsecs .tick_stopped : 0 .idle_jiffies : 4294936823 .idle_calls : 217590 .idle_sleeps : 168323 .idle_entrytime : 269528785728 nsecs .idle_sleeptime : 230915526366 nsecs .last_jiffies : 4294936825 .next_jiffies : 4294936840 .idle_expires : 269543000000 nsecs jiffies: 4294936826 cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1180214106093436428 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ed2a2ef4>, tick_sched_timer, S:01 # expires at 269530250000 nsecs [in 543904 nsecs] #1: <ed2a2ef4>, it_real_fn, S:01 # expires at 269546379364 nsecs [in 16673268 nsecs] #2: <ed2a2ef4>, hrtimer_wakeup, S:01 # expires at 283723356553 nsecs [in 14193650457 nsecs] .expires_next : 269530250000 nsecs .hres_active : 1 .nr_events : 64947 .nohz_mode : 2 .idle_tick : 269527250000 nsecs .tick_stopped : 0 .idle_jiffies : 4294936824 .idle_calls : 172684 .idle_sleeps : 111081 .idle_entrytime : 269529298565 nsecs .idle_sleeptime : 234502295072 nsecs .last_jiffies : 4294936826 .next_jiffies : 4294936833 .idle_expires : 
269536000000 nsecs jiffies: 4294936826 Tick Device: mode: 1 Clock Event Device: pit max_delta_ns: 27461866 min_delta_ns: 12571 mult: 5124677 shift: 32 mode: 3 next_event: 269530250000 nsecs set_next_event: pit_next_event set_mode: init_pit_timer event_handler: tick_handle_oneshot_broadcast tick_broadcast_mask: 00000003 tick_broadcast_oneshot_mask: 00000002 Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 807031401 min_delta_ns: 1443 mult: 44643607 shift: 32 mode: 3 next_event: 269530000000 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt Tick Device: mode: 1 Clock Event Device: lapic max_delta_ns: 807031401 min_delta_ns: 1443 mult: 44643607 shift: 32 mode: 1 next_event: 269530250000 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt -- Thanks for the hints. Cheers. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-05-26 21:21 ` Rui Nuno Capela @ 2007-06-06 0:44 ` Rui Nuno Capela 2007-06-08 15:47 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-06 0:44 UTC (permalink / raw) To: Rui Nuno Capela Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-rt-users Rui Nuno Capela wrote: > Thomas Gleixner wrote: >> On Fri, 2007-05-25 at 21:58 +0100, Rui Nuno Capela wrote: >>> Is there anything I can do better to help myself figuring out this >>> issue? As this is a modern laptop such things like a serial console are >>> unavailable, but it would be nice to track things up over netconsole >>> perhaps? >>> >>> I just need some bright and nice directions now ;) Hope someone finds >>> this worthy of attention too. Meanwhile, I'll be happy with 2.6.21-rt1 :) >> Can you boot with "hpet=disable" on the command line ? >> > > Nope. It doesn't seem to have significant effect. Same time-bomb > behavior: after an indeterminate period of uptime, the system stops > responding and cannot spawn new processes (currently running ones still > live on, strange). > >> If that does not help, please provide the output of /proc/timer_list. >> > > This is with my latest iteration: > http://www.rncbc.org/datahub/config-2.6.21.1-rt8.0 > > Normal boot on which it behaves as badly as reported: > http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0 > > # cat /proc/timer_list > [...] > > Alternate boot with hpet=disabled as suggested, but no better results: > http://www.rncbc.org/datahub/dmesg-2.6.21.1-rt8.0-hpet_disabled > > # cat /proc/timer_list > [...] > Just for the heads-up, I'm still suffering from this same illness, and it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9. There's no way around. On one box it works flawlessly (desktop, P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks silently. Shrugs:) -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-06 0:44 ` Rui Nuno Capela @ 2007-06-08 15:47 ` Thomas Gleixner 2007-06-08 18:21 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-08 15:47 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users On Wed, 2007-06-06 at 01:44 +0100, Rui Nuno Capela wrote: > Just for the heads-up, I'm still suffering from this same illness, and > it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9. > > There's no way around. On one box it works flawlessly (desktop, > P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks > silently. Sorry for responding late. To have some idea where the breakage comes from, can you please try http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch whether it has the same behaviour. Thanks, tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-08 15:47 ` Thomas Gleixner @ 2007-06-08 18:21 ` Rui Nuno Capela 2007-06-08 18:50 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-08 18:21 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users Hi Thomas, On Fri, June 8, 2007 16:47, Thomas Gleixner wrote: > On Wed, 2007-06-06 at 01:44 +0100, Rui Nuno Capela wrote: > >> Just for the heads-up, I'm still suffering from this same illness, and >> it seems even worse (big freeze happens earlier) on 2.6.21.3-rt9. >> >> There's no way around. On one box it works flawlessly (desktop, >> P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks >> silently. > > Sorry for responding late. To have some idea where the breakage comes > from, can you please try > > http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch > > whether it has the same behaviour. > Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5. All's working apparently nice on this offending machine (laptop, intel core2 T7200). In fact, I'm writing this very reply under it and through ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;) Nevertheless, this is not preempt-realtime (-rt) is it? And I never complained about vanilla. Is this good news though? -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-08 18:21 ` Rui Nuno Capela @ 2007-06-08 18:50 ` Thomas Gleixner 2007-06-11 19:36 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-08 18:50 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users On Fri, 2007-06-08 at 19:21 +0100, Rui Nuno Capela wrote: > >> There's no way around. On one box it works flawlessly (desktop, > >> P4@3.3Ghz) while on the patient one (laptop, core2 T7200) it bricks > >> silently. > > > > Sorry for responding late. To have some idea where the breakage comes > > from, can you please try > > > > http://www.tglx.de/projects/hrtimers/2.6.22-rc4/patch-2.6.22-rc4-hrt5.patch > > > > whether it has the same behaviour. > > > > Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5. > All's working apparently nice on this offending machine (laptop, intel > core2 T7200). In fact, I'm writing this very reply under it and through > ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;) > > Nevertheless, this is not preempt-realtime (-rt) is it? And I never > complained about vanilla. > > Is this good news though? Well, the patch carries the same high resolution timer fixes as -rt, so I just wanted to exclude those. Thanks for testing. I'm spinning -rt10 with a couple of fixes. Should be out sometime tomorrow. If the problem persists, we need to dig deeper. Thanks, tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-08 18:50 ` Thomas Gleixner @ 2007-06-11 19:36 ` Rui Nuno Capela 2007-06-11 19:45 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-11 19:36 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users Thomas Gleixner wrote: > On Fri, 2007-06-08 at 19:21 +0100, Rui Nuno Capela wrote: >> Just built from linux-2.6.22-rc4.tar.bz2, with patch-2.6.22-rc4-hrt5. >> All's working apparently nice on this offending machine (laptop, intel >> core2 T7200). In fact, I'm writing this very reply under it and through >> ipw3945 wifi module--which never was so pragmatic on -rt2..9 ;) >> >> Nevertheless, this is not preempt-realtime (-rt) is it? And I never >> complained about vanilla. >> >> Is this good news though? > > Well, the patch carries the same high resolution timer fixes as -rt, so > I just wanted to exclude those. Thanks for testing. > > I'm spinning -rt10 with a couple of fixes. Should be out sometime > tomorrow. If the problem persists, we need to dig deeper. > Uhoh. I'm sorry to say, but the problem is still creeping on 2.6.21.4-rt11 and -rt12 :( So sorry. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 19:36 ` Rui Nuno Capela @ 2007-06-11 19:45 ` Thomas Gleixner 2007-06-11 19:55 ` Daniel Walker 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-11 19:45 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote: > > I'm spinning -rt10 with a couple of fixes. Should be out sometime > > tomorrow. If the problem persists, we need to dig deeper. > > > > Uhoh. I'm sorry to say, but the problem is still creeping on > 2.6.21.4-rt11 and -rt12 :( > > So sorry. Hmm. Does it happen, when you boot with maxcpus=1 on the kernel commandline ? tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 19:45 ` Thomas Gleixner @ 2007-06-11 19:55 ` Daniel Walker 2007-06-11 20:50 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Daniel Walker @ 2007-06-11 19:55 UTC (permalink / raw) To: Thomas Gleixner Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 21:45 +0200, Thomas Gleixner wrote: > On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote: > > > I'm spinning -rt10 with a couple of fixes. Should be out sometime > > > tomorrow. If the problem persists, we need to dig deeper. > > > > > > > Uhoh. I'm sorry to say, but the problem is still creeping on > > 2.6.21.4-rt11 and -rt12 :( > > > > So sorry. > > Hmm. Does it happen, when you boot with maxcpus=1 on the kernel > commandline ? I think 2.6.21-rt2 had some apic updates also, (along with hpet updates) so testing with "noapic" on the command line might be helpful too .. Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 19:55 ` Daniel Walker @ 2007-06-11 20:50 ` Rui Nuno Capela 2007-06-11 21:14 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-11 20:50 UTC (permalink / raw) To: Daniel Walker; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-rt-users Daniel Walker wrote: > On Mon, 2007-06-11 at 21:45 +0200, Thomas Gleixner wrote: >> On Mon, 2007-06-11 at 20:36 +0100, Rui Nuno Capela wrote: >>>> I'm spinning -rt10 with a couple of fixes. Should be out sometime >>>> tomorrow. If the problem persists, we need to dig deeper. >>>> >>> Uhoh. I'm sorry to say, but the problem is still creeping on >>> 2.6.21.4-rt11 and -rt12 :( >>> >>> So sorry. >> Hmm. Does it happen, when you boot with maxcpus=1 on the kernel >> commandline ? > > I think 2.6.21-rt2 had some apic updates also, (along with hpet updates) > so testing with "noapic" on the command line might be helpful too .. > Thomas, Yes, "maxcpus=1" seems to keep it running, but then I render my Core2 just half-baked ;) Daniel, No, "noapic" does not seem to help any better. HTH -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 20:50 ` Rui Nuno Capela @ 2007-06-11 21:14 ` Thomas Gleixner 2007-06-11 21:25 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-11 21:14 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 21:50 +0100, Rui Nuno Capela wrote: > Thomas, > > Yes, "maxcpus=1" seems to keep it running, but then I render my Core2 > just half-baked ;) Yes, I know :( /me goes into desperate mode Is this a DELL laptop ? tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 21:14 ` Thomas Gleixner @ 2007-06-11 21:25 ` Rui Nuno Capela 2007-06-11 21:42 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-11 21:25 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users Thomas Gleixner wrote: > On Mon, 2007-06-11 at 21:50 +0100, Rui Nuno Capela wrote: >> Thomas, >> >> Yes, "maxcpus=1" seems to keep it running, but then I render my Core2 >> just half-baked ;) > > Yes, I know :( > > /me goes into desperate mode > > Is this a DELL laptop ? > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz. Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already know :) Cheers. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 21:25 ` Rui Nuno Capela @ 2007-06-11 21:42 ` Thomas Gleixner 2007-06-11 22:34 ` Daniel Walker 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-11 21:42 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote: > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz. Yeah, there are Dell ones which have similar or worse symptoms. > Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already > know :) Ok. I go back and figure out which differences we have between 2.6.21-rt>8 and the -hrt queue. tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 21:42 ` Thomas Gleixner @ 2007-06-11 22:34 ` Daniel Walker 2007-06-11 23:08 ` Thomas Gleixner 0 siblings, 1 reply; 30+ messages in thread From: Daniel Walker @ 2007-06-11 22:34 UTC (permalink / raw) To: Thomas Gleixner Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote: > On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote: > > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz. > > Yeah, there are Dell ones which have similar or worse symptoms. > > > Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already > > know :) > > Ok. I go back and figure out which differences we have between > 2.6.21-rt>8 and the -hrt queue. Are you sure it's strictly an HRT issue? I didn't see a !CONFIG_HIGH_RES_TIMERS test .. Daniel ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 22:34 ` Daniel Walker @ 2007-06-11 23:08 ` Thomas Gleixner 2007-06-12 10:10 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Thomas Gleixner @ 2007-06-11 23:08 UTC (permalink / raw) To: Daniel Walker; +Cc: Rui Nuno Capela, Ingo Molnar, linux-kernel, linux-rt-users On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote: > On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote: > > On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote: > > > Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo T7200@2.0Ghz. > > > > Yeah, there are Dell ones which have similar or worse symptoms. > > > > > Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you already > > > know :) > > > > Ok. I go back and figure out which differences we have between > > 2.6.21-rt>8 and the -hrt queue. > > Are you sure it's strictly and HRT issue? I didn't see a > !CONFIG_HIGH_RES_TIMERS test .. The main difference between -rt1 and -rt2 was the update of -hrt, which not only affects CONFIG_HIGH_RES_TIMERS. There are enough CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as well. tglx ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-11 23:08 ` Thomas Gleixner @ 2007-06-12 10:10 ` Rui Nuno Capela 2007-07-06 14:16 ` Rui Nuno Capela 0 siblings, 1 reply; 30+ messages in thread From: Rui Nuno Capela @ 2007-06-12 10:10 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users On Tue, June 12, 2007 00:08, Thomas Gleixner wrote: > On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote: > >> On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote: >> >>> On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote: >>> >>>> Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo >>>> T7200@2.0Ghz. >>>> >>> >>> Yeah, there are Dell ones which have similar or worse symptoms. >>> >>> >>>> Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you >>>> already know :) >>> >>> Ok. I go back and figure out which differences we have between >>> 2.6.21-rt>8 and the -hrt queue. >>> >> >> Are you sure it's strictly and HRT issue? I didn't see a >> !CONFIG_HIGH_RES_TIMERS test .. >> > > The main difference between -rt1 and -rt2 was the update of -hrt, which > not only affects CONFIG_HIGH_RES_TIMERS. There are enough > CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as > well. > Indeed, FWIW and IIRC, I can confirm that the show-stopper problem was still present when I tried with CONFIG_HIGH_RES_TIMERS not set (=N). Bye now. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-06-12 10:10 ` Rui Nuno Capela @ 2007-07-06 14:16 ` Rui Nuno Capela 0 siblings, 0 replies; 30+ messages in thread From: Rui Nuno Capela @ 2007-07-06 14:16 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Daniel Walker, Ingo Molnar, linux-kernel, linux-rt-users Hi, I'm back but with good news this time :)... On Tue, June 12, 2007 11:10, Rui Nuno Capela wrote: > On Tue, June 12, 2007 00:08, Thomas Gleixner wrote: >> On Mon, 2007-06-11 at 15:34 -0700, Daniel Walker wrote: >>> On Mon, 2007-06-11 at 23:42 +0200, Thomas Gleixner wrote: >>>> On Mon, 2007-06-11 at 22:25 +0100, Rui Nuno Capela wrote: >>>> >>>>> Nope. It's a Fujitsu-Siemens Amilo Si 1520 -- Intel Core2 Duo >>>>> T7200@2.0Ghz. >>>>> >>>> Yeah, there are Dell ones which have similar or worse symptoms. >>>> >>>>> Works great with 2.6.21-rt1, and 2.6.22-rc4-hrt5, but that you >>>>> already know :) >>>> >>>> Ok. I go back and figure out which differences we have between >>>> 2.6.21-rt>8 and the -hrt queue. >>> >>> Are you sure it's strictly and HRT issue? I didn't see a >>> !CONFIG_HIGH_RES_TIMERS test .. >> >> The main difference between -rt1 and -rt2 was the update of -hrt, which >> not only affects CONFIG_HIGH_RES_TIMERS. There are enough >> CONFIG_HIGH_RES_TIMERS=n related changes to clock events and friends as >> well. > > In deed, FWIW and IIRC, I can confirm that the show-stopper problem was > still present when tried with CONFIG_HIGH_RES_TIMERS not set (=N). > Although I'm still with my fingers crossed, I can tell that 2.6.21.5-rt19 (and -rt20) does behave far better now on the very same box. I've more than 8 hours up and running now, without a single glimpse of the bad symptoms, which used to show in a matter of minutes if not earlier during init time. Congratulations, -rt is usable again here and that just makes me happier :) Cheers. -- rncbc aka Rui Nuno Capela rncbc@rncbc.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: 2.6.21-rt2..8 troubles 2007-05-25 20:58 ` 2.6.21-rt2..8 troubles Rui Nuno Capela 2007-05-26 16:08 ` Thomas Gleixner @ 2007-05-31 15:56 ` Steven Rostedt 2007-05-31 16:26 ` Luis Claudio R. Goncalves 1 sibling, 1 reply; 30+ messages in thread From: Steven Rostedt @ 2007-05-31 15:56 UTC (permalink / raw) To: Rui Nuno Capela; +Cc: Ingo Molnar, linux-kernel, linux-rt-users On Fri, May 25, 2007 at 09:58:12PM +0100, Rui Nuno Capela wrote: > > I wish I could give you more details, but the fact is I don't know > where > to look. The machine just freezes silently, again and again, with all > kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at > least to my knowledge. The only symptom that I can come about is that, > from some moment on and ever since, the system cannot start any new > process anymore, or otherwise takes forever to realize and launch any > new started process thread. > I have a box that looks like it's doing the same thing. Unfortunately for now it's being used to test other things. But I did do a show-task and see a bunch of D processes. I'll post that output when I get that box free again. -- Steve ^ permalink raw reply [flat|nested] 30+ messages in thread
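The observation above about "a bunch of D processes" (tasks in uninterruptible sleep) can be approximated from userspace while the box is still responsive. The sketch below is an illustration only and is not the tool used in the thread: it assumes a Linux /proc filesystem, the names parse_stat and blocked_tasks are hypothetical, and it looks at processes only, not at the individual threads under /proc/<pid>/task. It also needs the system to still be able to start an interpreter, which may no longer be the case once the freeze described here has set in; a task dump produced by the kernel itself would be more complete.

```python
import os


def parse_stat(text: str):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so split on the first "(" and the last ")".
    """
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc: str = "/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry is unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running it a few times in a row gives a rough idea of whether the same tasks stay stuck in D state or whether they are only passing through it briefly.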
* Re: 2.6.21-rt2..8 troubles 2007-05-31 15:56 ` Steven Rostedt @ 2007-05-31 16:26 ` Luis Claudio R. Goncalves 0 siblings, 0 replies; 30+ messages in thread From: Luis Claudio R. Goncalves @ 2007-05-31 16:26 UTC (permalink / raw) To: Steven Rostedt; +Cc: linux-rt-users On Thu, May 31, 2007 at 11:56:11AM -0400, Steven Rostedt wrote: | | On Fri, May 25, 2007 at 09:58:12PM +0100, Rui Nuno Capela wrote: | > | > I wish I could give you more details, but the fact is I don't know | > where | > to look. The machine just freezes silently, again and again, with all | > kernels from -rt2 to -rt8 inclusive, with no traceable evidence, at | > least to my knowledge. The only symptom that I can come about is that, | > from some moment on and ever since, the system cannot start any new | > process anymore, or otherwise takes forever to realize and launch any | > new started process thread. I have also experienced some of these freezes and system hangs, some of them reproducible, such as when I start java+azureus or when xscreensaver kicks in. But after I unset CONFIG_DEBUG_RT_MUTEXES they are all gone. Not that DEBUG_RT_MUTEXES is the culprit, but judging by the number of stack traces it spits out every few seconds, this workload may be aggravating some underlying problem in the system. I am currently running rt8 on an FC6 box. I plan to investigate this matter further. Luis | I have a box that looks like it's doing the same thing. Unfortunately | for now it's being used to test other things. | | But I did do a show-task and see a bunch of D processes. I'll post that | output when I get that box free again. | | -- Steve ---end quoted text--- -- [ Luis Claudio R. Goncalves lclaudio at uudg dot org ] [ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ] [ Linux-HA Developer - LateNite Programmer - Gospel User - Bass Player ] [ Fault Tolerance - Real-Time - Distributed Systems - IECLB - Is 40:31 ] ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2007-07-06 14:17 UTC | newest]

Thread overview: 30+ messages
2007-01-30 11:26 2.6.20-rc6-rt4 register_cpu_notification undefined Rui Nuno Capela
2007-02-09 18:56 ` 2.6.20-rt5 Oops on boot Rui Nuno Capela
2007-02-16 0:46 ` 2.6.20-rt5 Oops on boot [-rt8 OK] Rui Nuno Capela
2007-02-16 8:25 ` Ingo Molnar
2007-02-19 12:38 ` Sergio Monteiro Basto
2007-04-01 17:12 ` 2.6.21-rc5-rt6 make errors Rui Nuno Capela
2007-04-01 18:39 ` Ingo Molnar
2007-04-03 23:49 ` 2.6.21-rc5-rt10 troubles Rui Nuno Capela
2007-04-04 8:49 ` Ingo Molnar
2007-04-04 9:42 ` Ingo Molnar
2007-05-25 20:58 ` 2.6.21-rt2..8 troubles Rui Nuno Capela
2007-05-26 16:08 ` Thomas Gleixner
2007-05-26 21:21 ` Rui Nuno Capela
2007-06-06 0:44 ` Rui Nuno Capela
2007-06-08 15:47 ` Thomas Gleixner
2007-06-08 18:21 ` Rui Nuno Capela
2007-06-08 18:50 ` Thomas Gleixner
2007-06-11 19:36 ` Rui Nuno Capela
2007-06-11 19:45 ` Thomas Gleixner
2007-06-11 19:55 ` Daniel Walker
2007-06-11 20:50 ` Rui Nuno Capela
2007-06-11 21:14 ` Thomas Gleixner
2007-06-11 21:25 ` Rui Nuno Capela
2007-06-11 21:42 ` Thomas Gleixner
2007-06-11 22:34 ` Daniel Walker
2007-06-11 23:08 ` Thomas Gleixner
2007-06-12 10:10 ` Rui Nuno Capela
2007-07-06 14:16 ` Rui Nuno Capela
2007-05-31 15:56 ` Steven Rostedt
2007-05-31 16:26 ` Luis Claudio R. Goncalves