On Wednesday 06 Jul 2005 14:31, Ingo Molnar wrote: [snip] > > for the first bootup it makes sense to enable most of them - just to > make sure everything is ok. They have performance overhead, but it > shouldnt show up during everyday use. (it will show up in benchmarks > though) Here's the options i have enabled, typically: Okay, I enabled everything. [snip] > > [...] Also, I got a few unexpected messages in dmesg on bootup. > > Firstly; > > > > spawn_desched_task(00000000) > > desched cpu_callback 3/00000000 > > ksoftirqd started up. > > softirq RT prio: 24. > > ksoftirqd started up. > > softirq RT prio: 24. > > [...] > > desched cpu_callback 2/00000000 > > desched thread 0 started up. > > softlockup thread 0 started up. > > > > Why does it print out the same ksoftirqd message six times? Is this > > expected behaviour? > > these messages are harmless status messages - they were added for > debugging purposes. You are getting 6 of them because there are 6 > separate softirq threads. (but the message is the old one so it talks > about ksoftirqd - very confusing.) In my tree i have removed these > printks. That's good. I'm deliberately critical because I thought there might be a problem on my machine somewhere. Normally I wouldn't think twice about ksoftirqd's messages. > > (generally i try to mark every message in the -RT kernel that signals > some sort of anomaly with a 'BUG:' prefix - that makes it easy to do a > 'dmesg | grep BUG:' to find out whether anything bad is going on. All > other messages should be benign.) Okay, I've got multiple other BUG: messages within 2-3 minutes of booting that highlight problems in areas other than ACPI. Are they useful to you? > > > [...] Next, I got a warning about CONFIG_CRITICAL_IRQSOFF_TIMING; > > should this option be enabled? > > do you mean the bootup warning about performance? That is just a warning > to make sure the kernel is not misconfigured during benchmarks - the > debugging options are otherwise safe and you should be able to use any > variations of them. Great. > > > Finally, I got this: > > > > BUG: soft lockup detected on CPU#0! > > [] softlockup_tick+0x89/0xb0 (8) > > [] timer_interrupt+0x50/0xf0 (20) > > [] handle_IRQ_event+0x81/0x100 (16) > > [] __do_IRQ+0xec/0x190 (48) > > [] do_IRQ+0x48/0x70 (40) > > ======================= > > [] acpi_processor_idle+0x0/0x258 (8) > > [] common_interrupt+0x1f/0x24 (12) > > [] acpi_processor_idle+0x0/0x258 (4) > > [] acpi_processor_idle+0x123/0x258 (40) > > [] acpi_processor_idle+0x0/0x258 (32) > > [] cpu_idle+0x56/0x80 (16) > > [] start_kernel+0x17c/0x1c0 (12) > > [] unknown_bootoption+0x0/0x1f0 (20) > > yes, this is a problem. You can probably work it around by disabling > ACPI, but it would be better to debug and fix it. The message was > generated because the kernel spent too much time [more than 10 seconds] > in acpi_processor_idle(), and the softlockup-thread (which runs at > SCHED_FIFO prio 99) was not scheduled for that amount of time. [or it > thought it was not scheduled.] Was there any suspend/resume activity > while you got that message? No, this is during bootup that it occurs. Suspend and resume do work and are compiled in on my laptop, but were never utilised. [snip] > > ( softirq-timer/0-3 |#0): new 3 us maximum-latency wakeup. > > ( softirq-timer/0-3 |#0): new 1003 us maximum-latency wakeup. > > ( softirq-timer/0-3 |#0): new 1001 us maximum-latency wakeup. > > > > Which is presumably a good sign. > > it's good in terms of stability, but the 1003 usecs maximum wakeup > latency is bad - you should be getting much lower latencies. Could you > enable LATENCY_TRACING and send me the /proc/latency_trace file? [if > it's long then in bz2 format.] You can double-check that it's the right > one: the trace is human-readable and should go roughly from 1 usec to > 1003 usecs. Looking at that trace i can probably tell more about how > this latency happened. I've got a pair of nearly identical traces that highlight a 2645us latency in smp_apic_timer_interrupt. I don't know how the trace is formatted, so I can't tell if it occurred before or after this function call. I also can't see how a ~1000us latency translates to a ~2600us latency in the trace. Since both traces were under 6K each, and I think the list limit is higher, I risked it and have attached both. Annoyingly, my original messages (sent SEVERAL hours ago) seem not to have gotten through to the list. Whatever braindead filter is nuking them, it needs to be fixed. -- Cheers, Alistair. personal: alistair()devzero!co!uk university: s0348365()sms!ed!ac!uk student: CS/CSim Undergraduate contact: 1F2 55 South Clerk Street, Edinburgh. EH8 9PP.