linux-kernel.vger.kernel.org archive mirror
* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
@ 2004-12-10 17:49 Mark_H_Johnson
  2004-12-10 21:09 ` Ingo Molnar
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Mark_H_Johnson @ 2004-12-10 17:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark_H_Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, K.R. Foley, linux-kernel,
	Florian Schmidt, Fernando Pablo Lopez-Lezcano, Lee Revell,
	Rui Nuno Capela, Shane Shrybman, Esben Nielsen, Thomas Gleixner,
	Michal Schmidt

>The -32-15 kernel can be downloaded from the
>usual place:
>
> http://redhat.com/~mingo/realtime-preempt/

By the time I started today, the directory had -18 so that is what I built
with. Here are some initial results from running cpu_delay (the simple
RT test / user tracing) on a -18 PREEMPT_RT kernel. Have not started
any of the stress tests yet.

To recap, all IRQ # tasks, ksoftirqd/# and events/# tasks are RT FIFO
priority 99. The test program runs at RT FIFO 30 and should use about
70% of the CPU time on the two-CPU SMP system under test.

[1] I still do not get traces where cpu_delay switches CPUs. I only
get trace output if it starts and ends on a single CPU. I also had
several cases where I "triggered" a trace but no output - I assume
those are related symptoms. For example:

# ./cpu_delay 0.000100
Delay limit set to 0.00010000 seconds
calibrating loop ....
time diff= 0.504598 or 396354830 loops/sec.
Trace activated with 0.000100 second delay.
Trace triggered with 0.000102 second delay. [not recorded]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000164 second delay. [not recorded]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000132 second delay. [00]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000128 second delay. [01]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000144 second delay. [not recorded]
Trace triggered with 0.000355 second delay. [02]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000101 second delay. [not recorded]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000126 second delay. [not recorded]
Trace triggered with 0.000205 second delay. [not recorded]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000147 second delay. [03]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000135 second delay. [04]
Trace triggered with 0.000110 second delay. [not recorded]
Trace triggered with 0.000247 second delay. [05]
Trace triggered with 0.000120 second delay. [06]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000107 second delay. [07]
Trace triggered with 0.000104 second delay. [not recorded]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000100 second delay. [08]
Trace activated with 0.000100 second delay.
Trace triggered with 0.000201 second delay. [09]

# chrt -f 1 ./get_ltrace.sh 50
Current Maximum is 4965280, limit will be 50.
Resetting max latency from 4965280 to 50.
No new latency samples at Fri Dec 10 11:01:22 CST 2004.
No new latency samples at Fri Dec 10 11:01:32 CST 2004.
No new latency samples at Fri Dec 10 11:01:42 CST 2004.
No new latency samples at Fri Dec 10 11:01:53 CST 2004.
No new latency samples at Fri Dec 10 11:02:03 CST 2004.
No new latency samples at Fri Dec 10 11:02:13 CST 2004.
New trace 0 w/ 117 usec latency.
Resetting max latency from 117 to 50.
No new latency samples at Fri Dec 10 11:02:35 CST 2004.
No new latency samples at Fri Dec 10 11:02:45 CST 2004.
New trace 1 w/ 99 usec latency.
Resetting max latency from 99 to 50.
New trace 2 w/ 248 usec latency.
Resetting max latency from 248 to 50.
New trace 3 w/ 146 usec latency.
Resetting max latency from 146 to 50.
New trace 4 w/ 134 usec latency.
Resetting max latency from 134 to 50.
New trace 5 w/ 250 usec latency.
Resetting max latency from 250 to 50.
New trace 6 w/ 120 usec latency.
Resetting max latency from 120 to 50.
New trace 7 w/ 105 usec latency.
Resetting max latency from 105 to 50.
New trace 8 w/ 91 usec latency.
Resetting max latency from 91 to 50.
New trace 9 w/ 200 usec latency.

For the most part, the elapsed times in the traces closely match what
was measured by the application.

[2] The all CPU traces appear to show some cases where both ksoftirqd
tasks are active during a preemption of the RT task. That is expected
with my priority settings. [though the first trace shows BOTH getting
activated within 2 usec!]

[3] Some traces show information on both CPUs and then a long period
with no traces from the other. Here is an example...

preemption latency trace v1.1.4 on 2.6.10-rc2-mm3-V0.7.32-18RT
--------------------------------------------------------------------
 latency: 99 us, #275/275, CPU#0 | (M:rt VP:0, KP:1, SP:1 HP:1 #P:2)
    -----------------
    | task: cpu_delay-3556 (uid:0 nice:0 policy:1 rt_prio:30)
    -----------------
 => started at: <00000000>
 => ended at:   <00000000>

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
<unknown-2847  1d...    0µs : smp_apic_timer_interrupt (86a89bf 0 0)
cpu_dela-3556  0d.h1    0µs : update_process_times
(smp_apic_timer_interrupt)
cpu_dela-3556  0d.h1    0µs : account_system_time (update_process_times)
cpu_dela-3556  0d.h1    0µs : check_rlimit (account_system_time)
<unknown-2847  1d.h.    0µs : update_process_times
(smp_apic_timer_interrupt)
cpu_dela-3556  0d.h1    0µs : account_it_prof (account_system_time)
...
<unknown-2847  1d.h1    4µs : _raw_spin_unlock (scheduler_tick)
cpu_dela-3556  0d.h1    4µs : irq_exit (apic_timer_interrupt)
<unknown-2847  1d.h.    4µs : rebalance_tick (scheduler_tick)
cpu_dela-3556  0d..2    4µs : do_softirq (irq_exit)
cpu_dela-3556  0d..2    4µs : __do_softirq (do_softirq)
<unknown-2847  1d.h.    5µs : irq_exit (apic_timer_interrupt)
cpu_dela-3556  0d..2    5µs : wake_up_process (do_softirq)
cpu_dela-3556  0d..2    5µs : try_to_wake_up (wake_up_process)
cpu_dela-3556  0d..2    5µs : task_rq_lock (try_to_wake_up)
<unknown-2847  1d...    5µs < (0)
cpu_dela-3556  0d..2    5µs : _raw_spin_lock (task_rq_lock)
cpu_dela-3556  0d..3    6µs : activate_task (try_to_wake_up)
cpu_dela-3556  0d..3    6µs : sched_clock (activate_task)
cpu_dela-3556  0d..3    7µs : recalc_task_prio (activate_task)
cpu_dela-3556  0d..3    7µs : effective_prio (recalc_task_prio)
cpu_dela-3556  0d..3    7µs : activate_task <ksoftirq-4> (0 1):
cpu_dela-3556  0d..3    7µs : enqueue_task (activate_task)
cpu_dela-3556  0d..3    8µs : resched_task (try_to_wake_up)
cpu_dela-3556  0dn.3    8µs : __trace_start_sched_wakeup (try_to_wake_up)
cpu_dela-3556  0dn.3    8µs : try_to_wake_up <ksoftirq-4> (0 45):
cpu_dela-3556  0dn.3    9µs : _raw_spin_unlock_irqrestore (try_to_wake_up)
cpu_dela-3556  0dn.2    9µs : preempt_schedule (try_to_wake_up)
cpu_dela-3556  0dn.2    9µs : wake_up_process (do_softirq)
cpu_dela-3556  0dn.1   10µs < (0)
cpu_dela-3556  0.n.1   10µs : preempt_schedule (up)
cpu_dela-3556  0.n..   10µs : preempt_schedule (user_trace_start)
cpu_dela-3556  0dn..   11µs : __sched_text_start (preempt_schedule)
cpu_dela-3556  0dn.1   11µs : sched_clock (__sched_text_start)
cpu_dela-3556  0dn.1   11µs : _raw_spin_lock_irq (__sched_text_start)
cpu_dela-3556  0dn.1   12µs : _raw_spin_lock_irqsave (__sched_text_start)
cpu_dela-3556  0dn.2   12µs : pull_rt_tasks (__sched_text_start)
cpu_dela-3556  0dn.2   12µs : find_next_bit (pull_rt_tasks)
cpu_dela-3556  0dn.2   13µs : find_next_bit (pull_rt_tasks)
cpu_dela-3556  0d..2   13µs : trace_array (__sched_text_start)
cpu_dela-3556  0d..2   13µs : trace_array <ksoftirq-4> (0 1):
cpu_dela-3556  0d..2   15µs : trace_array <cpu_dela-3556> (45 46):
cpu_dela-3556  0d..2   16µs+: trace_array (__sched_text_start)
ksoftirq-4     0d..2   19µs : __switch_to (__sched_text_start)
ksoftirq-4     0d..2   20µs : __sched_text_start <cpu_dela-3556> (45 0):
ksoftirq-4     0d..2   20µs : finish_task_switch (__sched_text_start)
ksoftirq-4     0d..2   20µs : smp_send_reschedule_allbutself
(finish_task_switch)
ksoftirq-4     0d..2   20µs : send_IPI_allbutself
(smp_send_reschedule_allbutself)
ksoftirq-4     0d..2   21µs : __bitmap_weight (send_IPI_allbutself)
ksoftirq-4     0d..2   21µs : __send_IPI_shortcut (send_IPI_allbutself)
ksoftirq-4     0d..2   21µs : _raw_spin_unlock (finish_task_switch)
ksoftirq-4     0d..1   22µs : trace_stop_sched_switched
(finish_task_switch)
ksoftirq-4     0....   23µs : _do_softirq (ksoftirqd)
ksoftirq-4     0d...   23µs : ___do_softirq (_do_softirq)
ksoftirq-4     0....   23µs : run_timer_softirq (___do_softirq)
ksoftirq-4     0....   24µs : _spin_lock (run_timer_softirq)
ksoftirq-4     0....   24µs : __spin_lock (_spin_lock)
ksoftirq-4     0....   24µs : __might_sleep (__spin_lock)
<unknown-2847  1d...   24µs : smp_reschedule_interrupt (86a8bd8 0 0)
ksoftirq-4     0....   24µs : _down_mutex (__spin_lock)
<unknown-2847  1d...   25µs < (0)
ksoftirq-4     0....   25µs : __down_mutex (__spin_lock)
ksoftirq-4     0....   25µs : __might_sleep (__down_mutex)
ksoftirq-4     0d...   25µs : _raw_spin_lock (__down_mutex)
ksoftirq-4     0d..1   25µs : _raw_spin_lock (__down_mutex)
ksoftirq-4     0d..2   26µs : _raw_spin_lock (__down_mutex)
ksoftirq-4     0d..3   26µs : set_new_owner (__down_mutex)
ksoftirq-4     0d..3   26µs : set_new_owner <ksoftirq-4> (0 0):
ksoftirq-4     0d..3   27µs : _raw_spin_unlock (__down_mutex)
ksoftirq-4     0d..2   27µs : _raw_spin_unlock (__down_mutex)
ksoftirq-4     0d..1   27µs : _raw_spin_unlock (__down_mutex)
... no more traces from CPU 1 ...
ksoftirq-4     0....   77µs : rcu_check_quiescent_state
(__rcu_process_callbacks)
ksoftirq-4     0....   77µs : cond_resched_all (___do_softirq)
ksoftirq-4     0....   77µs : cond_resched (___do_softirq)
ksoftirq-4     0....   78µs : cond_resched (ksoftirqd)
ksoftirq-4     0....   78µs : kthread_should_stop (ksoftirqd)
ksoftirq-4     0....   78µs : schedule (ksoftirqd)
ksoftirq-4     0....   78µs : __sched_text_start (schedule)
ksoftirq-4     0...1   79µs : sched_clock (__sched_text_start)
ksoftirq-4     0...1   79µs : _raw_spin_lock_irq (__sched_text_start)
ksoftirq-4     0...1   79µs : _raw_spin_lock_irqsave (__sched_text_start)
ksoftirq-4     0d..2   80µs : deactivate_task (__sched_text_start)
ksoftirq-4     0d..2   80µs : dequeue_task (deactivate_task)
ksoftirq-4     0d..2   81µs : trace_array (__sched_text_start)
ksoftirq-4     0d..2   82µs : trace_array <cpu_dela-3556> (45 46):
ksoftirq-4     0d..2   84µs+: trace_array (__sched_text_start)
cpu_dela-3556  0d..2   86µs : __switch_to (__sched_text_start)
cpu_dela-3556  0d..2   87µs : __sched_text_start <ksoftirq-4> (0 45):
cpu_dela-3556  0d..2   87µs : finish_task_switch (__sched_text_start)
cpu_dela-3556  0d..2   87µs : _raw_spin_unlock (finish_task_switch)
cpu_dela-3556  0d..1   88µs : trace_stop_sched_switched
(finish_task_switch)
cpu_dela-3556  0d...   89µs+< (0)
cpu_dela-3556  0d...   92µs : math_state_restore (device_not_available)
cpu_dela-3556  0d...   92µs : restore_fpu (math_state_restore)
cpu_dela-3556  0d...   93µs < (0)
cpu_dela-3556  0....   93µs > sys_gettimeofday (00000000 00000000 0000007b)
cpu_dela-3556  0....   93µs : sys_gettimeofday (sysenter_past_esp)
cpu_dela-3556  0....   94µs : user_trace_stop (sys_gettimeofday)
cpu_dela-3556  0...1   94µs : user_trace_stop (sys_gettimeofday)
cpu_dela-3556  0...1   95µs : _raw_spin_lock_irqsave (user_trace_stop)
cpu_dela-3556  0d..2   95µs : _raw_spin_unlock_irqrestore (user_trace_stop)

If I read this right, we tried to reschedule cpu_delay on CPU #1 (at
24 usec) but it never happened and cpu_delay was still "ready to run"
on CPU #0 70 usec later.

[4] I have a trace where cpu_delay was bumped off CPU #1 at 20 usec
while the X server (not RT) was the active process on CPU #0 for another
130 usec (several trace entries with preempt-depth == 0) until it finally
got bumped by IRQ 0.

[5] More of a cosmetic problem: several traces still show the
application name as "unknown" - even for long-lived processes like
ksoftirqd/0 and X.

Due to the file size, I will send the traces separately.
  --Mark


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-10 17:49 [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15 Mark_H_Johnson
@ 2004-12-10 21:09 ` Ingo Molnar
  2004-12-10 21:12 ` Ingo Molnar
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2004-12-10 21:09 UTC (permalink / raw)
  To: Mark_H_Johnson
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Esben Nielsen, Thomas Gleixner, Michal Schmidt


* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:

> [1] I still do not get traces where cpu_delay switches CPUs. I only
> get trace output if it starts and ends on a single CPU. [...]

lt001.18RT/lt.02 is such a trace. It starts on CPU#1:

 <unknown-3556  1...1    0µs : find_next_bit (user_trace_start)

and ends on CPU#0:

 <unknown-3556  1...1  247µs : _raw_spin_lock_irqsave (user_trace_stop)

the trace shows a typical migration of an RT task.

(but ... i have to say the debugging overhead is horrible. Please try a
completely-non-debug-non-tracing kernel just to see the difference.)

	Ingo


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-10 17:49 [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15 Mark_H_Johnson
  2004-12-10 21:09 ` Ingo Molnar
@ 2004-12-10 21:12 ` Ingo Molnar
  2004-12-10 21:24 ` Ingo Molnar
  2004-12-13  0:16 ` Fernando Lopez-Lezcano
  3 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2004-12-10 21:12 UTC (permalink / raw)
  To: Mark_H_Johnson
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Esben Nielsen, Thomas Gleixner, Michal Schmidt


* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:

> [3] Some traces show information on both CPUs and then a long period
> with no traces from the other. Here is an example...

> <unknown-2847  1d.h.    4µs : rebalance_tick (scheduler_tick)
> <unknown-2847  1d.h.    5µs : irq_exit (apic_timer_interrupt)
> <unknown-2847  1d...    5µs < (0)

> ... no more traces from CPU 1 ...

PID 2847 returned to userspace at timestamp 5µs. Userspace can then take
an arbitrary amount of time before it calls the kernel again.

	Ingo


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-10 17:49 [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15 Mark_H_Johnson
  2004-12-10 21:09 ` Ingo Molnar
  2004-12-10 21:12 ` Ingo Molnar
@ 2004-12-10 21:24 ` Ingo Molnar
  2004-12-13  0:16 ` Fernando Lopez-Lezcano
  3 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2004-12-10 21:24 UTC (permalink / raw)
  To: Mark_H_Johnson
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Fernando Pablo Lopez-Lezcano, Lee Revell, Rui Nuno Capela,
	Shane Shrybman, Esben Nielsen, Thomas Gleixner, Michal Schmidt


* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:

> [...] I also had several cases where I "triggered" a trace but no
> output - I assume those are related symptoms. For example:
> 
> # ./cpu_delay 0.000100
> Delay limit set to 0.00010000 seconds
> calibrating loop ....
> time diff= 0.504598 or 396354830 loops/sec.
> Trace activated with 0.000100 second delay.
> Trace triggered with 0.000102 second delay. [not recorded]
> Trace activated with 0.000100 second delay.
> Trace triggered with 0.000164 second delay. [not recorded]

is the userspace delay measurement nested inside the kernel-based
method? I.e. is it something like:

	gettimeofday(0,1);
	timestamp1 = cycles();

	... loop some ...

	timestamp2 = cycles();
	gettimeofday(0,0);

and do you get 'unreported' latencies in such a case too? If yes then
that would indeed indicate a tracer bug. But if the measurement is done
like this:

	gettimeofday(0,1);
	timestamp1 = cycles();

	... loop some ...

	gettimeofday(0,0);		// [1]
	timestamp2 = cycles();		// [2]

then a delay could get in between [1] and [2].

OTOH if the 'loop some' time is long enough then the [1]-[2] window is
too small to be statistically significant, while your logs show a near
50% 'miss rate'.

	Ingo


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-10 17:49 [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15 Mark_H_Johnson
                   ` (2 preceding siblings ...)
  2004-12-10 21:24 ` Ingo Molnar
@ 2004-12-13  0:16 ` Fernando Lopez-Lezcano
  2004-12-13  6:47   ` Ingo Molnar
  3 siblings, 1 reply; 30+ messages in thread
From: Fernando Lopez-Lezcano @ 2004-12-13  0:16 UTC (permalink / raw)
  To: Ingo Molnar, Mark_H_Johnson
  Cc: Amit Shah, Karsten Wiese, Bill Huey, Adam Heath, emann,
	Gunther Persoons, K.R. Foley, linux-kernel, Florian Schmidt,
	Lee Revell, Rui Nuno Capela, Shane Shrybman, Esben Nielsen,
	Thomas Gleixner, Michal Schmidt

On Fri, 2004-12-10 at 09:49, Mark_H_Johnson@raytheon.com wrote:
> >The -32-15 kernel can be downloaded from the
> >usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> 
> By the time I started today, the directory had -18 so that is what I built
> with. Here are some initial results from running cpu_delay (the simple
> RT test / user tracing) on a -18 PREEMPT_RT kernel. Have not started
> any of the stress tests yet.

Something that just happened to me: running 0.7.32-14 (PREEMPT_DESKTOP)
and trying to install 0.7.32-19 from a custom built rpm package
completely hangs the machine (p4 laptop - I tried twice). No clues left
behind. If I boot into 0.7.32-9 I can install 0.7.32-19 with no
problems. 

-- Fernando




* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-13  0:16 ` Fernando Lopez-Lezcano
@ 2004-12-13  6:47   ` Ingo Molnar
  2004-12-14  0:46     ` Fernando Lopez-Lezcano
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2004-12-13  6:47 UTC (permalink / raw)
  To: Fernando Lopez-Lezcano
  Cc: Mark_H_Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, K.R. Foley, linux-kernel,
	Florian Schmidt, Lee Revell, Rui Nuno Capela, Shane Shrybman,
	Esben Nielsen, Thomas Gleixner, Michal Schmidt


* Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> wrote:

> Something that just happened to me: running 0.7.32-14
> (PREEMPT_DESKTOP) and trying to install 0.7.32-19 from a custom built
> rpm package completely hangs the machine (p4 laptop - I tried twice).
> No clues left behind. If I boot into 0.7.32-9 I can install 0.7.32-19
> with no problems. 

does 0.7.32-19 work better if you reverse (patch -R) the loop.h and
loop.c bits (see below)?

	Ingo

--- linux/drivers/block/loop.c.orig
+++ linux/drivers/block/loop.c
@@ -378,7 +378,7 @@ static void loop_add_bio(struct loop_dev
 		lo->lo_bio = lo->lo_biotail = bio;
 	spin_unlock_irqrestore(&lo->lo_lock, flags);
 
-	up(&lo->lo_bh_mutex);
+	complete(&lo->lo_bh_done);
 }
 
 /*
@@ -427,7 +427,7 @@ static int loop_make_request(request_que
 	return 0;
 err:
 	if (atomic_dec_and_test(&lo->lo_pending))
-		up(&lo->lo_bh_mutex);
+		complete(&lo->lo_bh_done);
 out:
 	bio_io_error(old_bio, old_bio->bi_size);
 	return 0;
@@ -495,12 +495,12 @@ static int loop_thread(void *data)
 	/*
 	 * up sem, we are running
 	 */
-	up(&lo->lo_sem);
+	complete(&lo->lo_done);
 
 	for (;;) {
-		down_interruptible(&lo->lo_bh_mutex);
+		wait_for_completion_interruptible(&lo->lo_bh_done);
 		/*
-		 * could be upped because of tear-down, not because of
+		 * could be completed because of tear-down, not because of
 		 * pending work
 		 */
 		if (!atomic_read(&lo->lo_pending))
@@ -521,7 +521,7 @@ static int loop_thread(void *data)
 			break;
 	}
 
-	up(&lo->lo_sem);
+	complete(&lo->lo_done);
 	return 0;
 }
 
@@ -708,7 +708,7 @@ static int loop_set_fd(struct loop_devic
 	set_blocksize(bdev, lo_blocksize);
 
 	kernel_thread(loop_thread, lo, CLONE_KERNEL);
-	down(&lo->lo_sem);
+	wait_for_completion(&lo->lo_done);
 	return 0;
 
  out_putf:
@@ -773,10 +773,10 @@ static int loop_clr_fd(struct loop_devic
 	spin_lock_irq(&lo->lo_lock);
 	lo->lo_state = Lo_rundown;
 	if (atomic_dec_and_test(&lo->lo_pending))
-		up(&lo->lo_bh_mutex);
+		complete(&lo->lo_bh_done);
 	spin_unlock_irq(&lo->lo_lock);
 
-	down(&lo->lo_sem);
+	wait_for_completion(&lo->lo_done);
 
 	lo->lo_backing_file = NULL;
 
@@ -1153,8 +1153,8 @@ int __init loop_init(void)
 		if (!lo->lo_queue)
 			goto out_mem4;
 		init_MUTEX(&lo->lo_ctl_mutex);
-		init_MUTEX_LOCKED(&lo->lo_sem);
-		init_MUTEX_LOCKED(&lo->lo_bh_mutex);
+		init_completion(&lo->lo_done);
+		init_completion(&lo->lo_bh_done);
 		lo->lo_number = i;
 		spin_lock_init(&lo->lo_lock);
 		disk->major = LOOP_MAJOR;
--- linux/include/linux/loop.h.orig
+++ linux/include/linux/loop.h
@@ -58,9 +58,9 @@ struct loop_device {
 	struct bio 		*lo_bio;
 	struct bio		*lo_biotail;
 	int			lo_state;
-	struct semaphore	lo_sem;
+	struct completion	lo_done;
+	struct completion	lo_bh_done;
 	struct semaphore	lo_ctl_mutex;
-	struct semaphore	lo_bh_mutex;
 	atomic_t		lo_pending;
 
 	request_queue_t		*lo_queue;


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-13  6:47   ` Ingo Molnar
@ 2004-12-14  0:46     ` Fernando Lopez-Lezcano
  2004-12-14  4:42       ` K.R. Foley
  0 siblings, 1 reply; 30+ messages in thread
From: Fernando Lopez-Lezcano @ 2004-12-14  0:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark_H_Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, K.R. Foley, linux-kernel,
	Florian Schmidt, Lee Revell, Rui Nuno Capela, Shane Shrybman,
	Esben Nielsen, Thomas Gleixner, Michal Schmidt

On Sun, 2004-12-12 at 22:47, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> wrote:
> 
> > Something that just happened to me: running 0.7.32-14
> > (PREEMPT_DESKTOP) and trying to install 0.7.32-19 from a custom built
> > rpm package completely hangs the machine (p4 laptop - I tried twice).
> > No clues left behind. If I boot into 0.7.32-9 I can install 0.7.32-19
> > with no problems. 
> 
> does 0.7.32-19 work better if you reverse (patch -R) the loop.h and
> loop.c bits (see below)?

Running 0.7.32-19 (no changes) I managed to install 0.7.32-20 with no
problems... probably a problem in -14 that was somehow fixed in later
releases. 

-- Fernando




* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-14  0:46     ` Fernando Lopez-Lezcano
@ 2004-12-14  4:42       ` K.R. Foley
  2004-12-14  8:47         ` Rui Nuno Capela
  0 siblings, 1 reply; 30+ messages in thread
From: K.R. Foley @ 2004-12-14  4:42 UTC (permalink / raw)
  To: Fernando Lopez-Lezcano
  Cc: Ingo Molnar, Mark_H_Johnson, Amit Shah, Karsten Wiese, Bill Huey,
	Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Rui Nuno Capela, Shane Shrybman,
	Esben Nielsen, Thomas Gleixner, Michal Schmidt

Fernando Lopez-Lezcano wrote:
> On Sun, 2004-12-12 at 22:47, Ingo Molnar wrote:
> 
>>* Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> wrote:
>>
>>
>>>Something that just happened to me: running 0.7.32-14
>>>(PREEMPT_DESKTOP) and trying to install 0.7.32-19 from a custom built
>>>rpm package completely hangs the machine (p4 laptop - I tried twice).
>>>No clues left behind. If I boot into 0.7.32-9 I can install 0.7.32-19
>>>with no problems. 
>>
>>does 0.7.32-19 work better if you reverse (patch -R) the loop.h and
>>loop.c bits (see below)?
> 
> 
> Running 0.7.32-19 (no changes) I managed to install 0.7.32-20 with no
> problems... probably a problem in -14 that was somehow fixed in later
> releases. 
> 
> -- Fernando

Possibly. I have had the occasional problem with running make install 
locking one of my systems. Rebooting and running make install again 
works fine in my case. It is by no means a regular occurrence, even 
installing 2 or 3 new kernels daily on 3 different machines.

kr


* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-14  4:42       ` K.R. Foley
@ 2004-12-14  8:47         ` Rui Nuno Capela
  2004-12-14 11:35           ` Ingo Molnar
  0 siblings, 1 reply; 30+ messages in thread
From: Rui Nuno Capela @ 2004-12-14  8:47 UTC (permalink / raw)
  To: K.R. Foley
  Cc: Fernando Lopez-Lezcano, Ingo Molnar, mark_h_johnson, Amit Shah,
	Karsten Wiese, Bill Huey, Adam Heath, emann, Gunther Persoons,
	linux-kernel, Florian Schmidt, Lee Revell, Shane Shrybman,
	Esben Nielsen, Thomas Gleixner, Michal Schmidt

K.R. Foley wrote:
> Fernando Lopez-Lezcano wrote:
>> On Sun, 2004-12-12 at 22:47, Ingo Molnar wrote:
>>
>>>* Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> wrote:
>>>
>>>
>>>>Something that just happened to me: running 0.7.32-14
>>>>(PREEMPT_DESKTOP) and trying to install 0.7.32-19 from a custom built
>>>>rpm package completely hangs the machine (p4 laptop - I tried twice).
>>>>No clues left behind. If I boot into 0.7.32-9 I can install 0.7.32-19
>>>>with no problems.
>>>
>>>does 0.7.32-19 work better if you reverse (patch -R) the loop.h and
>>>loop.c bits (see below)?
>>
>>
>> Running 0.7.32-19 (no changes) I managed to install 0.7.32-20 with no
>> problems... probably a problem in -14 that was somehow fixed in later
>> releases.
>>
>> -- Fernando
>
> Possibly. I have had the occasional problem with running make install
> locking one of my systems. Rebooting and running make install again
> works fine in my case. It is by no means a regular occurrence, even
> installing 2 or 3 new kernels daily on 3 different machines.
>

Isn't this tightly related to mkinitrd sometimes hanging on mount -o
loop, which I've reported a couple of times before? It used to hang
every other time I did a new kernel install, but lately it seems to be OK
(RT-V0.9.32-19 and -20).
-- 
rncbc aka Rui Nuno Capela
rncbc@rncbc.org



* Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15
  2004-12-14  8:47         ` Rui Nuno Capela
@ 2004-12-14 11:35           ` Ingo Molnar
  2004-12-27 14:35             ` Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15) Esben Nielsen
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2004-12-14 11:35 UTC (permalink / raw)
  To: Rui Nuno Capela
  Cc: K.R. Foley, Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah,
	Karsten Wiese, Bill Huey, Adam Heath, emann, Gunther Persoons,
	linux-kernel, Florian Schmidt, Lee Revell, Shane Shrybman,
	Esben Nielsen, Thomas Gleixner, Michal Schmidt


* Rui Nuno Capela <rncbc@rncbc.org> wrote:

> Isn't this tightly related to mkinitrd sometimes hanging while on
> mount -o loop, that I've been reporting a couple of times before? It
> used to hang on any other time I do a new kernel install, but lately
> it seems to be OK (RT-V0.9.32-19 and -20).

yeah, i've added Thomas Gleixner's earlier semaphore->completion
conversion to the loop device, to -19 or -18.

	Ingo


* Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-14 11:35           ` Ingo Molnar
@ 2004-12-27 14:35             ` Esben Nielsen
  2004-12-27 15:27               ` Steven Rostedt
  2005-01-28  7:38               ` Ingo Molnar
  0 siblings, 2 replies; 30+ messages in thread
From: Esben Nielsen @ 2004-12-27 14:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	mark_h_johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, linux-kernel, Florian Schmidt,
	Lee Revell, Shane Shrybman, Thomas Gleixner, Michal Schmidt

I haven't seen much traffic on real-time preemption lately. Is it due
to Christmas or lost interest?

I noticed that you changed rw-locks to behave quite differently under
real-time preemption: they basically work like normal locks now. I.e. there
can only be one reader task within each region. That task can however lock
the region recursively. I wanted to start looking at fixing that because
it ought to hurt scalability quite a bit - and even on UP create a few
unneeded task-switches. However, the more I think about it, the bigger the
problem looks:

First, let me describe how I see a read-write lock. It has 3 states:
a) unlocked
b) locked by n readers
c) locked by 1 writer
There can either be 1 writer within the protected region or n>=0
readers within the region. When a writer wants to take the lock,
calling down_write(), it has to wait until the read count is 0. When a
reader wants to take the lock, calling down_read(), it only has to wait
until the writer is done - there is no need to wait for the other
readers.

Now in a real-time system down_X() ought to have a deterministic
blocking time. It should be easy to make down_read() deterministic: if
there is a writer, let it inherit the calling reader's priority.
However, down_write() is hard to make deterministic. Even if we assume
that the lock not only keeps track of the number of readers but keeps a
list of all the reader threads within the region, it can traverse the list
and boost the priority of all those threads. If there are n readers when
down_write() is called, the blocking time would be
 O(ceil(n/#cpus))
- which is unbounded, as n is not known!

Having a rw-lock with a deterministic down_read() but non-deterministic
down_write() would be very useful in a lot of cases. The characteristic is
that the protected data structure is relatively static, is going
to be used by a lot of RT readers, and the updates don't have to be done
with any real-time requirements.
However, there is no way to know in general which locks in the kernel can
be allowed to work like that and which can't. A good compromise would be to
limit the number of readers in a lock to the number of CPUs on the
system. That would make the system scale over several CPUs without hitting
unneeded contention on read-locks and still have a deterministic
down_write().

down_write() shall then do the following: boost the priority of all the
active readers to the priority of the caller. This will in turn distribute
the readers over the CPUs of the system, assuming no higher priority RT
tasks are running. All the reader tasks will then run to up_read() in
O(1) time, as they can all run in parallel - assuming there is no ugly
nested locking, of course!
down_read() should first check if there is a writer. If there is,
boost it and wait. If there isn't, but there is no room for another reader,
boost one of the readers so it will run to up_read().

An extra bonus of having the number of readers bounded: the various
structures needed for making the list of readers can be allocated once.
There is no need to call kmalloc() from within down_read() to get a list
element for the lock's list of readers.

I don't know whether I will have time for coding this soon. In any
case, I do not have an SMP system, so I can't really test it even if I
get time to code it :-(

Esben



On Tue, 14 Dec 2004, Ingo Molnar wrote:

> 
> * Rui Nuno Capela <rncbc@rncbc.org> wrote:
> 
> > Isn't this tightly related to mkinitrd sometimes hanging while on
> > mount -o loop, that I've been reporting a couple of times before? It
> > used to hang on any other time I do a new kernel install, but latetly
> > it seems to be OK (RT-V0.9.32-19 and -20).
> 
> yeah, i've added Thomas Gleixner's earlier semaphore->completion
> conversion to the loop device, to -19 or -18.
> 
> 	Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 14:35             ` Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15) Esben Nielsen
@ 2004-12-27 15:27               ` Steven Rostedt
  2004-12-27 16:23                 ` Esben Nielsen
  2005-01-28  7:38               ` Ingo Molnar
  1 sibling, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2004-12-27 15:27 UTC (permalink / raw)
  To: Esben Nielsen
  Cc: Ingo Molnar, Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	Mark Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, LKML, Florian Schmidt, Lee Revell,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Mon, 2004-12-27 at 15:35 +0100, Esben Nielsen wrote:
> I haven't seen much traffic on real-time preemption lately. Is it due
> to Christmas or lost interest?
> 

I think they are on vacation :-)

> I noticed that you changed rw-locks to behave quite differently under
> real-time preemption: they basically work like normal locks now. I.e. there
> can only be one reader task within each region. It can, however, lock
> the region recursively. I wanted to start looking at fixing that, because
> it ought to hurt scalability quite a bit - and even on UP create a few
> unneeded task switches. However, the more I think about it, the bigger the
> problem:
> 
> First, let me describe how I see a read-write lock. It has 3 states:
> a) unlocked
> b) locked by n readers
> c) locked by 1 writer
> There can either be 1 writer within the protected region or n>=0
> readers within the region. When a writer wants to take the lock,
> calling down_write(), it has to wait until the read count is 0. When a
> reader wants to take the lock, calling down_read(), it only has to wait
> until the writer is done - there is no need to wait for the other
> readers.
> 
> Now in a real-time system down_X() ought to have a deterministic
> blocking time. It should be easy to make down_read() deterministic: If
> there is a writer let it inherit the calling reader's priority.
> However, down_write() is hard to make deterministic. Even if we assume
> that the lock not only keeps track of the number of readers but keeps a
> list of all the reader threads within the region, so that it can traverse
> the list and boost the priority of all those threads, the blocking time
> with n readers at the moment down_write() is called would still be
>  O(ceil(n/#cpus))
> - which is unbounded, as n is not known!
> 
> Having a rw-lock with a deterministic down_read() but a non-deterministic
> down_write() would be very useful in a lot of cases. The characteristic is
> that the data structure being protected is relatively static, is going
> to be used by a lot of RT readers, and the updates don't have to be done
> under any real-time requirements.
> However, there is no way to know in general which locks in the kernel can
> be allowed to work like that and which can't. A good compromise would be
> to limit the number of readers in a lock to the number of CPUs on the
> system. That would let the system scale over several CPUs without hitting
> unneeded congestion on read-locks and still have a deterministic
> down_write().
> 

Why limit it to just the number of CPUs? Make it a configurable limit
instead. I would say the default could be 2*CPUs. The reason is that once
you limit the number of readers, you have bounded down_write(). Even if
the number of readers allowed is 100, down_write() is now bound to
O(ceil(n/#cpus)) as you said, but now n is known. Add a
CONFIG_ALLOWED_READERS or something to that effect, and it would be easy
to see what a good optimal configuration is (assuming you have the
proper tests).

> down_write() shall then do the following: boost the priority of all the
> active readers to the priority of the caller. This will in turn distribute
> the readers over the CPUs of the system, assuming no higher-priority RT
> tasks are running. All the reader tasks will then run to up_read() in
> O(1) time, as they can all run in parallel - assuming there is no ugly
> nested locking, of course!
> down_read() should first check if there is a writer. If there is, boost
> it and wait. If there isn't, but there is no room for another reader,
> boost one of the readers such that it will run to up_read().
> 
> An extra bonus of having the number of readers bounded: the various
> structures needed for making the list of readers can be allocated once.
> There is no need to call kmalloc() from within down_read() to get a list
> element for the lock's list of readers.
> 
> I don't know whether I will have time for coding this soon. In any case
> I do not have an SMP system, so I can't really test it even if I do
> find time to code it :-(
> 

I have two SMP machines that I can test on; unfortunately, they both
have NVIDIA cards, so I can't use them with X unless I go back to the
default driver. Which I would do, but I really like the 3d graphics ;-)


-- Steve

> Esben
> 
> 
> 
> On Tue, 14 Dec 2004, Ingo Molnar wrote:
> 
> > 
> > * Rui Nuno Capela <rncbc@rncbc.org> wrote:
> > 
> > > Isn't this tightly related to mkinitrd sometimes hanging while on
> > > mount -o loop, that I've been reporting a couple of times before? It
> > > used to hang on any other time I do a new kernel install, but latetly
> > > it seems to be OK (RT-V0.9.32-19 and -20).
> > 
> > yeah, i've added Thomas Gleixner's earlier semaphore->completion
> > conversion to the loop device, to -19 or -18.
> > 
> > 	Ingo
-- 
Steven Rostedt
Senior Engineer
Kihon Technologies



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 15:27               ` Steven Rostedt
@ 2004-12-27 16:23                 ` Esben Nielsen
  2004-12-27 16:39                   ` Steven Rostedt
  2004-12-28 21:42                   ` Lee Revell
  0 siblings, 2 replies; 30+ messages in thread
From: Esben Nielsen @ 2004-12-27 16:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	Mark Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, LKML, Florian Schmidt, Lee Revell,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Mon, 27 Dec 2004, Steven Rostedt wrote:

> On Mon, 2004-12-27 at 15:35 +0100, Esben Nielsen wrote:
> > I haven't seen much traffic on real-time preemption lately. Is it due
> > to Christmas or lost interest?
> > 
> 
> I think they are on vacation :-)
>
Oh, I thought I would have fun with some kernel programming during my
vacation! :-)
 
> > [...]
> 
> Why limit it to just the number of CPUs? Make it a configurable limit
> instead. I would say the default could be 2*CPUs. The reason is that once
> you limit the number of readers, you have bounded down_write(). Even if
> the number of readers allowed is 100, down_write() is now bound to
> O(ceil(n/#cpus)) as you said, but now n is known. Add a
> CONFIG_ALLOWED_READERS or something to that effect, and it would be easy
> to see what a good optimal configuration is (assuming you have the
> proper tests).
> 
I agree with you that it ought to be configurable. But if you set it to
something like 100, it is _not_ deterministic in practice. I.e. during your
tests you have a really hard time getting 100 readers to the critical
point. Most likely you only have a handful. Then you deploy the system,
a webserver presenting data read under a rw-lock gets overloaded and
spawns 100 worker tasks. Bang! Your RT application doesn't run!
It doesn't help you to have something bounded if the bound is so insanely
high that you never reach it in tests. And if you try to calculate the
worst-case scenarios for such a system, you will find it doesn't
schedule. I.e. you have to buy a bigger CPU or do some other drastic
thing!

A limit like 1 reader per CPU, on the other hand, behaves like a
normal mutex with priority inheritance wrt. determinism. And it scales like
the stock Linux rw-lock, which in practice is also limited to 1 task per
CPU, as preemption is switched off :-)

> [...] 
> I have two SMP machines that I can test on, unfortunately, they both
> have NVIDIA cards, so I cant use them with X, unless I go back to the
> default driver. Which I would do, but I really like the 3d graphics ;-)
> 

Well, this kind of thing isn't something you want to combine with 3d
graphics right away anyway!
I have even had crashes on 2.4.xx with an NVidia card and
hyperthreading on a machine I helped install :-(
I am afraid NVidia cards plus preempt realtime is far in the future :-(

> 
> -- Steve
> 

Esben



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 16:23                 ` Esben Nielsen
@ 2004-12-27 16:39                   ` Steven Rostedt
  2004-12-27 21:06                     ` Bill Huey
  2004-12-28 21:42                   ` Lee Revell
  1 sibling, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2004-12-27 16:39 UTC (permalink / raw)
  To: Esben Nielsen
  Cc: Ingo Molnar, Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	Mark Johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, LKML, Florian Schmidt, Lee Revell,
	Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Mon, 2004-12-27 at 17:23 +0100, Esben Nielsen wrote:


> > > [...]
> > 
> > Why limit it to just the number of CPUs? Make it a configurable limit
> > instead. I would say the default could be 2*CPUs. The reason is that once
> > you limit the number of readers, you have bounded down_write(). Even if
> > the number of readers allowed is 100, down_write() is now bound to
> > O(ceil(n/#cpus)) as you said, but now n is known. Add a
> > CONFIG_ALLOWED_READERS or something to that effect, and it would be easy
> > to see what a good optimal configuration is (assuming you have the
> > proper tests).
> > 
> I agree with you that it ought to be configurable. But if you set it to
> something like 100, it is _not_ deterministic in practice. I.e. during your
> tests you have a really hard time getting 100 readers to the critical
> point. Most likely you only have a handful. Then you deploy the system,
> a webserver presenting data read under a rw-lock gets overloaded and
> spawns 100 worker tasks. Bang! Your RT application doesn't run!
> It doesn't help you to have something bounded if the bound is so insanely
> high that you never reach it in tests. And if you try to calculate the
> worst-case scenarios for such a system, you will find it doesn't
> schedule. I.e. you have to buy a bigger CPU or do some other drastic
> thing!
> 

I wasn't saying that 100 was a GOOD number; it was just an example. I'm
one of those who don't like the program, application, or kernel to limit
how you want to shoot yourself in the foot. But always have the default
set to something that prevents the idiot from doing something too
idiotic.

> A limit like 1 reader per CPU, on the other hand, behaves like a
> normal mutex with priority inheritance wrt. determinism. And it scales like
> the stock Linux rw-lock, which in practice is also limited to 1 task per
> CPU, as preemption is switched off :-)
> 

Do you think 2 per CPU is too high? I'd figure that reads usually happen
way more often than writes, and writes can usually last longer than reads,
but with the default set to two you can test your code on a UP machine.

> > [...] 
> > I have two SMP machines that I can test on; unfortunately, they both
> > have NVIDIA cards, so I can't use them with X unless I go back to the
> > default driver. Which I would do, but I really like the 3d graphics ;-)
> > 
> 
> Well, this kind of thing isn't something you want to combine with 3d
> graphics right away anyway!
> I have even had crashes on 2.4.xx with an NVidia card and
> hyperthreading on a machine I helped install :-(
> I am afraid NVidia cards plus preempt realtime is far in the future :-(
> 

Actually, I've had some success with NVIDIA on all my kernels (except of
course the realtime ones). Unfortunately, there are little crashes
here and there, but those usually happen with screen savers, so I don't
lose data. I just come back to the machine and say WTF, I'm at a login
prompt! I am one of those who save everything within seconds of
modifying it, and use subversion commits like crazy :-)

-- Steve



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 16:39                   ` Steven Rostedt
@ 2004-12-27 21:06                     ` Bill Huey
  2004-12-27 21:48                       ` Valdis.Kletnieks
                                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Bill Huey @ 2004-12-27 21:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Esben Nielsen, Ingo Molnar, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, Mark Johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, LKML,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt

On Mon, Dec 27, 2004 at 11:39:20AM -0500, Steven Rostedt wrote:
> Actually, I've had some success with NVIDIA on all my kernels (except of

Doesn't the NVidia driver use its own version of DRM/DRI?
If so, did you tell it to use the Linux kernel version of that
driver?

> course the realtime ones). Unfortunately, there are the little crashes
> here and there, but those usually happen with screen savers so I don't

I was just having a discussion about this last night with a friend
of mine, and I'm going to pose this question to you and others.

Is a real-time enabled kernel still relevant for high-performance
video, even with GPUs being as fast as they are these days?

The context that I'm working with is that I was told (I've been out of
gaming for a long time now) that GPUs are so fast these days that
shortage of frame rate isn't a problem any more. An RTOS would be
able to deliver data/instructions to the GPU within a much tighter
time period and could deliver better, more consistent frame rates.

Does this assertion still apply or not? Why? (for either answer)

bill



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 21:06                     ` Bill Huey
@ 2004-12-27 21:48                       ` Valdis.Kletnieks
  2004-12-28 21:59                       ` Lee Revell
  2005-01-04 15:25                       ` Andrew McGregor
  2 siblings, 0 replies; 30+ messages in thread
From: Valdis.Kletnieks @ 2004-12-27 21:48 UTC (permalink / raw)
  To: Bill Huey
  Cc: Steven Rostedt, Esben Nielsen, Ingo Molnar, Rui Nuno Capela,
	K.R. Foley, Fernando Lopez-Lezcano, Mark Johnson, Amit Shah,
	Karsten Wiese, Adam Heath, emann, Gunther Persoons, LKML,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt


On Mon, 27 Dec 2004 13:06:14 PST, Bill Huey said:

> Is a real-time enabled kernel still relevant for high-performance
> video, even with GPUs being as fast as they are these days?

More to the point - can an RT kernel help *last* year's model?  My
laptop only has a GeForce4 440 Go (which is closer to a GeForce2 in
reality) and a 1.6GHz Pentium 4.  So it isn't any problem at all
to find xscreensaver GL hacks that bring it to its knees.

Even the venerable 'glxgears' drops down to about 40 FPS in an 800x600
window.  I'm sure the average game has a *lot* more polygons in it than
glxgears does.  xscreensaver's 'sierpinski3d' drops down to 18 FPS when it
gets up to 16K polygons.

Linux has long been renowned for its ability to Get Stuff Done on much older
and less capable hardware than the stuff from Redmond requires.  Got an old
box that crawled under W2K and that Win/XP won't even install on?  Toss the
current RedHat or SuSE on it, and it goes...

Would be *really* nice if we could find similar tricks on the graphics side. ;)

> The context that I'm working with is that I was told (I've been out of
> gaming for a long time now) that GPUs are so fast these days that
> shortage of frame rate isn't a problem any more. An RTOS would be
> able to deliver data/instructions to the GPU within a much tighter
> time period and could deliver better, more consistent frame rates.
> 
> Does this assertion still apply or not? Why? (for either answer)

Shortage of frame rate is *always* a problem.  No matter how many
millions of polygons/sec you can push out the door, somebody will want
to do N+25% per second.  Ask yourself why SGI was *EVER* able to sell
a machine with more than one InfiniteReality graphics engine on it, and
then ask yourself what the people who were using 3 IR pipes 5-6 years
ago are looking to use for graphics *this* year.



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 16:23                 ` Esben Nielsen
  2004-12-27 16:39                   ` Steven Rostedt
@ 2004-12-28 21:42                   ` Lee Revell
  1 sibling, 0 replies; 30+ messages in thread
From: Lee Revell @ 2004-12-28 21:42 UTC (permalink / raw)
  To: Esben Nielsen
  Cc: Steven Rostedt, Ingo Molnar, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, Mark Johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, LKML,
	Florian Schmidt, Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Mon, 2004-12-27 at 17:23 +0100, Esben Nielsen wrote:
> I am afraid NVidia cards and preempt realtime is far in the future :-(

Not necessarily.  I am sure many of Nvidia's potential customers want to
do high-end simulation, and that requires an RT-capable OS and good
graphics hardware.  If many people end up using RT preempt for high-end
simulator applications, like several who have posted on this list, then
it seems like it would be a good fit.

Lee



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 21:06                     ` Bill Huey
  2004-12-27 21:48                       ` Valdis.Kletnieks
@ 2004-12-28 21:59                       ` Lee Revell
  2005-01-04 15:25                       ` Andrew McGregor
  2 siblings, 0 replies; 30+ messages in thread
From: Lee Revell @ 2004-12-28 21:59 UTC (permalink / raw)
  To: Bill Huey
  Cc: Steven Rostedt, Esben Nielsen, Ingo Molnar, Rui Nuno Capela,
	K.R. Foley, Fernando Lopez-Lezcano, Mark Johnson, Amit Shah,
	Karsten Wiese, Adam Heath, emann, Gunther Persoons, LKML,
	Florian Schmidt, Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Mon, 2004-12-27 at 13:06 -0800, Bill Huey wrote:
> I was just having a discussion about this last night with a friend
> of mine, and I'm going to pose this question to you and others.
> 
> Is a real-time enabled kernel still relevant for high-performance
> video, even with GPUs being as fast as they are these days?
>
> The context that I'm working with is that I was told (I've been out of
> gaming for a long time now) that GPUs are so fast these days that
> shortage of frame rate isn't a problem any more. An RTOS would be
> able to deliver data/instructions to the GPU within a much tighter
> time period and could deliver better, more consistent frame rates.
> 
> Does this assertion still apply or not? Why? (for either answer)

Yes, an RTOS certainly helps.  Otherwise you cannot guarantee a minimum
frame rate - if a long-running disk ISR fires then you are screwed,
because just as with low-latency audio you have a SCHED_FIFO userspace
process that is feeding data to the GPU at a constant rate.  It's a worse
problem with audio, because you absolutely cannot drop a frame or you
will hear it.  With video it's likely to be imperceptible.

Lee



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 21:06                     ` Bill Huey
  2004-12-27 21:48                       ` Valdis.Kletnieks
  2004-12-28 21:59                       ` Lee Revell
@ 2005-01-04 15:25                       ` Andrew McGregor
  2 siblings, 0 replies; 30+ messages in thread
From: Andrew McGregor @ 2005-01-04 15:25 UTC (permalink / raw)
  To: Bill Huey
  Cc: Rui Nuno Capela, LKML, Ingo Molnar, Gunther Persoons,
	Thomas Gleixner, Karsten Wiese, Michal Schmidt, Shane Shrybman,
	Mark Johnson, emann, Adam Heath, Steven Rostedt, K.R. Foley,
	Lee Revell, Esben Nielsen, Amit Shah, Fernando Lopez-Lezcano,
	Florian Schmidt


On 28/12/2004, at 10:06 AM, Bill Huey (hui) wrote:

> On Mon, Dec 27, 2004 at 11:39:20AM -0500, Steven Rostedt wrote:
>> Actually, I've had some success with NVIDIA on all my kernels (except 
>> of
>
> Doesn't the NVidia driver use its own version of DRM/DRI?
> If so, did you tell it to use the Linux kernel version of that
> driver?
>
>> course the realtime ones). Unfortunately, there are the little crashes
>> here and there, but those usually happen with screen savers so I don't
>
> I was just having a discussion about this last night with a friend
> of mine and I'm going to pose this question to you and others.
>
> Is a real-time enabled kernel still relevant for high-performance
> video, even with GPUs being as fast as they are these days?

It is if you want to do any audio at the same time, as you usually do.

Andrew



* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2004-12-27 14:35             ` Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15) Esben Nielsen
  2004-12-27 15:27               ` Steven Rostedt
@ 2005-01-28  7:38               ` Ingo Molnar
  2005-01-28 11:56                 ` William Lee Irwin III
                                   ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: Ingo Molnar @ 2005-01-28  7:38 UTC (permalink / raw)
  To: Esben Nielsen
  Cc: Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	mark_h_johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, linux-kernel, Florian Schmidt,
	Lee Revell, Shane Shrybman, Thomas Gleixner, Michal Schmidt


* Esben Nielsen <simlo@phys.au.dk> wrote:

> I noticed that you changed rw-locks to behave quite differently under
> real-time preemption: they basically work like normal locks now. I.e.
> there can only be one reader task within each region. It can, however,
> lock the region recursively. [...]

correct.

> [...] I wanted to start looking at fixing that, because it ought to
> hurt scalability quite a bit - and even on UP create a few unneeded
> task switches. [...]

no, it's not a big scalability problem. rwlocks are really a mistake -
if you want scalability and spinlocks/semaphores are not enough, then one
should either use per-CPU locks or lockless structures. rwlocks/rwsems
are very unlikely to help much.

> However, the more I think about it the bigger the problem:

yes, the complexity needed to make it perform in a deterministic manner
is why i introduced this (major!) simplification of locking. It turns out
that most of the time the actual use of rwlocks matches this simplified
'owner-recursive exclusive lock' semantics, so we are lucky.

look at what kind of worst-case scenarios there may already be with
multiple spinlocks (blocker.c). With rwlocks that just gets insane.

	Ingo


* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28  7:38               ` Ingo Molnar
@ 2005-01-28 11:56                 ` William Lee Irwin III
  2005-01-28 15:28                   ` Ingo Molnar
  2005-01-28 19:18                 ` Trond Myklebust
  2005-01-30 22:03                 ` Esben Nielsen
  2 siblings, 1 reply; 30+ messages in thread
From: William Lee Irwin III @ 2005-01-28 11:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt

* Esben Nielsen <simlo@phys.au.dk> wrote:
>> [...] I wanted to start looking at fixing that, because it ought to
>> hurt scalability quite a bit - and even on UP create a few unneeded
>> task switches. [...]

On Fri, Jan 28, 2005 at 08:38:56AM +0100, Ingo Molnar wrote:
> no, it's not a big scalability problem. rwlocks are really a mistake -
> if you want scalability and spinlocks/semaphores are not enough, then one
> should either use per-CPU locks or lockless structures. rwlocks/rwsems
> are very unlikely to help much.

I wouldn't be so sure about that. SGI is already implicitly relying on
the parallel holding of rwsems for the lockless pagefaulting, and
Oracle has been pushing on mapping->tree_lock becoming an rwlock for a
while, both for large performance gains.


* Esben Nielsen <simlo@phys.au.dk> wrote:
>> However, the more I think about it the bigger the problem:

On Fri, Jan 28, 2005 at 08:38:56AM +0100, Ingo Molnar wrote:
> yes, the complexity needed to make it perform in a deterministic manner
> is why i introduced this (major!) simplification of locking. It turns out
> that most of the time the actual use of rwlocks matches this simplified
> 'owner-recursive exclusive lock' semantics, so we are lucky.
> look at what kind of worst-case scenarios there may already be with
> multiple spinlocks (blocker.c). With rwlocks that just gets insane.

tasklist_lock is one large exception; it's meant for concurrency there,
and it even gets sufficient concurrency to starve the write side.

Try test_remap.c on mainline vs. -mm to get a microbenchmark-level
notion of the importance of mapping->tree_lock being an rwlock (IIRC
you were cc:'d in at least some of those threads).

net/ has numerous rwlocks, which appear to frequently be associated
with hashtables, and at least some have some relevance to performance.

Are you suggesting that lockless alternatives to mapping->tree_lock,
mm->mmap_sem, and tasklist_lock should be pursued now?


-- wli


* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 11:56                 ` William Lee Irwin III
@ 2005-01-28 15:28                   ` Ingo Molnar
  2005-01-28 15:55                     ` William Lee Irwin III
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2005-01-28 15:28 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt


* William Lee Irwin III <wli@holomorphy.com> wrote:

> On Fri, Jan 28, 2005 at 08:38:56AM +0100, Ingo Molnar wrote:
> > no, it's not a big scalability problem. rwlocks are really a mistake -
> > if you want scalability and spinlocks/semaphores are not enough, then one
> > should either use per-CPU locks or lockless structures. rwlocks/rwsems
> > are very unlikely to help much.
> 
> I wouldn't be so sure about that. SGI is already implicitly relying on
> the parallel holding of rwsems for the lockless pagefaulting, and
> Oracle has been pushing on mapping->tree_lock becoming an rwlock for a
> while, both for large performance gains.

i don't really buy it. Any rwlock type of locking causes global cacheline
bounces. It can make a positive scalability difference only if the
read-lock hold time is large, at which point RCU would likely have
significantly higher performance. There _may_ be an intermediate locking
pattern, long-held but with a higher mix of write-locking, where
rwlocks/rwsems have a performance advantage over RCU or
spinlocks.

Also this is about PREEMPT_RT, mainly aimed towards embedded systems,
and at most aimed towards small (dual-CPU) SMP systems, not the really
big systems.

But the main argument wrt. PREEMPT_RT stands and is independent of any
scalability properties: rwlocks/rwsems have such bad determinism
that they are almost impossible to implement in a sane way.

	Ingo


* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 15:28                   ` Ingo Molnar
@ 2005-01-28 15:55                     ` William Lee Irwin III
  2005-01-28 16:16                       ` Ingo Molnar
  0 siblings, 1 reply; 30+ messages in thread
From: William Lee Irwin III @ 2005-01-28 15:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt

* William Lee Irwin III <wli@holomorphy.com> wrote:
>> I wouldn't be so sure about that. SGI is already implicitly relying on
>> the parallel holding of rwsems for the lockless pagefaulting, and
>> Oracle has been pushing on mapping->tree_lock becoming an rwlock for a
>> while, both for large performance gains.

On Fri, Jan 28, 2005 at 04:28:02PM +0100, Ingo Molnar wrote:
> i don't really buy it. Any rwlock type of locking causes global cacheline
> bounces. It can make a positive scalability difference only if the
> read-lock hold time is large, at which point RCU would likely have
> significantly higher performance. There _may_ be an intermediate locking
> pattern, long-held but with a higher mix of write-locking, where
> rwlocks/rwsems have a performance advantage over RCU or
> spinlocks.

The performance relative to mutual exclusion is quantifiable and very
reproducible. These results have left people making arguments similar to
yours above baffled. Systems as small as 4 logical CPUs feel these
effects strongly, and the effect appears to scale almost linearly with the
number of CPUs. It may be worth consulting an x86 processor architect
or similar to get an idea of why the counterargument fails. I'm rather
interested in hearing why as well, as I believed the cacheline-bounce
argument until presented with incontrovertible evidence to the contrary.

As far as performance relative to RCU goes, I suspect cases where
write-side latency is important will arise for these. Other lockless
methods are probably more appropriate, and are more likely to dominate
rwlocks as expected. For instance, a reimplementation of the radix
trees for lockless insertion and traversal (c.f. lockless pagetable
patches for examples of how that's carried out) is plausible, where RCU
memory overhead in struct page is not.


On Fri, Jan 28, 2005 at 04:28:02PM +0100, Ingo Molnar wrote:
> Also this is about PREEMPT_RT, mainly aimed towards embedded systems,
> and at most aimed towards small (dual-CPU) SMP systems, not the really
> big systems.
> But, the main argument wrt. PREEMPT_RT stands and is independent of any
> scalability properties: rwlocks/rwsems have such bad deterministic
> behavior that they are almost impossible to implement in a sane way.

I suppose if it's not headed toward mainline the counterexamples don't
really matter. I don't have much to say about the RT-related issues,
though I'm aware of priority inheritance's infeasibility for the
rwlock/rwsem case.


-- wli

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 15:55                     ` William Lee Irwin III
@ 2005-01-28 16:16                       ` Ingo Molnar
  0 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2005-01-28 16:16 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt


* William Lee Irwin III <wli@holomorphy.com> wrote:

> The performance relative to mutual exclusion is quantifiable and very
> reproducible. [...]

yes, i don't doubt the results - my point is that it's not proven that
the other, more read-friendly types of locking underperform rwlocks. 
Obviously spinlocks and rwlocks have the same cache-bounce properties,
so rwlocks can outperform spinlocks if the read path overhead is higher
than that of a bounce, and reads are dominant. But it's still a poor
form of scalability. In fact, when the read path is really expensive
(larger than say 10-20 usecs) an rwlock can produce the appearance of
linear scalability, when compared to spinlocks.

> As far as performance relative to RCU goes, I suspect cases where
> write-side latency is important will arise for these. Other lockless
> methods are probably more appropriate, and are more likely to dominate
> rwlocks as expected. For instance, a reimplementation of the radix
> trees for lockless insertion and traversal (c.f. lockless pagetable
> patches for examples of how that's carried out) is plausible, where
> RCU memory overhead in struct page is not.

yeah.

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28  7:38               ` Ingo Molnar
  2005-01-28 11:56                 ` William Lee Irwin III
@ 2005-01-28 19:18                 ` Trond Myklebust
  2005-01-28 19:45                   ` Ingo Molnar
  2005-01-28 21:13                   ` Lee Revell
  2005-01-30 22:03                 ` Esben Nielsen
  2 siblings, 2 replies; 30+ messages in thread
From: Trond Myklebust @ 2005-01-28 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt

On Fri, 28.01.2005 at 08:38 (+0100), Ingo Molnar wrote:

> no, it's not a big scalability problem. rwlocks are really a mistake -
> if you want scalability and spinlocks/semaphores are not enough then one
> should either use per-CPU locks or lockless structures. rwlocks/rwsems
are very unlikely to help much.

If you do have a highest interrupt case that causes all activity to
block, then rwsems may indeed fit the bill.

In the NFS client code we may use rwsems in order to protect stateful
operations against the (very infrequently used) server reboot recovery
code. The point is that when the server reboots, the server forces us to
block *all* requests that involve adding new state (e.g. opening an
NFSv4 file, or setting up a lock) while our client and others are
re-establishing their existing state on the server.

IOW: If you are planning on converting rwsems into a semaphore, you will
screw us over most royally, by converting the currently highly
infrequent scenario of a single task being able to access the server
into the common case.

Cheers,
  Trond
-- 
Trond Myklebust <trond.myklebust@fys.uio.no>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 19:18                 ` Trond Myklebust
@ 2005-01-28 19:45                   ` Ingo Molnar
  2005-01-28 23:25                     ` Bill Huey
  2005-01-28 21:13                   ` Lee Revell
  1 sibling, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2005-01-28 19:45 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt


* Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> If you do have a highest interrupt case that causes all activity to
> block, then rwsems may indeed fit the bill.
> 
> In the NFS client code we may use rwsems in order to protect stateful
> operations against the (very infrequently used) server reboot recovery
> code. The point is that when the server reboots, the server forces us
> to block *all* requests that involve adding new state (e.g. opening an
> NFSv4 file, or setting up a lock) while our client and others are
> re-establishing their existing state on the server.

it seems the most scalable solution for this would be a global flag plus
per-CPU spinlocks (or per-CPU mutexes), which would still support the
requirements of this rare event. An rwsem really bounces around on SMP,
and it seems very unnecessary in the case you described.

possibly this could be formalised as an rwlock/rwsem implementation
that scales better. brlocks were such an attempt.

> IOW: If you are planning on converting rwsems into a semaphore, you
> will screw us over most royally, by converting the currently highly
> infrequent scenario of a single task being able to access the server
> into the common case.

nono, i have no such plans.

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 19:18                 ` Trond Myklebust
  2005-01-28 19:45                   ` Ingo Molnar
@ 2005-01-28 21:13                   ` Lee Revell
  1 sibling, 0 replies; 30+ messages in thread
From: Lee Revell @ 2005-01-28 21:13 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ingo Molnar, Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Fri, 2005-01-28 at 11:18 -0800, Trond Myklebust wrote:
> In the NFS client code we may use rwsems in order to protect stateful
> operations against the (very infrequently used) server reboot recovery
> code. The point is that when the server reboots, the server forces us to
> block *all* requests that involve adding new state (e.g. opening an
> NFSv4 file, or setting up a lock) while our client and others are
> re-establishing their existing state on the server.

Hmm, when I was an ISP sysadmin I used to use this all the time.  NFS
mounts from the BSD/OS clients would start to act up under heavy web
server load and the cleanest way to get them to recover was to simulate
a reboot on the NetApp.  Of course Linux clients were unaffected, they
were just along for the ride ;-)

Lee


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28 19:45                   ` Ingo Molnar
@ 2005-01-28 23:25                     ` Bill Huey
  0 siblings, 0 replies; 30+ messages in thread
From: Bill Huey @ 2005-01-28 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Trond Myklebust, Esben Nielsen, Rui Nuno Capela, K.R. Foley,
	Fernando Lopez-Lezcano, mark_h_johnson, Amit Shah, Karsten Wiese,
	Bill Huey, Adam Heath, emann, Gunther Persoons, linux-kernel,
	Florian Schmidt, Lee Revell, Shane Shrybman, Thomas Gleixner,
	Michal Schmidt, Igor Manyilov (auriga)

On Fri, Jan 28, 2005 at 08:45:46PM +0100, Ingo Molnar wrote:
> * Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > If you do have a highest interrupt case that causes all activity to
> > block, then rwsems may indeed fit the bill.
> > 
> > In the NFS client code we may use rwsems in order to protect stateful
> > operations against the (very infrequently used) server reboot recovery
> > code. The point is that when the server reboots, the server forces us
> > to block *all* requests that involve adding new state (e.g. opening an
> > NFSv4 file, or setting up a lock) while our client and others are
> > re-establishing their existing state on the server.
> 
> it seems the most scalable solution for this would be a global flag plus
> per-CPU spinlocks (or per-CPU mutexes), which would still support the
> requirements of this rare event. An rwsem really bounces around on SMP,
> and it seems very unnecessary in the case you described.
> 
> possibly this could be formalised as an rwlock/rwsem implementation
> that scales better. brlocks were such an attempt.

From how I understand it, you'll have to have a global structure to
denote an exclusive operation and then take some additional cpumask_t
representing the spinlock set and use it to iterate over when doing a
PI chain operation.

Locking each individual parametric typed spinlock might require a
raw_spinlock to manipulate list structures, which, added up, is rather
heavyweight.

Not only that, you'd have to introduce a notion of it being counted,
since it could also be acquired/preempted by another higher priority
thread on that same processor.  Not having this semantic would make the
thread in that specific circumstance effectively non-preemptible (PI
scheduler indeterminacy), where the multiple-readers portion of a real
read/write (shared-exclusive) lock would have permitted this.

	http://people.lynuxworks.com/~bhuey/rt-share-exclusive-lock/rtsem.tgz.1208

The above is our attempt at getting real shared-exclusive lock semantics
in a blocking lock; it may still be incomplete and buggy. Igor is still
working on this and this is the latest that I have of his work. Getting
comments on this approach would be a good thing, as I/we (me/Igor)
believed from the start that this approach is correct.

Assuming that this is possible with the current approach, optimizing
it to avoid CPU ping-ponging is an important next step.

bill


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-28  7:38               ` Ingo Molnar
  2005-01-28 11:56                 ` William Lee Irwin III
  2005-01-28 19:18                 ` Trond Myklebust
@ 2005-01-30 22:03                 ` Esben Nielsen
  2005-01-30 23:59                   ` Kyle Moffett
  2 siblings, 1 reply; 30+ messages in thread
From: Esben Nielsen @ 2005-01-30 22:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rui Nuno Capela, K.R. Foley, Fernando Lopez-Lezcano,
	mark_h_johnson, Amit Shah, Karsten Wiese, Bill Huey, Adam Heath,
	emann, Gunther Persoons, linux-kernel, Florian Schmidt,
	Lee Revell, Shane Shrybman, Thomas Gleixner, Michal Schmidt

On Fri, 28 Jan 2005, Ingo Molnar wrote:

> 
> * Esben Nielsen <simlo@phys.au.dk> wrote:
> 
> > I noticed that you changed rw-locks to behave quite differently under
> > real-time preemption: they basically work like normal locks now, i.e.
> > there can only be one reader task within each region. This task can,
> > however, lock the region recursively. [...]
> 
> correct.
> 
> > [...] I wanted to start looking at fixing that because it ought to
> > hurt scalability quite a bit - and even on UP create a few unneeded
> > task switches. [...]
> 
> no, it's not a big scalability problem. rwlocks are really a mistake -
> if you want scalability and spinlocks/semaphores are not enough then one
> should either use per-CPU locks or lockless structures. rwlocks/rwsems
> are very unlikely to help much.
>
I agree that RCU ought to do the trick in a lot of cases. Unfortunately,
in a lot of code people have used an rwlock rather than RCU. I also like
the idea of rwlocks: many readers or just one writer. I don't see the need
to take that away from people. Here is an example which even on UP will
give problems without it:
You have a shared data structure, rarely updated, with many readers. A low
priority task is reading it. That is preempted by a high priority task
which finds out it can't read it -> priority inheritance, task switch. The
low priority task finishes the job -> priority set back, task switch. If
it was done with an rwlock, two task switches would have been saved.

 
> > However, the more I think about it the bigger the problem:
> 
> yes, that complexity to get it to perform in a deterministic manner is
> why i introduced this (major!) simplification of locking. It turns out
> that most of the time the actual use of rwlocks matches this simplified
> 'owner-recursive exclusive lock' semantics, so we are lucky.
> 
> look at what kind of worst-case scenarios there may already be with
> multiple spinlocks (blocker.c). With rwlocks that just gets insane.
> 
Yes it does. But one could make a compromise: the up_write() should _not_
be deterministic. In that case it would be very simple to implement.
up_read() could still be deterministic, as it would only involve boosting
one writer in the rare case such exists. That kind of locking would be
very useful in many real-time systems. Of course, RCU can do the job as
well, but it puts a lot of constraints on the code.

However, as Linux is a general OS there is no way to know whether a
specific lock needs to be deterministic wrt. writing or not, as the actual
application is not known at the time the lock type is specified.

> 	Ingo
> 

Esben


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15)
  2005-01-30 22:03                 ` Esben Nielsen
@ 2005-01-30 23:59                   ` Kyle Moffett
  0 siblings, 0 replies; 30+ messages in thread
From: Kyle Moffett @ 2005-01-30 23:59 UTC (permalink / raw)
  To: Esben Nielsen
  Cc: Shane Shrybman, K.R. Foley, Thomas Gleixner, emann,
	Rui Nuno Capela, Adam Heath, Gunther Persoons, Florian Schmidt,
	mark_h_johnson, linux-kernel, Fernando Lopez-Lezcano,
	Ingo Molnar, Karsten Wiese, Bill Huey, Amit Shah, Lee Revell,
	Michal Schmidt

For anybody who wants a good executive summary of RCU, see these:
http://lse.sourceforge.net/locking/rcupdate.html
http://lse.sourceforge.net/locking/rcu/HOWTO/intro.html#WHATIS

On Jan 30, 2005, at 17:03, Esben Nielsen wrote:
> I agree that RCU ought to do the trick in a lot of cases. Unfortunately,
> in a lot of code people have used an rwlock rather than RCU. I also like
> the idea of rwlocks: many readers or just one writer.

Well, RCU is nice because as long as there are no processes attempting
to modify the data, the performance is as though there were no locking
at all, which is better than the cacheline bouncing for rwlock
read-acquires, which must modify the rwlock data every time you acquire.
It's only when you need to modify the data that readers or other writers
must repeat their calculations when they find out that the data has
changed.  In the case of a reader and a writer, the performance
reduction is the same as a cmpxchg and the reader redoing their
calculations (if necessary).

> You have a shared data structure, rarely updated, with many readers. A
> low priority task is reading it. That is preempted by a high priority
> task which finds out it can't read it -> priority inheritance, task
> switch. The low priority task finishes the job -> priority set back,
> task switch. If it was done with an rwlock, two task switches would
> have been saved.

With RCU the high priority task (unlikely to be preempted) gets to run
all the way through with its calculation, and any low priority tasks are
the ones that will probably need to redo their calculations.

> Yes it does. But one could make a compromise: the up_write() should
> _not_ be deterministic. In that case it would be very simple to
> implement. up_read() could still be deterministic, as it would only
> involve boosting one writer in the rare case such exists. That kind of
> locking would be very useful in many real-time systems. Of course, RCU
> can do the job as well, but it puts a lot of constraints on the code.

Yeah, unfortunately it's harder to write good reliable RCU code than
good reliable rwlock code, because the semantics of RCU WRT memory
access are much more difficult, so more people write rwlock code that
needs to be cleaned up.  It's not like normal locking is easily
comprehensible either. :-\

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r !y?(-)
------END GEEK CODE BLOCK------



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2005-01-30 23:59 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-10 17:49 [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15 Mark_H_Johnson
2004-12-10 21:09 ` Ingo Molnar
2004-12-10 21:12 ` Ingo Molnar
2004-12-10 21:24 ` Ingo Molnar
2004-12-13  0:16 ` Fernando Lopez-Lezcano
2004-12-13  6:47   ` Ingo Molnar
2004-12-14  0:46     ` Fernando Lopez-Lezcano
2004-12-14  4:42       ` K.R. Foley
2004-12-14  8:47         ` Rui Nuno Capela
2004-12-14 11:35           ` Ingo Molnar
2004-12-27 14:35             ` Real-time rw-locks (Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-15) Esben Nielsen
2004-12-27 15:27               ` Steven Rostedt
2004-12-27 16:23                 ` Esben Nielsen
2004-12-27 16:39                   ` Steven Rostedt
2004-12-27 21:06                     ` Bill Huey
2004-12-27 21:48                       ` Valdis.Kletnieks
2004-12-28 21:59                       ` Lee Revell
2005-01-04 15:25                       ` Andrew McGregor
2004-12-28 21:42                   ` Lee Revell
2005-01-28  7:38               ` Ingo Molnar
2005-01-28 11:56                 ` William Lee Irwin III
2005-01-28 15:28                   ` Ingo Molnar
2005-01-28 15:55                     ` William Lee Irwin III
2005-01-28 16:16                       ` Ingo Molnar
2005-01-28 19:18                 ` Trond Myklebust
2005-01-28 19:45                   ` Ingo Molnar
2005-01-28 23:25                     ` Bill Huey
2005-01-28 21:13                   ` Lee Revell
2005-01-30 22:03                 ` Esben Nielsen
2005-01-30 23:59                   ` Kyle Moffett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).