* Linux 2.4.20-SMP: where is all my CPU time going?
@ 2003-02-09 12:59 Chris Rankin
  0 siblings, 0 replies; 3+ messages in thread
From: Chris Rankin @ 2003-02-09 12:59 UTC (permalink / raw)
  To: linux-kernel

Hi,

I have been running Linux-2.4.20-SMP on my dual PIII-933MHz machine
(1 GB RAM) for over 3 weeks as a desktop machine, with 2 instances of
SETI@Home running at nice 19 in the background the whole time:

# uptime
 12:25:44 up 24 days, 17:51,  7 users,  load average: 2.03, 2.05, 2.13

However, I have occasionally noticed that some SETI jobs have been
taking longer than I would expect. Here is what wait4() has to say
about the CPU time for each work unit over the last few days:

TIMESTAMP [2003] CLIENT NAME             RUN-TIME         SYS-TIME
Feb 07 16:16:48: SETI-6              33215.280000       278.470000
Feb 07 18:20:19: SETI-1              33722.820000       203.570000
Feb 08 02:27:32: SETI-4              33207.240000       219.950000
Feb 08 04:26:54: SETI-3              32954.330000       217.670000
Feb 08 15:24:00: SETI-5              29543.910000      1908.440000
Feb 08 17:30:35: SETI-6              29512.810000      1949.680000
Feb 09 06:14:08: SETI-1              33391.280000      2165.590000
Feb 09 12:22:21: SETI-4              32747.760000      3840.990000

Each work unit is run as a distinct process. When the process
terminates, a super-daemon program writes an entry in a log file,
along with the data from the struct rusage. However, over the last
few days, more and more time has been spent in the kernel. The last
work unit to complete must have begun around 17:30:35 on Feb 8th, but
didn't finish until 12:22:21 on Feb 9th. That's about 19 hours of
wall-time, of which 9 hours were spent crunching the work unit and 1
hour was spent "doing things" in the kernel. That leaves 9 hours
unaccounted for! Granted, the SETI process was running at nice 19,
but the machine has also been idle since at least midnight on Feb 9th.
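
For reference, the log entries amount to a wait4() call plus printing
the rusage fields. A minimal sketch (not the actual daemon code; the
helper name and format string are only illustrative):

/* Reap a finished work-unit process and append its rusage to the log.
 * RUN-TIME corresponds to ru_utime, SYS-TIME to ru_stime.             */
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

static void log_work_unit(const char *client, FILE *log)
{
        struct rusage ru;
        int status;
        pid_t pid = wait4(-1, &status, 0, &ru);

        if (pid > 0)
                fprintf(log, "%-20s %12ld.%06ld %12ld.%06ld\n", client,
                        (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
                        (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
}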

Can anyone suggest what the kernel might be doing to generate one hour
of sys-time (compared to the more usual ~4 minutes)? And where could
those missing 9 hours of run-time have gone?

Thanks for any insights here,
Cheers,
Chris

P.S. This isn't the first time that this has happened. The *very*
first sys-time "spike" happened in Oct 2001, but such spikes have
become much more regular since about Oct 2002, appearing approximately
one or two weeks apart.


* Re: Linux 2.4.20-SMP: where is all my CPU time going?
       [not found] <Pine.LNX.4.44.0302091247410.30451-100000@coffee.psychology.mcmaster.ca>
@ 2003-02-09 18:56 ` Chris Rankin
  0 siblings, 0 replies; 3+ messages in thread
From: Chris Rankin @ 2003-02-09 18:56 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-kernel

> > TIMESTAMP [2002] CLIENT NAME             RUN-TIME         SYS-TIME
> > Sep 22 23:08:00: SETI-5              20606.560000       190.480000
> 
> incidentally, what's the meaning of the client-name column?
> (no, I've never run seti, so don't know whether it increments
> through spurious client names on the same machine for some reason...)

My super-daemon program has 6 "client" directories, each of which
contains an instance of the setiathome program. The super-daemon keeps
two of these clients processing work units at any one time; two
because the machine has 2 CPUs.
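
The dispatch policy amounts to something like the sketch below (purely
illustrative; the directory names are assumed to match the client
names in the log, and error handling is omitted):

/* Keep NSLOTS work units running at once, cycling over the client
 * directories; when a child exits, start the next client.             */
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NCLIENTS 6
#define NSLOTS   2      /* one running work unit per CPU */

static const char *clients[NCLIENTS] = {
        "SETI-1", "SETI-2", "SETI-3", "SETI-4", "SETI-5", "SETI-6"
};

static void dispatch(void)
{
        int next = 0, running = 0;

        for (;;) {
                while (running < NSLOTS) {
                        if (fork() == 0) {
                                nice(19);          /* background priority */
                                chdir(clients[next]);
                                execl("./setiathome", "setiathome",
                                      (char *)NULL);
                                _exit(127);
                        }
                        next = (next + 1) % NCLIENTS;
                        running++;
                }
                wait(NULL);   /* really wait4(), to capture the rusage */
                running--;
        }
}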

> > > do you have any other data?  if the kernel happens to pull pages away
> > > from the seti process, it could easily cause this sort of thing.
> > 
> > Well, I actually have no idea what sort of activity counts towards the
> > sys-time total. Is this time spent in spinlocks? Semaphores? Waiting
> > for DMA transfers to complete? And we're talking 36 minutes-worth of
> > sys-time here! It sounds as if some in-kernel process is going
> > completely off the deep end.
> 
> you have absolutely no data to support that.  all you know is that for some
> reason, the kernel thinks seti is taking up more in-kernel time.  in general,
> driver-level stuff (like irq's and dma) is not accounted by the kernel,
> and thus shows up in the user-time of whatever process happens to be running.
> 
> do you have *elapsed* time per workunit?  that's actually a hard number,
> not a wishywashy one like user/sys times.  if there's an increase in 
> elapsed time, the phenomenon is a little more interesting.

Yes, I do. The work unit began on Feb 08 2003 17:30:35 and finished
on Feb 09 2003 12:22:21. That's approximately 19 hours of wall time,
of which there were 32747.76 seconds of user time and 3840.99 seconds
of sys time. Compare this to a previous work unit which began on Feb
07 2003 16:16:48, finished on Feb 08 2003 02:27:32 (10 hours of wall
time) and took 33207.24 and 219.95 seconds of user and sys time
respectively. This is where my "9 hours of missing run-time" assertion
comes from.
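
Explicitly: Feb 08 17:30:35 to Feb 09 12:22:21 is 67,906 seconds of
wall time, while 32747.76 + 3840.99 = 36,588.75 seconds were charged
to user + sys. That leaves roughly 31,300 seconds (about 8.7 hours)
unaccounted for.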

> > > check your cron jobs; I'm guessing you simply run updatedb or something
> > > at that interval.  it does enough IO to flush pages in the seti client's
> > > working set.
> > 
> > My only cron job is one that runs "rmmod -a" every hour.
> 
> really?  no logwatch, logrotate, tmpclean, updatedb, tripwire,
> preen-man-pages?

Nope. This is a home-built box.

> > Could cache-thrashing account for the 9 hours of missing run-time? I
> 
> sure.  you have basically no useful data, since user/sys time accounting
> is purely for amusement purposes, not taken seriously by the kernel.

Terrific.
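
For what it's worth, the 2.4 accounting appears to be purely
tick-based: each 10 ms jiffy (HZ=100 on i386) is charged in full to
either utime or stime of whichever task the timer interrupt happens
to land on. A heavily simplified sketch of update_process_times()
from kernel/timer.c:

/* Simplified: the real function also updates the scheduler counter,
 * the per-CPU kernel statistics and the interval/profiling timers.    */
void update_process_times(int user_tick)
{
        struct task_struct *p = current;

        if (user_tick)
                p->times.tms_utime++;   /* tick landed in user mode   */
        else
                p->times.tms_stime++;   /* tick landed in kernel mode */
}

So the sys-time figure is just the number of 100 Hz ticks that
happened to interrupt the process while the CPU was in kernel mode;
it is a statistical sample, not a precise measurement.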

> the only way you can possibly track this down is by getting more data
> around one of these spikes.  something like running vmstat periodically,
> as well as grabbing /proc/{meminfo,slabinfo}.  perhaps kernel profiling
> or oprofile during non-spike and spike times.

Well, on the assumption that I'm still in a "spike" period, here's
some more memory info.

$ cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  1058336768 860172288 198164480        0 59092992 614592512
Swap: 509956096 12288000 497668096
MemTotal:      1033532 kB
MemFree:        193520 kB
MemShared:           0 kB
Buffers:         57708 kB
Cached:         598712 kB
SwapCached:       1476 kB
Active:         342780 kB
Inactive:       414884 kB
HighTotal:      130944 kB
HighFree:         2044 kB
LowTotal:       902588 kB
LowFree:        191476 kB
SwapTotal:      498004 kB
SwapFree:       486004 kB

$ cat /proc/slabinfo 
slabinfo - version: 1.1 (SMP)
kmem_cache            80     80    244    5    5    1 :  252  126
uhci_urb_priv          1     63     60    1    1    1 :  252  126
fib6_nodes           226    226     32    2    2    1 :  252  126
ip6_dst_cache         60     60    192    3    3    1 :  252  126
ndisc_cache           60     60    128    2    2    1 :  252  126
nfs_write_data       147    209    352   17   19    1 :  124   62
nfs_read_data          0      0    352    0    0    1 :  124   62
nfs_page             188    440     96    7   11    1 :  252  126
hpsb_packet            0      0    100    0    0    1 :  252  126
tcp_tw_bucket         60     60    128    2    2    1 :  252  126
tcp_bind_bucket      273    339     32    3    3    1 :  252  126
tcp_open_request      80     80     96    2    2    1 :  252  126
inet_peer_cache       59     59     64    1    1    1 :  252  126
ip_fib_hash           10    113     32    1    1    1 :  252  126
ip_dst_cache         192    192    160    8    8    1 :  252  126
arp_cache             30     30    128    1    1    1 :  252  126
blkdev_requests      512    520     96   13   13    1 :  252  126
devfsd_event         169    169     20    1    1    1 :  252  126
journal_head         324    702     48    8    9    1 :  252  126
revoke_table           1    253     12    1    1    1 :  252  126
revoke_record        226    226     32    2    2    1 :  252  126
dnotify_cache          0      0     20    0    0    1 :  252  126
file_lock_cache      120    120     96    3    3    1 :  252  126
fasync_cache           3    202     16    1    1    1 :  252  126
uid_cache              3    113     32    1    1    1 :  252  126
skbuff_head_cache    522    648    160   22   27    1 :  252  126
sock                 192    192    928   48   48    1 :  124   62
sigqueue             164    290    132    6   10    1 :  252  126
kiobuf                 0      0     64    0    0    1 :  252  126
cdev_cache           273    295     64    5    5    1 :  252  126
bdev_cache            10    118     64    2    2    1 :  252  126
mnt_cache             18    177     64    3    3    1 :  252  126
inode_cache        26053  28416    480 3269 3552    1 :  124   62
dentry_cache       41382  43650    128 1455 1455    1 :  252  126
dquot                  3     60    128    2    2    1 :  252  126
filp                1144   1200    128   40   40    1 :  252  126
names_cache           38     38   4096   38   38    1 :   60   30
buffer_head       158790 211920     96 5172 5298    1 :  252  126
mm_struct            312    312    160   13   13    1 :  252  126
vm_area_struct      1838   2360     96   59   59    1 :  252  126
fs_cache             354    354     64    6    6    1 :  252  126
files_cache          189    189    416   21   21    1 :  124   62
signal_act           132    132   1312   44   44    1 :   60   30
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384             2     13  16384    2   13    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              4     18   8192    4   18    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :   60   30
size-4096            112    142   4096  112  142    1 :   60   30
size-2048(DMA)         0      0   2048    0    0    1 :   60   30
size-2048            130    190   2048   76   95    1 :   60   30
size-1024(DMA)         0      0   1024    0    0    1 :  124   62
size-1024            288    288   1024   72   72    1 :  124   62
size-512(DMA)          0      0    512    0    0    1 :  124   62
size-512             330    392    512   49   49    1 :  124   62
size-256(DMA)          0      0    256    0    0    1 :  252  126
size-256             264    390    256   26   26    1 :  252  126
size-128(DMA)          2     30    128    1    1    1 :  252  126
size-128            2088   2970    128   99   99    1 :  252  126
size-64(DMA)           0      0     64    0    0    1 :  252  126
size-64             1119   1475     64   24   25    1 :  252  126
size-32(DMA)           2    113     32    1    1    1 :  252  126
size-32             8679  11413     32   88  101    1 :  252  126

$ vmstat 5   
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0  12000 192384  57708 598728    0    0     3     3    3     4 94  6  0  0
 2  0  12000 192384  57708 598728    0    0     0     6  101   186 99  1  0  0
 2  0  12000 192376  57708 598728    0    0     0     6  101   186 100  0  0  0
 2  0  12000 193272  57708 598728    0    0     0     0  100   185 99  1  0  0
 2  0  12000 194432  57708 598728    0    0     0     0  100   180 99  1  0  0
 3  0  12000 193404  57708 598728    0    0     0     0  100   180 99  1  0  0
 2  0  12000 194432  57708 598728    0    0     0     0  100   179 99  1  0  0
 3  0  12000 194432  57708 598728    0    0     0    17  104   177 99  1  0  0
 2  0  12000 193144  57708 598728    0    0     0     0  100   185 98  2  0  0


* Re: Linux 2.4.20-SMP: where is all my CPU time going?
       [not found] <Pine.LNX.4.44.0302091219320.30451-100000@coffee.psychology.mcmaster.ca>
@ 2003-02-09 17:47 ` Chris Rankin
  0 siblings, 0 replies; 3+ messages in thread
From: Chris Rankin @ 2003-02-09 17:47 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-kernel

> > I have been running Linux-2.4.20-SMP on my dual PIII-933MHz machine
> > (1 GB RAM) for over 3 weeks as a desktop machine, with 2 instances of
> > SETI@Home running at nice 19 in the background the whole time:
> 
> but other unknown factors have changed, right?  like daily use 
> of the desktop?

Yes, I do *use* my desktop occasionally... ;-). However, my full "SETI
statistics" file contains daily data going back to October 2000. For
example, here is an excerpt from September 2002:

TIMESTAMP [2002] CLIENT NAME             RUN-TIME         SYS-TIME
Sep 22 23:08:00: SETI-5              20606.560000       190.480000
Sep 23 02:19:22: SETI-1              31343.990000       289.770000
Sep 23 08:47:07: SETI-2              32592.250000       273.890000
Sep 23 11:19:43: SETI-3              30795.910000       265.160000
Sep 23 18:10:38: SETI-4              32877.130000       270.660000
Sep 23 20:11:08: SETI-6              30946.050000       265.570000
Sep 24 03:51:53: SETI-5              33192.640000       270.920000
Sep 24 05:53:52: SETI-1              33039.560000       279.380000
Sep 24 13:32:49: SETI-2              32785.650000       271.290000
Sep 24 15:38:35: SETI-3              33046.460000       282.480000
Sep 24 23:16:30: SETI-4              32947.420000       279.430000
Sep 25 00:12:22: SETI-6              28678.120000       259.580000
Sep 25 07:52:13: SETI-5              28920.640000       268.700000
Sep 25 10:57:25: SETI-1              33598.060000       554.140000
Sep 25 22:24:01: SETI-3              16284.640000      2286.710000
Sep 26 00:34:02: SETI-2              31303.880000      2678.700000
Sep 26 08:17:12: SETI-6              33268.100000       281.640000
Sep 26 09:52:11: SETI-5              31248.750000       276.190000
Sep 26 18:00:18: SETI-1              32560.010000       274.440000
Sep 26 19:08:24: SETI-3              31045.920000       269.040000
Sep 27 03:54:47: SETI-2              33266.910000       276.200000
Sep 27 04:46:27: SETI-4              32386.510000       272.180000
Sep 27 13:27:50: SETI-6              33058.870000       267.090000
Sep 27 14:21:14: SETI-5              33171.320000       271.520000
Sep 27 21:45:00: SETI-1              28743.730000       251.880000
Sep 27 23:59:34: SETI-3              33474.890000       261.430000
Sep 28 07:17:16: SETI-2              33210.900000       256.300000

I have not changed the way I use this machine, and yet in that excerpt
the "sys-spikes" appear for only two work units. This box should also
not suffer from memory starvation:

$ free -t
             total       used       free     shared    buffers     cached
Mem:       1033532     839516     194016          0      57468     596040
-/+ buffers/cache:     186008     847524
Swap:       498004      12000     486004
Total:     1531536     851516     680020

> > Feb 07 18:20:19: SETI-1              33722.820000       203.570000
> > Feb 09 06:14:08: SETI-1              33391.280000      2165.590000
> 
> do you have any other data?  if the kernel happens to pull pages away
> from the seti process, it could easily cause this sort of thing.

Well, I actually have no idea what sort of activity counts towards the
sys-time total. Is this time spent in spinlocks? Semaphores? Waiting
for DMA transfers to complete? And we're talking 36 minutes-worth of
sys-time here! It sounds as if some in-kernel process is going
completely off the deep end.

> check your cron jobs; I'm guessing you simply run updatedb or something
> at that interval.  it does enough IO to flush pages in the seti client's
> working set.

My only cron job is one that runs "rmmod -a" every hour.

> that's just one possible perturber, though: there are plenty of other
> mechanisms that would work fine to produce this kind of effect.
> some sort of cache-thrashing in particular.

Could cache-thrashing account for the 9 hours of missing run-time? I
didn't see or hear any unusual disc activity last night, and the only
processes in my run-queue were the two setiathome ones.

Chris

