All of lore.kernel.org
 help / color / mirror / Atom feed
* top displaying 9999% CPU usage
@ 2007-10-03 12:33 Frans Pop
  2007-10-03 12:52 ` Jan Engelhardt
  2007-10-03 13:03 ` Alexander E. Patrakov
  0 siblings, 2 replies; 21+ messages in thread
From: Frans Pop @ 2007-10-03 12:33 UTC (permalink / raw)
  To: linux-kernel

I saw top occasionally displaying 9999% CPU usage for a process. The
first few times it was amarokapp, this last time it was kontact.
Both applications were basically idle.
The "cc1" is a kernel compile (rc9 + CFS :-).

I cannot remember seeing this before, but as I also don't run top that
frequently I cannot be sure its a recent regression.

$ uname -a
Linux faramir 2.6.23-rc9 #1 SMP Tue Oct 2 11:16:15 CEST 2007 x86_64 GNU/Linux

top - 14:28:27 up 38 min,  1 user,  load average: 1.12, 0.67, 0.32
Tasks: 128 total,   2 running, 126 sleeping,   0 stopped,   0 zombie
Cpu(s): 32.0%us, 15.7%sy,  0.0%ni, 43.8%id,  8.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2054476k total,   998460k used,  1056016k free,    95284k buffers
Swap:   979924k total,        0k used,   979924k free,   485804k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5269 fjp       20   0  441m  58m  32m S 9999  2.9   0:16.34 kontact
                                         ^^^^
 8272 fjp       20   0  5952  468  208 S    3  0.0   0:04.62 faked-sysv
 3666 fjp       20   0 25268 2036  884 S    2  0.1   0:03.30 famd
26530 fjp       20   0 20932 6292 1568 R    2  0.3   0:00.06 cc1
 5124 fjp       20   0  210m  26m  19m S    1  1.3   0:09.18 kicker
17012 fjp       20   0 19016 1252  916 R    1  0.1   0:00.34 top
 5159 fjp       20   0  508m  54m  30m S    0  2.7   0:12.67 amarokapp

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: top displaying 9999% CPU usage
  2007-10-03 12:33 top displaying 9999% CPU usage Frans Pop
@ 2007-10-03 12:52 ` Jan Engelhardt
  2007-10-03 13:03 ` Alexander E. Patrakov
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Engelhardt @ 2007-10-03 12:52 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel


On Oct 3 2007 14:33, Frans Pop wrote:
>
>I saw top occasionally displaying 9999% CPU usage for a process. The
>first few times it was amarokapp, this last time it was kontact.
>Both applications were basically idle.

Yes this certainly sounds like KDE. Did you try with Gnome,
or perhaps a simple `perl -e '1 while 1'`?

>The "cc1" is a kernel compile (rc9 + CFS :-).
>
>I cannot remember seeing this before, but as I also don't run top that
>frequently I cannot be sure its a recent regression.

Test various things, like another top utility (htop perhaps),
compare. Narrowing down where the bug is.

>top - 14:28:27 up 38 min,  1 user,  load average: 1.12, 0.67, 0.32
>Tasks: 128 total,   2 running, 126 sleeping,   0 stopped,   0 zombie
>Cpu(s): 32.0%us, 15.7%sy,  0.0%ni, 43.8%id,  8.5%wa,  0.0%hi,  0.0%si,  0.0%st
>Mem:   2054476k total,   998460k used,  1056016k free,    95284k buffers
>Swap:   979924k total,        0k used,   979924k free,   485804k cached
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 5269 fjp       20   0  441m  58m  32m S 9999  2.9   0:16.34 kontact
>                                         ^^^^
> 8272 fjp       20   0  5952  468  208 S    3  0.0   0:04.62 faked-sysv
> 3666 fjp       20   0 25268 2036  884 S    2  0.1   0:03.30 famd
>26530 fjp       20   0 20932 6292 1568 R    2  0.3   0:00.06 cc1
> 5124 fjp       20   0  210m  26m  19m S    1  1.3   0:09.18 kicker
>17012 fjp       20   0 19016 1252  916 R    1  0.1   0:00.34 top
> 5159 fjp       20   0  508m  54m  30m S    0  2.7   0:12.67 amarokapp

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: top displaying 9999% CPU usage
  2007-10-03 12:33 top displaying 9999% CPU usage Frans Pop
  2007-10-03 12:52 ` Jan Engelhardt
@ 2007-10-03 13:03 ` Alexander E. Patrakov
  2007-10-03 14:04   ` Frans Pop
  1 sibling, 1 reply; 21+ messages in thread
From: Alexander E. Patrakov @ 2007-10-03 13:03 UTC (permalink / raw)
  To: linux-kernel; +Cc: elendil

Frans Pop wrote:
> I saw top occasionally displaying 9999% CPU usage for a process. The
> first few times it was amarokapp, this last time it was kontact.
> Both applications were basically idle.
> The "cc1" is a kernel compile (rc9 + CFS :-).
> 
> I cannot remember seeing this before, but as I also don't run top that
> frequently I cannot be sure its a recent regression.

Try to capture the i/o log with the following command:

strace -o top.log top

This will show for sure whether the kernel gives out incorrect data or 
top misinterprets them.


-- 
Alexander E. Patrakov


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: top displaying 9999% CPU usage
  2007-10-03 13:03 ` Alexander E. Patrakov
@ 2007-10-03 14:04   ` Frans Pop
  2007-10-03 14:43     ` Ilpo Järvinen
  0 siblings, 1 reply; 21+ messages in thread
From: Frans Pop @ 2007-10-03 14:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alexander E. Patrakov

On Wednesday 03 October 2007, you wrote:
> Try to capture the i/o log with the following command:
> strace -o top.log top

Thanks for the suggestion.

> This will show for sure whether the kernel gives out incorrect data or
> top misinterprets them.

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
5269 fjp       20   0  442m  61m  33m S 9999  3.1   0:34.12 kontact

Here are the last two reads for PID 5269 from /proc before that:

15:48:51 stat("/proc/5269", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
15:48:51 open("/proc/5269/stat", O_RDONLY) = 10
15:48:51 read(10, "5269 (kontact) S 1 5100 5100 0 -1 4202560 58186 7891 290 1 2911 502 10 15 20 0 6 0 9225 464408576 15694 18446744073709551615 4194304 4365900 140733817144688 18446744073709551615 47998530367074 0 0 4098 83113 18446744073709551615 0 0 17 1 0 0 0\n", 1023) = 244
15:48:51 open("/proc/5269/statm", O_RDONLY) = 10
15:48:54 stat("/proc/5269", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
15:48:54 open("/proc/5269/stat", O_RDONLY) = 10
15:48:54 read(10, "5269 (kontact) S 1 5100 5100 0 -1 4202560 58186 7891 290 1 2912 500 10 15 20 0 6 0 9225 464408576 15694 18446744073709551615 4194304 4365900 140733817144688 18446744073709551615 47998530367074 0 0 4098 83113 18446744073709551615 0 0 17 1 0 0 0\n", 1023) = 244
15:48:54 open("/proc/5269/statm", O_RDONLY) = 10

The only change is in 2 consecutive columns: "2911 502" -> "2912 500".
Is processor usage calculated from those? Can someone explain how?

In reply to Jan Engelhardt:
If I run the endless loop, top just displays 100%.
I've not seen the 9999 in htop, only occasionally "100." for both.

This is on an x86_64 SMP box.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: top displaying 9999% CPU usage
  2007-10-03 14:04   ` Frans Pop
@ 2007-10-03 14:43     ` Ilpo Järvinen
  2007-10-03 14:51       ` Ilpo Järvinen
  0 siblings, 1 reply; 21+ messages in thread
From: Ilpo Järvinen @ 2007-10-03 14:43 UTC (permalink / raw)
  To: Frans Pop; +Cc: LKML, Alexander E. Patrakov

On Wed, 3 Oct 2007, Frans Pop wrote:

> On Wednesday 03 October 2007, you wrote:
> > Try to capture the i/o log with the following command:
> > strace -o top.log top
> 
> Thanks for the suggestion.
> 
> > This will show for sure whether the kernel gives out incorrect data or
> > top misinterprets them.
> 
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 5269 fjp       20   0  442m  61m  33m S 9999  3.1   0:34.12 kontact
> 
> Here are the last two reads for PID 5269 from /proc before that:
> 
> 15:48:51 stat("/proc/5269", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
> 15:48:51 open("/proc/5269/stat", O_RDONLY) = 10
> 15:48:51 read(10, "5269 (kontact) S 1 5100 5100 0 -1 4202560 58186 7891 290 1 2911 502 10 15 20 0 6 0 9225 464408576 15694 18446744073709551615 4194304 4365900 140733817144688 18446744073709551615 47998530367074 0 0 4098 83113 18446744073709551615 0 0 17 1 0 0 0\n", 1023) = 244
> 15:48:51 open("/proc/5269/statm", O_RDONLY) = 10
> 15:48:54 stat("/proc/5269", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
> 15:48:54 open("/proc/5269/stat", O_RDONLY) = 10
> 15:48:54 read(10, "5269 (kontact) S 1 5100 5100 0 -1 4202560 58186 7891 290 1 2912 500 10 15 20 0 6 0 9225 464408576 15694 18446744073709551615 4194304 4365900 140733817144688 18446744073709551615 47998530367074 0 0 4098 83113 18446744073709551615 0 0 17 1 0 0 0\n", 1023) = 244
> 15:48:54 open("/proc/5269/statm", O_RDONLY) = 10
> 
> The only change is in 2 consecutive columns: "2911 502" -> "2912 500".
> Is processor usage calculated from those? Can someone explain how?

The latter seems to be utime ...decreasing. No wonder if arithmetics will 
give strange results (probably top is using unsigned delta?)...


-- 
 i.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: top displaying 9999% CPU usage
  2007-10-03 14:43     ` Ilpo Järvinen
@ 2007-10-03 14:51       ` Ilpo Järvinen
  2007-10-03 19:27         ` Decreasing stime running confuses top (was: top displaying 9999% CPU usage) Frans Pop
  0 siblings, 1 reply; 21+ messages in thread
From: Ilpo Järvinen @ 2007-10-03 14:51 UTC (permalink / raw)
  To: Frans Pop; +Cc: LKML, Alexander E. Patrakov

[-- Attachment #1: Type: TEXT/PLAIN, Size: 454 bytes --]

On Wed, 3 Oct 2007, Ilpo Järvinen wrote:

> On Wed, 3 Oct 2007, Frans Pop wrote:
> 
> > The only change is in 2 consecutive columns: "2911 502" -> "2912 500".
> > Is processor usage calculated from those? Can someone explain how?
> 
> The latter seems to be utime ...decreasing. No wonder if arithmetics will 
> give strange results (probably top is using unsigned delta?)...

Hmm, minor miscounting from my side, stime seems more appropriate...

-- 
 i.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Decreasing stime running confuses top (was: top displaying 9999% CPU usage)
  2007-10-03 14:51       ` Ilpo Järvinen
@ 2007-10-03 19:27         ` Frans Pop
  2007-10-03 20:24           ` Willy Tarreau
  0 siblings, 1 reply; 21+ messages in thread
From: Frans Pop @ 2007-10-03 19:27 UTC (permalink / raw)
  To: LKML; +Cc: Ilpo Järvinen, Alexander E. Patrakov

On Wednesday 03 October 2007, you wrote:
> On Wed, 3 Oct 2007, Ilpo Järvinen wrote:
> > On Wed, 3 Oct 2007, Frans Pop wrote:
> > > The only change is in 2 consecutive columns: "2911 502" -> "2912
> > > 500". Is processor usage calculated from those? Can someone explain
> > > how?
> >
> > The latter seems to be utime ...decreasing. No wonder if arithmetics
> > will give strange results (probably top is using unsigned delta?)...
>
> Hmm, minor miscounting from my side, stime seems more appropriate...

Here is a series showing utime and stime for kontact over 2 minutes.

Values were obtained using (identical values removed):
$ while true; do awk '{print $14" "$15}' /proc/5269/stat; sleep 1; done | ts

Oct 03 21:17:12 12220 1593
Oct 03 21:17:18 12221 1594
Oct 03 21:17:26 12222 1593  <--
Oct 03 21:17:34 12223 1594
Oct 03 21:17:43 12224 1594
Oct 03 21:17:51 12224 1595
Oct 03 21:17:59 12225 1596
Oct 03 21:18:07 12226 1595  <--
Oct 03 21:18:15 12227 1596
Oct 03 21:18:18 12228 1596
Oct 03 21:18:22 12229 1595  <--
Oct 03 21:18:31 12230 1596
Oct 03 21:18:39 12230 1597
Oct 03 21:18:44 12231 1597
Oct 03 21:18:48 12232 1596  <--
Oct 03 21:18:56 12233 1597
Oct 03 21:19:04 12234 1596  <--
Oct 03 21:19:11 12235 1597

So, is it normal that stime decreases sometimes or a kernel bug?
/me expects the last...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top (was: top displaying 9999% CPU usage)
  2007-10-03 19:27         ` Decreasing stime running confuses top (was: top displaying 9999% CPU usage) Frans Pop
@ 2007-10-03 20:24           ` Willy Tarreau
  2007-10-03 23:32             ` Frans Pop
  0 siblings, 1 reply; 21+ messages in thread
From: Willy Tarreau @ 2007-10-03 20:24 UTC (permalink / raw)
  To: Frans Pop; +Cc: LKML, Ilpo Järvinen, Alexander E. Patrakov

On Wed, Oct 03, 2007 at 09:27:41PM +0200, Frans Pop wrote:
> On Wednesday 03 October 2007, you wrote:
> > On Wed, 3 Oct 2007, Ilpo Järvinen wrote:
> > > On Wed, 3 Oct 2007, Frans Pop wrote:
> > > > The only change is in 2 consecutive columns: "2911 502" -> "2912
> > > > 500". Is processor usage calculated from those? Can someone explain
> > > > how?
> > >
> > > The latter seems to be utime ...decreasing. No wonder if arithmetics
> > > will give strange results (probably top is using unsigned delta?)...
> >
> > Hmm, minor miscounting from my side, stime seems more appropriate...
> 
> Here is a series showing utime and stime for kontact over 2 minutes.
> 
> Values were obtained using (identical values removed):
> $ while true; do awk '{print $14" "$15}' /proc/5269/stat; sleep 1; done | ts
> 
> Oct 03 21:17:12 12220 1593
> Oct 03 21:17:18 12221 1594
> Oct 03 21:17:26 12222 1593  <--
> Oct 03 21:17:34 12223 1594
> Oct 03 21:17:43 12224 1594
> Oct 03 21:17:51 12224 1595
> Oct 03 21:17:59 12225 1596
> Oct 03 21:18:07 12226 1595  <--
> Oct 03 21:18:15 12227 1596
> Oct 03 21:18:18 12228 1596
> Oct 03 21:18:22 12229 1595  <--
> Oct 03 21:18:31 12230 1596
> Oct 03 21:18:39 12230 1597
> Oct 03 21:18:44 12231 1597
> Oct 03 21:18:48 12232 1596  <--
> Oct 03 21:18:56 12233 1597
> Oct 03 21:19:04 12234 1596  <--
> Oct 03 21:19:11 12235 1597
> 
> So, is it normal that stime decreases sometimes or a kernel bug?
> /me expects the last...

Let me guess... Dual core AMD64 ?

I'm 99.99% sure that if you boot with "notsc", the problem disappears. If
so, you have one of those wonderful AMD64 with unsynced clock and without
HPET to sync with. I wrote a simple program in the past to exhibit the
problem. It would bsimply run "date +%s" in a busy loops and display each
time it would change. Amazing. It could jump back and forth by up to 3
seconds!

Basically, it looked like this :

old=$(date +%s)
while : ; do
   x=$(date +%s)
   if [ $x != $old ]; then
      echo "$old -> $x"
      old=$x
   fi
done
 
Regards,
Willy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top (was: top displaying 9999% CPU usage)
  2007-10-03 20:24           ` Willy Tarreau
@ 2007-10-03 23:32             ` Frans Pop
  2007-10-04 19:19               ` Luca Tettamanti
  0 siblings, 1 reply; 21+ messages in thread
From: Frans Pop @ 2007-10-03 23:32 UTC (permalink / raw)
  To: Willy Tarreau, LKML, Ilpo Järvinen, Alexander E. Patrakov

On Wednesday 03 October 2007, you wrote:
> On Wed, Oct 03, 2007 at 09:27:41PM +0200, Frans Pop wrote:
> > On Wednesday 03 October 2007, you wrote:
> > > On Wed, 3 Oct 2007, Ilpo Järvinen wrote:
> > > > On Wed, 3 Oct 2007, Frans Pop wrote:
> > > > > The only change is in 2 consecutive columns: "2911 502" -> "2912
> > > > > 500". Is processor usage calculated from those? Can someone
> > > > > explain how?
> > > >
> > > > The latter seems to be utime ...decreasing. No wonder if
> > > > arithmetics will give strange results (probably top is using
> > > > unsigned delta?)...
> > >
> > > Hmm, minor miscounting from my side, stime seems more appropriate...
> >
> > So, is it normal that stime decreases sometimes or a kernel bug?
> > /me expects the last...
>
> Let me guess... Dual core AMD64 ?

Nope: Intel(R) Pentium(R) D CPU 3.20GHz

> I'm 99.99% sure that if you boot with "notsc", the problem disappears.

Not really. With that first test I did have:
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

If I boot with 'notsc', I get:
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet

But the problem is still exactly the same:
Oct 04 00:53:37 545 92
Oct 04 00:53:38 545 94
Oct 04 00:53:43 546 92  <--
Oct 04 00:53:49 547 94
Oct 04 00:53:54 549 93  <--
Oct 04 00:54:00 550 94

Some relevant lines from kernel log:
checking TSC synchronization [CPU#0 -> CPU#1]: passed  <--- Not there with 'notsc'
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
ACPI: RTC can wake from S4
Time: hpet clocksource has been installed.
hpet_resources: 0xfed00000 is busy
Time: tsc clocksource has been installed.  <--- Not there with 'notsc'

> If so, you have one of those wonderful AMD64 with unsynced clock and
> without HPET to sync with. I wrote a simple program in the past to exhibit
> the problem. It would bsimply run "date +%s" in a busy loops and display
> each time it would change. Amazing. It could jump back and forth by up to
> 3 seconds!

I tried your script, but the clock runs perfectly. Never saw anything
other than a 1 second increment.


The following may well be relevant.
With 2.6.22 and early 2.6.23-rc kernels (rc3-rc6) I often had this in my
kernel log (see http://lkml.org/lkml/2007/9/16/45):
   checking TSC synchronization [CPU#0 -> CPU#1]:
   Measured 248 cycles TSC warp between CPUs, turning off TSC clock.
   Marking TSC unstable due to check_tsc_sync_source failed

Some boots the TSC synchronization would be OK, but I'd see ~2/3 failures.
Kernels before 2.6.22 did not have this problem.

However, checking my logs now I see that these messages have disappeared
since 2.6.23-rc7. Now the TSC synchronization check always passes.


I also tried with 2.6.22-6 and with that the jumping around is _not_
present. This was a boot where TSC synchronization failed, so with hpet
as clocksource.
Also, the numbers stay constant much longer and have bigger increments
(updates look to be once per minute?):
Oct 04 01:24:19 465 67
Oct 04 01:24:50 467 69
Oct 04 01:24:51 469 72
Oct 04 01:25:51 474 76
Oct 04 01:26:50 478 80

Cheers,
Frans Pop

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top (was: top displaying 9999% CPU usage)
  2007-10-03 23:32             ` Frans Pop
@ 2007-10-04 19:19               ` Luca Tettamanti
  2007-10-04 19:32                 ` Decreasing stime running confuses top Chuck Ebbert
  0 siblings, 1 reply; 21+ messages in thread
From: Luca Tettamanti @ 2007-10-04 19:19 UTC (permalink / raw)
  To: Frans Pop; +Cc: Willy Tarreau, LKML, Ilpo Järvinen, Alexander E. Patrakov

Il Thu, Oct 04, 2007 at 01:32:44AM +0200, Frans Pop ha scritto: 
> On Wednesday 03 October 2007, you wrote:
> > On Wed, Oct 03, 2007 at 09:27:41PM +0200, Frans Pop wrote:
> > > On Wednesday 03 October 2007, you wrote:
> > > > On Wed, 3 Oct 2007, Ilpo Järvinen wrote:
> > > > > On Wed, 3 Oct 2007, Frans Pop wrote:
> > > > > > The only change is in 2 consecutive columns: "2911 502" -> "2912
> > > > > > 500". Is processor usage calculated from those? Can someone
> > > > > > explain how?
> > > > >
> > > > > The latter seems to be utime ...decreasing. No wonder if
> > > > > arithmetics will give strange results (probably top is using
> > > > > unsigned delta?)...
> > > >
> > > > Hmm, minor miscounting from my side, stime seems more appropriate...
> > >
> > > So, is it normal that stime decreases sometimes or a kernel bug?
> > > /me expects the last...
> >
> > Let me guess... Dual core AMD64 ?
> 
> Nope: Intel(R) Pentium(R) D CPU 3.20GHz

I just notice the same thing here, with a Core2 Duo (which is supposed
to have synced TSCs) and working HPET.

> The following may well be relevant.
> With 2.6.22 and early 2.6.23-rc kernels (rc3-rc6) I often had this in my
> kernel log (see http://lkml.org/lkml/2007/9/16/45):
>    checking TSC synchronization [CPU#0 -> CPU#1]:
>    Measured 248 cycles TSC warp between CPUs, turning off TSC clock.
>    Marking TSC unstable due to check_tsc_sync_source failed

I don't see this though, TSCs are always syncronized between the 2
cores.

Luca
-- 
Ligabue canta: "Tutti vogliono viaggiare in primaaaa..."
Io ci ho provato e dopo un chilometro ho fuso il motore e bruciato
la frizione...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top
  2007-10-04 19:19               ` Luca Tettamanti
@ 2007-10-04 19:32                 ` Chuck Ebbert
  2007-10-04 20:00                   ` Christian Borntraeger
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Ebbert @ 2007-10-04 19:32 UTC (permalink / raw)
  To: Luca Tettamanti
  Cc: Frans Pop, Willy Tarreau, LKML, Ilpo Järvinen,
	Alexander E. Patrakov

On 10/04/2007 03:19 PM, Luca Tettamanti wrote:
>>>>>> The latter seems to be utime ...decreasing. No wonder if
>>>>>> arithmetics will give strange results (probably top is using
>>>>>> unsigned delta?)...
>>>>> Hmm, minor miscounting from my side, stime seems more appropriate...
>>>> So, is it normal that stime decreases sometimes or a kernel bug?
>>>> /me expects the last...
>>> Let me guess... Dual core AMD64 ?
>> Nope: Intel(R) Pentium(R) D CPU 3.20GHz
> 
> I just notice the same thing here, with a Core2 Duo (which is supposed
> to have synced TSCs) and working HPET.
> 
>> The following may well be relevant.
>> With 2.6.22 and early 2.6.23-rc kernels (rc3-rc6) I often had this in my
>> kernel log (see http://lkml.org/lkml/2007/9/16/45):
>>    checking TSC synchronization [CPU#0 -> CPU#1]:
>>    Measured 248 cycles TSC warp between CPUs, turning off TSC clock.
>>    Marking TSC unstable due to check_tsc_sync_source failed
> 
> I don't see this though, TSCs are always syncronized between the 2
> cores.
> 

Is CONFIG_VIRT_CPU_ACCOUNTING set?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top
  2007-10-04 19:32                 ` Decreasing stime running confuses top Chuck Ebbert
@ 2007-10-04 20:00                   ` Christian Borntraeger
  2007-10-04 20:21                     ` Chuck Ebbert
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Borntraeger @ 2007-10-04 20:00 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Luca Tettamanti, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov

Am Donnerstag, 4. Oktober 2007 schrieb Chuck Ebbert:
> Is CONFIG_VIRT_CPU_ACCOUNTING set?

This is s390 and powerpc only, so the answer is probably no ;-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Decreasing stime running confuses top
  2007-10-04 20:00                   ` Christian Borntraeger
@ 2007-10-04 20:21                     ` Chuck Ebbert
  2007-10-04 21:10                       ` [PATCH for testing] " Christian Borntraeger
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Ebbert @ 2007-10-04 20:21 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Luca Tettamanti, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

On 10/04/2007 04:00 PM, Christian Borntraeger wrote:
> Am Donnerstag, 4. Oktober 2007 schrieb Chuck Ebbert:
>> Is CONFIG_VIRT_CPU_ACCOUNTING set?
> 
> This is s390 and powerpc only, so the answer is probably no ;-)
> 

The code in fs/proc/array.c is... interesting.

1. task_stime() converts p->se.sum_exec_runtime to a clock_t

2. it calls task_utime() which does the same thing (can it change
   between the two reads?), does some calculations that yield a 
   clock_t, turns the result into a cputime and returns that

3. task_stime() then converts that result back into a clock_t and
   uses it!


static cputime_t task_stime(struct task_struct *p)
{
        clock_t stime;

        stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
                        cputime_to_clock_t(task_utime(p));

        return clock_t_to_cputime(stime);
}

static cputime_t task_utime(struct task_struct *p)
{
        clock_t utime = cputime_to_clock_t(p->utime),
                total = utime + cputime_to_clock_t(p->stime);
        u64 temp;

        temp = (u64)nsec_to_clock_t(p->se.sum_exec_runtime);

        if (total) {
                temp *= utime;
                do_div(temp, total);
        }
        utime = (clock_t)temp;

        return clock_t_to_cputime(utime);
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 20:21                     ` Chuck Ebbert
@ 2007-10-04 21:10                       ` Christian Borntraeger
  2007-10-04 22:01                         ` Chuck Ebbert
  2007-10-05 15:49                         ` Frans Pop
  0 siblings, 2 replies; 21+ messages in thread
From: Christian Borntraeger @ 2007-10-04 21:10 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Luca Tettamanti, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

Am Donnerstag, 4. Oktober 2007 schrieb Chuck Ebbert:
> On 10/04/2007 04:00 PM, Christian Borntraeger wrote:
> > Am Donnerstag, 4. Oktober 2007 schrieb Chuck Ebbert:
> >> Is CONFIG_VIRT_CPU_ACCOUNTING set?
> > 
> > This is s390 and powerpc only, so the answer is probably no ;-)
> > 
> 
> The code in fs/proc/array.c is... interesting.

Short history what I know about this file:
after 2.6.22 the accounting was changes to use sched_clock and to use stime 
and utime only for the split. This is where task_utime and task_stime were 
introduced. See the code in 2.6.23-rc1.
Unfortunately this broke the accouting on s390 which already has precise 
numbers in utime and stime. So the code was partially reverted to use stime 
and utime again where appropriate, which are of type  cputime_t. 

> 
> 1. task_stime() converts p->se.sum_exec_runtime to a clock_t

correct.

> 
> 2. it calls task_utime() which does the same thing (can it change
>    between the two reads?), does some calculations that yield a 
>    clock_t, turns the result into a cputime and returns that
> 
> 3. task_stime() then converts that result back into a clock_t and
>    uses it!

These conversions can reduce the accuracy to jiffie resolution, but should not
make the value non-monotonic. I dont think that here is the problem. 

But: it seems that p->se.sum_exec_runtime can change indeed. That would 
explain stime going backward, if sum_exec_runtime was increases in the middle 
of the calculation. 
The logic of task_stime and task_utime was introduced by 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa


Frans can you test this patch if this makes stime and utime monotic again?

It basically reverts the rest of  b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa and 
should restore the 2.6.22 behavior. The process time is used from tasks utime 
and stime instead of the scheduler clock. That means, in general after a long 
period of time, it is less accurate than the current time and behaves like 
2.6.22.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 array.c |   37 -------------------------------------
 1 file changed, 37 deletions(-)


diff --git a/fs/proc/array.c b/fs/proc/array.c
index ee4814d..e42c76e 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -323,7 +323,6 @@ int proc_pid_status(struct task_struct *task, char 
*buffer)
 /*
  * Use precise platform statistics if available:
  */
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING
 static cputime_t task_utime(struct task_struct *p)
 {
 	return p->utime;
@@ -333,42 +332,6 @@ static cputime_t task_stime(struct task_struct *p)
 {
 	return p->stime;
 }
-#else
-static cputime_t task_utime(struct task_struct *p)
-{
-	clock_t utime = cputime_to_clock_t(p->utime),
-		total = utime + cputime_to_clock_t(p->stime);
-	u64 temp;
-
-	/*
-	 * Use CFS's precise accounting:
-	 */
-	temp = (u64)nsec_to_clock_t(p->se.sum_exec_runtime);
-
-	if (total) {
-		temp *= utime;
-		do_div(temp, total);
-	}
-	utime = (clock_t)temp;
-
-	return clock_t_to_cputime(utime);
-}
-
-static cputime_t task_stime(struct task_struct *p)
-{
-	clock_t stime;
-
-	/*
-	 * Use CFS's precise accounting. (we subtract utime from
-	 * the total, to make sure the total observed by userspace
-	 * grows monotonically - apps rely on that):
-	 */
-	stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
-			cputime_to_clock_t(task_utime(p));
-
-	return clock_t_to_cputime(stime);
-}
-#endif
 
 static int do_task_stat(struct task_struct *task, char *buffer, int whole)
 {

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 21:10                       ` [PATCH for testing] " Christian Borntraeger
@ 2007-10-04 22:01                         ` Chuck Ebbert
  2007-10-04 22:31                           ` Christian Borntraeger
                                             ` (2 more replies)
  2007-10-05 15:49                         ` Frans Pop
  1 sibling, 3 replies; 21+ messages in thread
From: Chuck Ebbert @ 2007-10-04 22:01 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Luca Tettamanti, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

On 10/04/2007 05:10 PM, Christian Borntraeger wrote:

> <patch reverting to 2.6.22 behavior>

Alternative patch:

procfs: Don't read runtime twice when computing task's stime

Current code reads p->se.sum_exec_runtime twice and goes through
multiple type conversions to calculate stime. Read it once and
skip some of the conversions.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>

--- linux-2.6.23-rc6-dell.orig/fs/proc/array.c
+++ linux-2.6.23-rc6-dell/fs/proc/array.c
@@ -334,39 +334,38 @@ static cputime_t task_stime(struct task_
 	return p->stime;
 }
 #else
-static cputime_t task_utime(struct task_struct *p)
+static clock_t __task_utime(struct task_struct *p, u64 runtime)
 {
 	clock_t utime = cputime_to_clock_t(p->utime),
 		total = utime + cputime_to_clock_t(p->stime);
-	u64 temp;
 
 	/*
 	 * Use CFS's precise accounting:
 	 */
-	temp = (u64)nsec_to_clock_t(p->se.sum_exec_runtime);
-
 	if (total) {
-		temp *= utime;
-		do_div(temp, total);
+		runtime *= utime;
+		do_div(runtime, total);
 	}
-	utime = (clock_t)temp;
+	return (clock_t)runtime;
+}
 
-	return clock_t_to_cputime(utime);
+static cputime_t task_utime(struct task_struct *p)
+{
+	u64 runtime = (u64)nsec_to_clock_t(p->se.sum_exec_runtime);
+
+	return clock_t_to_cputime(__task_utime(p, runtime));
 }
 
 static cputime_t task_stime(struct task_struct *p)
 {
-	clock_t stime;
+	u64 runtime = (u64)nsec_to_clock_t(p->se.sum_exec_runtime);
 
 	/*
 	 * Use CFS's precise accounting. (we subtract utime from
 	 * the total, to make sure the total observed by userspace
 	 * grows monotonically - apps rely on that):
 	 */
-	stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
-			cputime_to_clock_t(task_utime(p));
-
-	return clock_t_to_cputime(stime);
+	return clock_t_to_cputime(runtime - __task_utime(p, runtime));
 }
 #endif
 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 22:01                         ` Chuck Ebbert
@ 2007-10-04 22:31                           ` Christian Borntraeger
  2007-10-05 11:43                           ` Luca
  2007-10-05 15:07                           ` Frans Pop
  2 siblings, 0 replies; 21+ messages in thread
From: Christian Borntraeger @ 2007-10-04 22:31 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Luca Tettamanti, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

Am Freitag, 5. Oktober 2007 schrieb Chuck Ebbert:
> On 10/04/2007 05:10 PM, Christian Borntraeger wrote:
> 
> > <patch reverting to 2.6.22 behavior>
> 
> Alternative patch:
> 
> procfs: Don't read runtime twice when computing task's stime
> 
> Current code reads p->se.sum_exec_runtime twice and goes through
> multiple type conversions to calculate stime. Read it once and
> skip some of the conversions.
> 
> Signed-off-by: Chuck Ebbert <cebbert@redhat.com>

Looks better and makes the code nicer. s390 and power should work as well as 
CONFIG_VIRT_CPU_ACCOUNTING is unaffected. 

If Frans successfully tests this patch, feel free to add

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>

Christian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 22:01                         ` Chuck Ebbert
  2007-10-04 22:31                           ` Christian Borntraeger
@ 2007-10-05 11:43                           ` Luca
  2007-10-05 15:07                           ` Frans Pop
  2 siblings, 0 replies; 21+ messages in thread
From: Luca @ 2007-10-05 11:43 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Christian Borntraeger, Frans Pop, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

On 10/5/07, Chuck Ebbert <cebbert@redhat.com> wrote:
> On 10/04/2007 05:10 PM, Christian Borntraeger wrote:
>
> > <patch reverting to 2.6.22 behavior>
>
> Alternative patch:
>
> procfs: Don't read runtime twice when computing task's stime
>
> Current code reads p->se.sum_exec_runtime twice and goes through
> multiple type conversions to calculate stime. Read it once and
> skip some of the conversions.

Works fine here.

>
> Signed-off-by: Chuck Ebbert <cebbert@redhat.com>

Tested-By: Luca Tettamanti <kronos.it@gmail.com>

Luca

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 22:01                         ` Chuck Ebbert
  2007-10-04 22:31                           ` Christian Borntraeger
  2007-10-05 11:43                           ` Luca
@ 2007-10-05 15:07                           ` Frans Pop
  2 siblings, 0 replies; 21+ messages in thread
From: Frans Pop @ 2007-10-05 15:07 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Christian Borntraeger, Luca Tettamanti, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

On Friday 05 October 2007, Chuck Ebbert wrote:
> procfs: Don't read runtime twice when computing task's stime
>
> Current code reads p->se.sum_exec_runtime twice and goes through
> multiple type conversions to calculate stime. Read it once and
> skip some of the conversions.
>
> Signed-off-by: Chuck Ebbert <cebbert@redhat.com>

This second patch is a major improvement. But both for kontact and amarok I 
still see stime decreasing occasionally. Sometimes even still quite 
frequently like in this series:

Oct 05 17:00:42 698 178
Oct 05 17:00:43 700 177
Oct 05 17:00:44 700 177
Oct 05 17:00:45 700 177
Oct 05 17:00:46 700 178
Oct 05 17:00:47 700 178
Oct 05 17:00:48 700 177
Oct 05 17:00:49 700 177
Oct 05 17:00:50 700 178
Oct 05 17:00:51 700 178
Oct 05 17:00:52 700 179
Oct 05 17:00:53 698 180
Oct 05 17:00:54 700 179
Oct 05 17:00:55 700 179
Oct 05 17:00:56 700 179
Oct 05 17:00:57 700 179
Oct 05 17:00:58 700 180
Oct 05 17:00:59 700 179

This was after 6 minutes of steady increases.
Should we try a debug patch that shows the raw data behind the calculations?

I'll give the first patch a try next.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-04 21:10                       ` [PATCH for testing] " Christian Borntraeger
  2007-10-04 22:01                         ` Chuck Ebbert
@ 2007-10-05 15:49                         ` Frans Pop
  2007-10-08 16:49                           ` Christian Borntraeger
  1 sibling, 1 reply; 21+ messages in thread
From: Frans Pop @ 2007-10-05 15:49 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Chuck Ebbert, Luca Tettamanti, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov, Ingo Molnar

On Thursday 04 October 2007, you wrote:
> Frans can you test this patch if this makes stime and utime monotic
> again?
>
> It basically reverts the rest of 
> b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa and should restore the 2.6.22
> behavior. The process time is used from tasks utime and stime instead of
> the scheduler clock. That means, in general after a long period of time,
> it is less accurate than the current time and behaves like 2.6.22.
>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Yes, this gives steady increases.
For kontact it also again shows updates only once every minute. I really 
wonder where all the other fluctuations for contact come from with the 
alternative code.

It seems to me that this patch would be the best option for 2.6.23.

Tested-by: Frans Pop <elendil@planet.nl>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-05 15:49                         ` Frans Pop
@ 2007-10-08 16:49                           ` Christian Borntraeger
  2007-10-08 17:00                             ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Borntraeger @ 2007-10-08 16:49 UTC (permalink / raw)
  To: Frans Pop, Ingo Molnar
  Cc: Chuck Ebbert, Luca Tettamanti, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov

Am Freitag, 5. Oktober 2007 schrieb Frans Pop:
> On Thursday 04 October 2007, you wrote:
> > Frans can you test this patch if this makes stime and utime monotic
> > again?
> >
> > It basically reverts the rest of 
> > b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa and should restore the 2.6.22
> > behavior. The process time is used from tasks utime and stime instead of
> > the scheduler clock. That means, in general after a long period of time,
> > it is less accurate than the current time and behaves like 2.6.22.
> >
> > Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> Yes, this gives steady increases.
> For kontact it also again shows updates only once every minute. I really 
> wonder where all the other fluctuations for contact come from with the 
> alternative code.

Please correct me, if I am wrong, but here is my guess:

I think that the new code gives actually better numbers for kontact. Kontact 
is using the cpu for very short periods, right? The old code updates utime 
and stime via sampling at each timer tick. If kontact is scheduled based on 
the timer tick(lets say timeout and a low amount of other interrupts) it will 
start shortly after a tick. If kontact now manages to return the cpu before 
the next tick, the old code would not account anything for kontact.
The new code instead, should be correct in terms of overall runtime as it 
accounts the scheduled time in ns.

Why does it still shows numbers going backwards? I guess the sampled values 
for stime and utime change in flight between task_utime and task_stime are 
called. Lets say utime will be increased. Given the same sum_exec_runtime 
that means that the result of task_stime() will get smaller at this point. 

So Chucks patch only deals with sum_exec_runtime changing.

> 
> It seems to me that this patch would be the best option for 2.6.23.

Ingo, do you have any opinion about how to proceed?

Christian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for testing] Re: Decreasing stime running confuses top
  2007-10-08 16:49                           ` Christian Borntraeger
@ 2007-10-08 17:00                             ` Ingo Molnar
  0 siblings, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2007-10-08 17:00 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Frans Pop, Chuck Ebbert, Luca Tettamanti, Willy Tarreau, LKML,
	Ilpo Järvinen, Alexander E. Patrakov


* Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> Why does it still shows numbers going backwards? I guess the sampled 
> values for stime and utime change in flight between task_utime and 
> task_stime are called. Lets say utime will be increased. Given the 
> same sum_exec_runtime that means that the result of task_stime() will 
> get smaller at this point.
> 
> So Chucks patch only deals with sum_exec_runtime changing.

basically sum_exec_runtime is split up to form a precise utime/stime, 
using the stime/utime ratio as the factor.

> > It seems to me that this patch would be the best option for 2.6.23.
> 
> Ingo, do you have any opinion about how to proceed?

the problem occurs when there's a different "split" dictated by 
p->stime/p->utime. The sum of stime+utime as reported should be 
monotonic, but the individual components may not. (the reason is that we 
have a precise "sum" for free, given by the scheduler, but we do not 
want the overhead of per-syscall timestamps to get a precise stime/utime 
numbers. So we sample p->stime and p->utime from the scheduler tick.)

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2007-10-08 17:00 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-03 12:33 top displaying 9999% CPU usage Frans Pop
2007-10-03 12:52 ` Jan Engelhardt
2007-10-03 13:03 ` Alexander E. Patrakov
2007-10-03 14:04   ` Frans Pop
2007-10-03 14:43     ` Ilpo Järvinen
2007-10-03 14:51       ` Ilpo Järvinen
2007-10-03 19:27         ` Decreasing stime running confuses top (was: top displaying 9999% CPU usage) Frans Pop
2007-10-03 20:24           ` Willy Tarreau
2007-10-03 23:32             ` Frans Pop
2007-10-04 19:19               ` Luca Tettamanti
2007-10-04 19:32                 ` Decreasing stime running confuses top Chuck Ebbert
2007-10-04 20:00                   ` Christian Borntraeger
2007-10-04 20:21                     ` Chuck Ebbert
2007-10-04 21:10                       ` [PATCH for testing] " Christian Borntraeger
2007-10-04 22:01                         ` Chuck Ebbert
2007-10-04 22:31                           ` Christian Borntraeger
2007-10-05 11:43                           ` Luca
2007-10-05 15:07                           ` Frans Pop
2007-10-05 15:49                         ` Frans Pop
2007-10-08 16:49                           ` Christian Borntraeger
2007-10-08 17:00                             ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.