Runaway real/sys time in newer paravirt domUs?

* Runaway real/sys time in newer paravirt domUs?
@ 2010-07-06 16:32 Jed Smith
  2010-07-06 19:05 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 9+ messages in thread
From: Jed Smith @ 2010-07-06 16:32 UTC (permalink / raw)
  To: xen-devel

Good morning,

We've had a few reports from domU customers[1] - confirmed by myself - that CPU
time accounting is very inaccurate in certain circumstances.  This issue seems
to be limited to x86_64 domUs, starting around the 2.6.32 family (but I can't be
sure of that).

The symptoms of the flaw include top reporting hours and days of CPU consumed by
a task which has been running for mere seconds of wall time, as well as the
time(1) utility reporting hundreds of years in some cases.  Contra-indicatively,
the /proc/stat timers on all four VCPUs increment at roughly the expected rate.
Needless to say, this is puzzling.

A test case which highlights the failure has been brought to our attention by
Ævar Arnfjörð Bjarmason, which is a simple Perl script[2] that forks and
executes numerous dig(1) processes.  At the end of his script, time(1) reports
268659840m0.951s of user and 38524003m13.072s of system time consumed.  I am
able to confirm this demonstration using:

 - Xen 3.4.1 on dom0 2.6.18.8-931-2
 - Debian Lenny on domU 2.6.32.12-x86_64-linode12 [3]

Running Ævar's test case looks like this, in that domU:

> real 0m30.741s
> user 307399002m50.773s
> sys 46724m44.192s

However, a quick busyloop in Python seems to report the correct time:

> li21-66:~# cat doit.py 
> for i in xrange(10000000):
>  a = i ** 5
>
> li21-66:~# time python doit.py
>
> real	0m16.600s
> user	0m16.593s
> sys	0m0.006s

I rebooted the domU, and the problem no longer exists.  It seems to be transient
in nature, and difficult to isolate.  /proc/stat seems to increment normally:

> li21-66:/proc# cat stat | grep "cpu " && sleep 1 && cat stat | grep "cpu "
> cpu  3742 0 1560 700180 1326 0 27 1282 0
> cpu  3742 0 1562 700983 1326 0 27 1282 0

I'm not sure where to begin with this one - any thoughts?

 [1]: http://www.linode.com/forums/viewtopic.php?p=30715
 [2]: git://gist.github.com/449825.git
 [3]: Source: http://www.linode.com/src/2.6.32.12-x86_64-linode12.tar.bz2
      Config: http://jedsmith.org/tmp/2.6.32.12-x86_64-linode12.txt

Thanks for the assistance,

Jed Smith
Systems Administrator
Linode, LLC
+1 (609) 593-7103 x1209
jed@linode.com

^ permalink raw reply	[flat|nested] 9+ messages in thread