From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jed Smith
Subject: Re: Runaway real/sys time in newer paravirt domUs?
Date: Fri, 9 Jul 2010 10:57:59 -0400
Message-ID:
References: <4C337E5D.7040407@goop.org> <053B0147-FC71-4440-896E-1A37E6A4CEEA@linode.com> <4C36FDED020000780000A6AA@vpn.id2.novell.com>
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <4C36FDED020000780000A6AA@vpn.id2.novell.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Jul 9, 2010, at 4:46 AM, Jan Beulich wrote:
> Is this perhaps also dependent on the CPU make/model?

I have a mostly-homogenous environment, but I'll see what I can do.  I have
one box in mind, actually...

Since I spoke with Jeremy yesterday, I've become suspicious that the issue
I am reporting here is likely the same that I've seen for some time.

A quick background:

In our domU environment, Munin incorrectly reports idle time.  All Linode
domUs have 4 VCPUs, so user/sys travel freely to 400%.  Idle, however, has
always reported 800% under paravirtualized kernels.

This led me to investigate a bit, and this is what I deduced:

  - In /proc/uptime, idle time outruns system uptime significantly.
  - This ratio is seemingly affected by the number of VCPUs the domU is
    configured for.  With only one VCPU, the ratio is roughly 2.0, leading
    me to think that idle ticks are counted twice per-VCPU.
  - The ratio between idle/system is inconclusive after a lengthy uptime.
  - In a 2.6.29 domU, two things happen:
    - a) The original bug does not manifest after 50 attempts.
    - b) In /proc/uptime, idle time is always precisely 0.00.  It never
         counts.
  - Between 2.6.29 and 2.6.30, /proc/uptime behaves much differently, and
    the bug then exposes itself.  Something changed there.

I have seen the /proc/uptime behavior on i386.  In fact, a personal domU:

10:51 jsmith@undertow% cat /proc/uptime
1984022.43 2954870.51
10:51 jsmith@undertow% awk '{printf("%f\n", $2 / $1)}' /proc/uptime
1.489342
10:51 jsmith@undertow% uname -a
Linux undertow.jedsmith.org 2.6.32.12-linode25 #1 SMP Wed Apr 28 19:25:11 UTC 2010 i686 GNU/Linux

I am not sure if the fact that I can only make the original bug appear on
x86_64 is a red herring.  Maybe the timer {ov,und}erflow display is
different depending on word size, and this is all the same issue?

I will see if I can try this on a vastly different CPU later today.

Regards,

Jed Smith
Systems Administrator
Linode, LLC
+1 (609) 593-7103 x1209
jed@linode.com
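
For anyone who wants to repeat the /proc/uptime comparison on other domUs,
here is a minimal sketch in Python of the same idle/uptime ratio the awk
one-liner computes, with an optional division by the VCPU count.  The
function names (parse_uptime, idle_ratio) and the ncpus parameter are
hypothetical, chosen only for illustration.  Reading a per-VCPU ratio above
1.0 as a sign of over-counted idle time is an assumption based on the
observations above, not a confirmed diagnosis.

```python
def parse_uptime(text):
    """Return (uptime_seconds, idle_seconds) from the text of /proc/uptime."""
    # /proc/uptime holds two whitespace-separated decimal fields.
    uptime_s, idle_s = (float(field) for field in text.split()[:2])
    return uptime_s, idle_s


def idle_ratio(text, ncpus=1):
    """Idle time divided by (uptime * ncpus).

    With ncpus=1 this is the same $2 / $1 ratio as the awk one-liner.
    Dividing by the VCPU count is an illustrative assumption: it gives a
    per-VCPU figure that should not exceed 1.0 if idle time is counted
    once per VCPU.
    """
    uptime_s, idle_s = parse_uptime(text)
    if uptime_s <= 0 or ncpus < 1:
        raise ValueError("uptime must be positive and ncpus at least 1")
    return idle_s / (uptime_s * ncpus)


# Sample values copied from the i386 domU output in this message.
sample = "1984022.43 2954870.51\n"
print("idle/uptime ratio: %.4f" % idle_ratio(sample))
```

On the sample line this gives roughly 1.4893, in line with the awk result
(which was taken from a slightly later read of the file).  To run it against
a live domU, pass the contents of /proc/uptime and the configured VCPU count
instead of the sample string.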