From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: Xen 4 TSC problems
Date: Mon, 28 Feb 2011 15:00:47 +0000
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Olivier Hanesse, Jeremy Fitzhardinge
Cc: Dan Magenheimer, xen-devel@lists.xensource.com, Keir Fraser,
 Jan Beulich, Xen Users, Mark Adams
List-Id: xen-devel@lists.xenproject.org

The message about detecting wrapped platform timer on Xen console
indicates a host problem rather than a guest configuration problem. Did
you try running long term with changed platform timer source on Xen
command line (clocksource=pit), and also cpuidle=0?

K.

On 28/02/2011 14:37, "Olivier Hanesse" wrote:

> Hello,
>
> It happened again twice this weekend.
>
> What about setting "tsc_mode=2" for my vms? Should this mode prevent this bug
> (coming from a bad emulated tsc due to firmware issue? is it possible?) from
> affecting time in domUs?
>
> Setting clocksource=pit makes 'tsc' available in
> "/sys/devices/system/clocksource/clocksource0/available_clocksource"
> (otherwise only xen is available, is it normal?).
>
> Should I bypass xen clocksource and use tsc as a clocksource for dom0/domU?
> or will it be worse?
>
> Regards
>
> Olivier
>
> 2011/2/24 Jeremy Fitzhardinge
>> On 02/24/2011 09:43 AM, Dan Magenheimer wrote:
>>> Just a wild guess, but this in Olivier's posted output:
>>>
>>> (XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.
>>>
>>> and the fact that a 32-bit HPET wrap is ~300 seconds and, with the
>>> "10 or more times", 10 * 300 seconds is 3000 seconds, might be a clue
>>> (or a complete red herring, but I thought it worth mentioning).
>>>
>>> Mark and Olivier, it would be interesting to know if you are
>>> using the same processor/system.
>>
>> It definitely seems like some kind of problem on the host system rather
>> than anything in the guests themselves.  If the platform timer is
>> misbehaving, then Xen could be completely screwing up the pvclock
>> calibration which it then passes to guests.
>>
>> Could it be one of those "platform clock stops in certain power states"
>> problems?
>>
>>     J
>>
>>>> -----Original Message-----
>>>> From: Keir Fraser [mailto:keir.xen@gmail.com]
>>>> Sent: Thursday, February 24, 2011 7:52 AM
>>>> To: Olivier Hanesse; Jan Beulich
>>>> Cc: Mark Adams; Jeremy Fitzhardinge; xen-devel@lists.xensource.com;
>>>> Xen Users; Dan Magenheimer; Keir Fraser
>>>> Subject: Re: [Xen-devel] Xen 4 TSC problems
>>>>
>>>> On 24/02/2011 14:20, "Olivier Hanesse" wrote:
>>>>
>>>>> Both dom0 and domUs are affected by this "jump".
>>>>>
>>>>> I expect to see something like "TSC marked as reliable, warp = 0".
>>>>> I got this on newer hardware with same config/distros.
>>>> It depends on the CPU itself, older CPUs do not have the super-stable
>>>> TSC features. But that should never cause a massive 3000s time jump.
>>>>
>>>>> Is there a way to measure if it is a TSC warp ? to point out a cpu
>>>>> tsc issue ?
>>>>
>>>> The TSC warps or out-of-sync issues that we could reasonably expect
>>>> would be on the order of microseconds. A 3000s warp is something else
>>>> entirely. Xen is very confused and/or some TSC or platform timer has
>>>> jumped a long way (indicating a hardware/firmware issue).
>>>>
>>>>  -- Keir
>>>>
>>>>> 2011/2/24 Jan Beulich
>>>>>>>>> On 24.02.11 at 12:57, Olivier Hanesse wrote:
>>>>>>> I tried to turn off cstates with max_cstate=0 without success
>>>>>>> (still "not reliable").
>>>>>>>
>>>>>>> With cpuidle=0, I also got :
>>>>>>>
>>>>>>> (XEN) TSC has constant rate, deep Cstates possible, so not reliable,
>>>>>>> warp=3022 (count=1)
>>>>>> This message by itself isn't telling much I believe.
>>>>>>
>>>>>>> xm info | grep command
>>>>>>> xen_commandline        : dom0_mem=512M cpuidle=0 loglvl=all guest_loglvl=all
>>>>>>> dom0_max_vcpus=1 dom0_vcpus_pin console=vga,com1 com1=19200,8n1
>>>>>>>
>>>>>>> Keir :
>>>>>>>
>>>>>>> Using clocksource=pit :
>>>>>>>
>>>>>>> (XEN) Platform timer is 1.193MHz PIT
>>>>>>>
>>>>>>> I also got :
>>>>>>>
>>>>>>> (XEN) TSC has constant rate, deep Cstates possible, so not reliable,
>>>>>>> warp=3262 (count=2)
>>>>>> The question is whether any of this eliminates the time jumps seen
>>>>>> by your DomU-s (from your past mails I wasn't actually sure whether
>>>>>> Dom0 also experienced this problem, albeit it would be odd if it
>>>>>> didn't).
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>
>>>>
>>
>
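
For reference, the wrap arithmetic quoted in this thread (a 32-bit HPET
wrapping roughly every 300 seconds, so ten missed wraps is about 3000
seconds) can be checked with a short sketch. It assumes the HPET main
counter runs at 14.31818 MHz, a common rate that is not stated anywhere
in the thread; the rate on the affected host may differ. The function
and constant names are illustrative only and do not come from Xen.

```python
# Sketch: how long a free-running 32-bit platform counter takes to wrap.
# ASSUMPTION: an HPET rate of 14.31818 MHz (common, but not taken from
# the thread); the real rate is whatever the hardware reports.
HPET_HZ_ASSUMED = 14_318_180
COUNTER_BITS = 32


def wrap_period_seconds(freq_hz: float, bits: int = COUNTER_BITS) -> float:
    """Seconds for a free-running counter of width `bits` to wrap once."""
    return (2 ** bits) / freq_hz


def jump_seconds(wraps: int, freq_hz: float, bits: int = COUNTER_BITS) -> float:
    """Elapsed time covered by `wraps` full wraps of the counter."""
    return wraps * wrap_period_seconds(freq_hz, bits)


if __name__ == "__main__":
    period = wrap_period_seconds(HPET_HZ_ASSUMED)
    print(f"one wrap:  {period:.1f} s")
    print(f"ten wraps: {jump_seconds(10, HPET_HZ_ASSUMED):.1f} s")
```

Under that assumed rate, one wrap comes out just under 300 seconds and
ten wraps just under 3000 seconds, which matches the size of the time
jump discussed above. This only checks the arithmetic; it says nothing
about why the wraps were missed.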