* Ever increasing time offset for HVM domain / Huge amounts of drift
@ 2013-01-14 13:37 Phil Evans
  2013-01-14 15:14 ` Pasi Kärkkäinen
  2013-01-17 13:55 ` Tim Deegan
  0 siblings, 2 replies; 17+ messages in thread
From: Phil Evans @ 2013-01-14 13:37 UTC (permalink / raw)
  To: xen-devel

Hi,

I am currently running Xen 4.2.1 (this also happened in 4.2.0).  We have been having a major problem with occasionally huge amounts of clock drift in Windows VMs.  Sometimes the clock on a VM suddenly jumps by over a week (usually forwards, although time has been known to go backwards as well).

Now I don’t profess to know the internals of Xen, however through my investigation I believe I have some idea of what could be causing the problem.

To reproduce this (for me at least), simply do a manual NTP sync on a Windows VM.  Monitoring the qemu-dm log file for the VM, I see something similar to the following:

Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60

It seems to add the same number of seconds to the offset as has passed since the last sync.  The offset just keeps on increasing, eventually resulting in huge numbers equating to days.  Occasionally the offset may jump a bit and go down but the general trend is up.  Although this does not affect the VM immediately, at some point I am guessing it syncs itself with the CMOS clock (which is now a large number of seconds offset from the actual time), resulting in a huge jump in time.  A reboot is a guaranteed way to get the new, incorrect time.
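
For anyone wanting to check the arithmetic in the log excerpt above, here is a minimal Python sketch.  It assumes only the two-field line format shown in the excerpt; the names LOG, parse and is_running_total are made up for illustration and are not part of Xen or qemu-dm.  It confirms that each "set" value is the previous "set" value plus the "added" value, i.e. the offset is a running total.

```python
import re

# Log lines copied from the qemu-dm excerpt above.
LOG = """\
Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60
"""

PATTERN = re.compile(r"Time offset set (-?\d+), added offset (-?\d+)")


def parse(log):
    """Return a list of (total_offset, added_offset) integer pairs."""
    return [(int(total), int(added)) for total, added in PATTERN.findall(log)]


def is_running_total(entries):
    """True if every total equals the previous total plus the added delta."""
    return all(cur[0] == prev[0] + cur[1]
               for prev, cur in zip(entries, entries[1:]))


entries = parse(LOG)
net_added = sum(added for _, added in entries)

print(is_running_total(entries))  # True
print(net_added)                  # 1630 seconds added over these 11 syncs
```

So over these eleven syncs the offset grew by a net 1630 seconds, despite the two large jumps of -2580 and +3600 in the middle.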

Although I do not understand all of the underlying code, I presume the correct behaviour is to compare the CMOS time that has just been set with the hardware clock on the physical machine, giving an offset between the two.  This would result in a generally stable number (ideally 0).  Obviously it is incorrect behaviour for the number to keep going up.  To my mind it looks like it may somehow be getting an inaccurate time from the system (in many cases a fixed time rather than an up-to-date current time).
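
To make that hypothesis concrete, here is a toy Python model.  It is purely an illustration of the guess above, not how Xen or qemu-dm actually compute the offset; rtc_offset and simulate are hypothetical names.  If the host reference time is live, the offset stays flat; if the reference is frozen, the offset grows by exactly the time elapsed between syncs.

```python
def rtc_offset(guest_rtc_value, host_reference):
    """Offset recorded when the guest writes a new time to its RTC."""
    return guest_rtc_value - host_reference


def simulate(sync_times, host_clock):
    """Guest sets its RTC to the true time at each sync; return the offsets."""
    return [rtc_offset(t, host_clock(t)) for t in sync_times]


# NTP syncs at one-minute intervals (seconds since some arbitrary start).
syncs = [0, 60, 120, 180]

live = simulate(syncs, lambda t: t)    # host reference tracks real time
stale = simulate(syncs, lambda t: 0)   # host reference frozen at t = 0

print(live)   # [0, 0, 0, 0]
print(stale)  # [0, 60, 120, 180]
```

The second case matches the symptom of the offset increasing by the number of seconds since the last sync.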

Does anyone have any light they may be able to shed on this?  Is it possible it could be struggling to get an accurate time from the hardware?  I have checked on several occasions and both the system time and the BIOS clock are spot on.

Regards,
Phil.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 13:37 Ever increasing time offset for HVM domain / Huge amounts of drift Phil Evans
@ 2013-01-14 15:14 ` Pasi Kärkkäinen
  2013-01-14 16:17   ` Phil Evans
  2013-01-17 13:55 ` Tim Deegan
  1 sibling, 1 reply; 17+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-14 15:14 UTC (permalink / raw)
  To: Phil Evans; +Cc: xen-devel

Please paste the cfgfile of your HVM Windows.

-- Pasi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 15:14 ` Pasi Kärkkäinen
@ 2013-01-14 16:17   ` Phil Evans
  2013-01-14 16:30     ` Pasi Kärkkäinen
  0 siblings, 1 reply; 17+ messages in thread
From: Phil Evans @ 2013-01-14 16:17 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Hi,

Sorry I should have included that in the first place:

import os,re
arch = os.uname()[4]
kernel = '/usr/lib/xen-default/boot/hvmloader'
builder = 'hvm'
name = 'vm_141'
memory = '2048'
disk = ['phy:/dev/storage_node_2/disk_806,xvda,w','file:/control/isos/empty.iso,xvdd:cdrom,r']
vif = ['mac=00:16:3e:3f:2f:1a, bridge=vlan_369, vifname=vm_141.0, ip=89.238.190.22 2a02:40:501:3::2 2a02:40:501:3::5 89.238.190.88','mac=00:16:3e:67:66:71, bridge=vlan_4000, vifname=vm_141.1, ip=0.0.0.0','mac=00:16:3e:01:7e:e8, bridge=vlan_369, vifname=vm_141.2']
device_model = '/usr/lib/xen-default/bin/qemu-dm'
boot = 'cd'
vnc = 1
vncpasswd = 'YpD5aVZ8'
usbdevice = 'tablet'
acpi = 0
vcpus = 4
viridian = 1

Thanks,
Phil.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 16:17   ` Phil Evans
@ 2013-01-14 16:30     ` Pasi Kärkkäinen
  2013-01-14 16:52       ` Phil Evans
  0 siblings, 1 reply; 17+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-14 16:30 UTC (permalink / raw)
  To: Phil Evans; +Cc: xen-devel

On Mon, Jan 14, 2013 at 04:17:39PM +0000, Phil Evans wrote:
> Hi,
> 
> Sorry I should have included that in the first place:
> 

Ok. Did you try experimenting with these options?:

timer_mode=X
hpet=0|1
tsc_mode=X
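
For reference, these would go in the domain config file alongside the other settings.  The fragment below is only a sketch: the values are illustrative and are not a recommendation for this problem, and the meanings in the comments are those given in the Xen guest configuration documentation as far as I recall them, so please check them against your Xen version.

```python
# Illustrative values only; not a suggested fix for the drift in this thread.
timer_mode = 1   # 0 = delay_for_missed_ticks, 1 = no_delay_for_missed_ticks,
                 # 2 = no_missed_ticks_pending, 3 = one_missed_tick_pending
hpet = 0         # 0 = do not expose an emulated HPET to the guest, 1 = expose it
tsc_mode = 0     # 0 = default, 1 = always emulate, 2 = native, 3 = native_paravirt
```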


-- Pasi


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 16:30     ` Pasi Kärkkäinen
@ 2013-01-14 16:52       ` Phil Evans
  2013-01-14 16:54         ` Pasi Kärkkäinen
  2013-01-14 23:57         ` Dan Magenheimer
  0 siblings, 2 replies; 17+ messages in thread
From: Phil Evans @ 2013-01-14 16:52 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

I had tried messing with all but tsc_mode.  I have tried all options for tsc_mode as well now but there is no change.  The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts.  It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.

Phil.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 16:52       ` Phil Evans
@ 2013-01-14 16:54         ` Pasi Kärkkäinen
  2013-01-14 17:10           ` Phil Evans
  2013-01-14 23:57         ` Dan Magenheimer
  1 sibling, 1 reply; 17+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-14 16:54 UTC (permalink / raw)
  To: Phil Evans; +Cc: xen-devel

On Mon, Jan 14, 2013 at 04:52:31PM +0000, Phil Evans wrote:
> I had tried messing with all but tsc_mode.  I have tried all options for tsc_mode as well now but there is no change.  The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts.  It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.
>

Please don't top-post..

What's the hardware config? 

-- Pasi

 
^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 16:54         ` Pasi Kärkkäinen
@ 2013-01-14 17:10           ` Phil Evans
  0 siblings, 0 replies; 17+ messages in thread
From: Phil Evans @ 2013-01-14 17:10 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

> On Mon, Jan 14, 2013 at 04:52:31PM +0000, Phil Evans wrote:
>> I had tried messing with all but tsc_mode.  I have tried all options for tsc_mode as well now but there is no change.  The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts.  It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.
>>
>
> Please don't top-post..
>
> What's the hardware config? 
>
> -- Pasi
>
>  

The boxes are Dell PowerEdge R420s with 128GB RAM and dual 8-core Intel
Xeon E5-2450s @ 2.1GHz.  Here's xm info:

host                   : node5
release                : 3.2.34
version                : #2 SMP Wed Dec 5 12:29:46 GMT 2012
machine                : x86_64
nr_cpus                : 32
nr_nodes               : 2
cores_per_socket       : 8
threads_per_core       : 2
cpu_mhz                : 2100
hw_caps                : bfebfbff:2c100800:00000000:00003f40:13bee3ff:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 130994
free_memory            : 119056
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=8192M cpufreq=xen dom0_max_vcpus=4 dom0_vcpus_pin xsave=off
cc_compiler            : gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
cc_compile_by          : mockbuild
cc_compile_domain      : crc.id.au
cc_compile_date        : Wed Dec 19 01:32:40 EST 2012
xend_config_format     : 4
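
As a quick cross-check of the output above against the hardware description, here is a small Python sketch that parses "key : value" lines of this shape.  parse_xm_info and SAMPLE are hypothetical names used only for this illustration, and the sample is a subset of the fields listed above.

```python
def parse_xm_info(text):
    """Parse 'key : value' lines into a dict, splitting on the first colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Subset of the xm info fields shown above.
SAMPLE = """\
nr_cpus                : 32
nr_nodes               : 2
cores_per_socket       : 8
threads_per_core       : 2
xen_major              : 4
xen_minor              : 2
xen_extra              : .1
"""

info = parse_xm_info(SAMPLE)

# Reassemble the hypervisor version string from its three parts.
version = f"{info['xen_major']}.{info['xen_minor']}{info['xen_extra']}"

# Logical CPUs divided by (cores per socket * threads per core) gives sockets.
sockets = int(info["nr_cpus"]) // (
    int(info["cores_per_socket"]) * int(info["threads_per_core"])
)

print(version)  # 4.2.1
print(sockets)  # 2
```

That agrees with Xen 4.2.1 on a dual-socket, 8-core, hyperthreaded box.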

Phil.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 16:52       ` Phil Evans
  2013-01-14 16:54         ` Pasi Kärkkäinen
@ 2013-01-14 23:57         ` Dan Magenheimer
  1 sibling, 0 replies; 17+ messages in thread
From: Dan Magenheimer @ 2013-01-14 23:57 UTC (permalink / raw)
  To: Phil Evans, Pasi Kärkkäinen; +Cc: xen-devel

> From: Phil Evans [mailto:Phil.Evans@m247.com]
> Sent: Monday, January 14, 2013 9:53 AM
> To: Pasi Kärkkäinen
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift
> 
> I had tried messing with all but tsc_mode.  I have tried all options for tsc_mode as well now but
> there is no change.  The key thing here is that the drift is increasing at a rate of 1 second per
> second of uptime amounting to huge amounts.  It doesn't seem to be a clock skew issue, it seems to be
> something is simply not counting at all.

Just a wild idea... What Windows version are you running?  Historically,
I think, Windows has ignored TSC i.e. never issues a rdtsc nor writes
to TSC.  I wonder if the kernel of whatever version of Windows you are
running (or the NTP sync somehow with the blessing of the kernel) _is_
checking certain hardware settings, then writing to the guest's (virtual)
TSC and this might cause things to get very confused.

P.S. Avoiding top-posting can be difficult and annoying if
you are running some versions of Outlook as your mail client.
Google it to see how, if you'd like to avoid complaints
on this list.

> From: Pasi Kärkkäinen [pasik@iki.fi]
> Sent: 14 January 2013 16:30
> To: Phil Evans
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift
> 
> On Mon, Jan 14, 2013 at 04:17:39PM +0000, Phil Evans wrote:
> > Hi,
> >
> > Sorry I should have included that in the first place:
> >
> 
> Ok. Did you try experimenting with these options?:
> 
> timer_mode=X
> hpet=0|1
> tsc_mode=X
> 
> 
> -- Pasi
> 
> 
> > import os,re
> > arch = os.uname()[4]
> > kernel = '/usr/lib/xen-default/boot/hvmloader'
> > builder = 'hvm'
> > name = 'vm_141'
> > memory = '2048'
> > disk = ['phy:/dev/storage_node_2/disk_806,xvda,w','file:/control/isos/empty.iso,xvdd:cdrom,r']
> > vif = ['mac=00:16:3e:3f:2f:1a, bridge=vlan_369, vifname=vm_141.0, ip=89.238.190.22 2a02:40:501:3::2 2a02:40:501:3::5 89.238.190.88','mac=00:16:3e:67:66:71, bridge=vlan_4000, vifname=vm_141.1, ip=0.0.0.0','mac=00:16:3e:01:7e:e8, bridge=vlan_369, vifname=vm_141.2']
> > device_model = '/usr/lib/xen-default/bin/qemu-dm'
> > boot = 'cd'
> > vnc = 1
> > vncpasswd = 'YpD5aVZ8'
> > usbdevice = 'tablet'
> > acpi = 0
> > vcpus = 4
> > viridian = 1
> >
> > Thanks,
> > Phil.
> > ________________________________________
> > From: Pasi Kärkkäinen [pasik@iki.fi]
> > Sent: 14 January 2013 15:14
> > To: Phil Evans
> > Cc: xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift
> >
> > On Mon, Jan 14, 2013 at 01:37:01PM +0000, Phil Evans wrote:
> > > Hi,
> > >
> > > I am currently running Xen 4.2.1 (this has also happened in 4.2.0 as well).  We have been having a
> > > major problem with sometimes huge amounts of clock drift in Windows VMs.  Sometimes the clock on a VM
> > > could suddenly jump by over a week (usually forwards, however time has been known to go backwards as
> > > well).
> > >
> > > Now I don't profess to know the internals of Xen, however through my investigation I believe I
> > > have a degree of knowledge of what could be causing the problem.
> > >
> > > The steps to reproduce this (for me at least), is to simply do a manual NTP sync on a Windows VM.
> > > Upon monitoring the qemu-dm log file for the VM, I see similar to the following:
> > >
> > > Time offset set 489, added offset 480
> > > Time offset set 436, added offset -53
> > > Time offset set 496, added offset 60
> > > Time offset set 494, added offset -2
> > > Time offset set 554, added offset 60
> > > Time offset set 565, added offset 11
> > > Time offset set 606, added offset 41
> > > Time offset set -1974, added offset -2580
> > > Time offset set 1626, added offset 3600
> > > Time offset set 1579, added offset -47
> > > Time offset set 1639, added offset 60
> > >
> > > It seems to add the same number of seconds to the offset as has passed since the last sync.  The
> > > offset just keeps on increasing, eventually resulting in huge numbers equating to days.  Occasionally
> > > the offset may jump a bit and go down but the general trend is up.  Although this does not affect the
> > > VM immediately, at some point I am guessing it syncs itself with the CMOS clock (which is now a large
> > > number of seconds offset from the actual time), resulting in a huge jump in time.  A reboot is a
> > > guaranteed way to get the new, incorrect time.
> > >
> > > Although I do not understand all of the underlying code, I presume the correct way this should
> > > work is it should be comparing the CMOS time that's just been set with the hardware clock on the
> > > physical machine, resulting in an offset between the two.  This would result in a generally stable
> > > number (ideally 0).  Obviously it is incorrect behaviour for the number to keep going up.  To my mind
> > > it looks like it may be somehow getting an inaccurate time from the system (in many cases a fixed time
> > > rather than an up-to-date current time).
> > >
> > > Does anyone have any light they may be able to shed on this?  Is it possible it could be
> > > struggling to get an accurate time from the hardware?  I have checked on several occasions and both
> > > the system time and the BIOS clock are spot on.
> > >
> >
> > Please paste the cfgfile of your HVM Windows.
> >
> > -- Pasi
> >
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

* Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-14 13:37 Ever increasing time offset for HVM domain / Huge amounts of drift Phil Evans
  2013-01-14 15:14 ` Pasi Kärkkäinen
@ 2013-01-17 13:55 ` Tim Deegan
  2013-01-17 16:41   ` [PATCH] " Tim Deegan
  1 sibling, 1 reply; 17+ messages in thread
From: Tim Deegan @ 2013-01-17 13:55 UTC (permalink / raw)
  To: Phil Evans; +Cc: Yang Zhang, Jan Beulich, xen-devel

Hi, 

At 13:37 +0000 on 14 Jan (1358170621), Phil Evans wrote:
> I am currently running Xen 4.2.1 (this has also happened in 4.2.0 as
> well).  We have been having a major problem with sometimes huge
> amounts of clock drift in Windows VMs.  Sometimes the clock on a VM
> could suddenly jump by over a week (usually forwards, however time has
> been known to go backwards as well).
> 
> The steps to reproduce this (for me at least), is to simply do a
> manual NTP sync on a Windows VM.  Upon monitoring the qemu-dm log file
> for the VM, I see similar to the following:

Which version of Windows are you using for this?
Did you see this on older (4.1.x) Xen versions?

> Time offset set 489, added offset 480
> Time offset set 436, added offset -53
> Time offset set 496, added offset 60
> Time offset set 494, added offset -2
> Time offset set 554, added offset 60
> Time offset set 565, added offset 11
> Time offset set 606, added offset 41
> Time offset set -1974, added offset -2580
> Time offset set 1626, added offset 3600
> Time offset set 1579, added offset -47
> Time offset set 1639, added offset 60
> 
> It seems to add the same number of seconds to the offset as has passed
> since the last sync.

This printout is from some code that gets given a _change_ in time
offset from Xen; it prints out the new value and the change, so they
should always add up like that.
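
A minimal sketch of that consistency check, in Python.  It assumes the
log lines look exactly like the ones quoted above; `check_offsets` is a
hypothetical helper for illustration, not part of Xen or qemu-dm:

```python
import re

# Log lines as quoted in the report above.
LOG = """\
Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60
"""

PATTERN = re.compile(r"Time offset set (-?\d+), added offset (-?\d+)")

def check_offsets(log_text):
    """Parse (new_total, delta) pairs and verify each total is the
    previous total plus the reported delta."""
    pairs = [(int(total), int(delta)) for total, delta in PATTERN.findall(log_text)]
    for (prev_total, _), (total, delta) in zip(pairs, pairs[1:]):
        assert prev_total + delta == total, (prev_total, delta, total)
    return pairs

pairs = check_offsets(LOG)
# Net change in offset between the first and last logged line, in seconds.
net_drift = pairs[-1][0] - pairs[0][0]
print(len(pairs), "lines consistent; net drift", net_drift, "seconds")
```

So the arithmetic in the log is self-consistent; the question is why
the deltas are mostly positive.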

But yes, it's striking that the VM is (mostly) drifting forward over
time. 

>  The offset just keeps on increasing, eventually
> resulting in huge numbers equating to days.  Occasionally the offset
> may jump a bit and go down but the general trend is up.  Although this
> does not affect the VM immediately, at some point I am guessing it
> syncs itself with the CMOS clock (which is now a large number of
> seconds offset from the actual time), resulting in a huge jump in
> time.  A reboot is a guaranteed way to get the new, incorrect time.

That makes sense; if the RTC is being set to the wrong time, a reboot
will copy the error into the new OS time. 

> Although I do not understand all of the underlying code, I presume the
> correct way this should work is it should be comparing the CMOS time
> that's just been set with the hardware clock on the physical machine,
> resulting in an offset between the two.

More or less.  In fact IIRC it compares it with current CMOS time, and
propagates the difference into an offset from hardware-clock.
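
Roughly, as a sketch in Python (hypothetical names, times as plain
seconds; this is an illustration of the idea, not the Xen source):

```python
def apply_rtc_write(offset_seconds, cmos_before, cmos_after):
    """Fold the change the guest made to the CMOS time into the
    domain's offset from the hardware clock.

    cmos_before: what the emulated RTC believed the time was
    cmos_after:  the time the guest has just written
    """
    delta = cmos_after - cmos_before
    return offset_seconds + delta, delta

# Example: the emulated RTC reads 1000 s and the guest sets it to 1060 s.
offset, delta = apply_rtc_write(489, 1000, 1060)
print(f"Time offset set {offset}, added offset {delta}")
```

Note that in this scheme any error in the 'before' value goes straight
into the offset, and it accumulates across writes.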

It's possible that the code to calculate 'current CMOS time' for a VM is
buggy -- that was changed in 4.2.  Cc'ing the people who touched that
code in 4.2 for their opinions.

Tim.

* [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-17 13:55 ` Tim Deegan
@ 2013-01-17 16:41   ` Tim Deegan
  2013-01-17 17:02     ` Jan Beulich
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Tim Deegan @ 2013-01-17 16:41 UTC (permalink / raw)
  To: Phil Evans; +Cc: Yang Zhang, Jan Beulich, xen-devel

[-- Attachment #1: Type: text/plain, Size: 455 bytes --]

At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> It's possible that the code to calculate 'current CMOS time' for a VM is
> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
> code in 4.2 for their opinions.

In fact I think it's the code that handles CMOS writes.  The attached
patch fixes the issue for me; can you try it on your system?

(Jan, this is a candidate for applying to 4.2 as well as unstable.)

Cheers,

Tim.
 

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1193 bytes --]

x86/hvm: fix RTC setting.

When the guest writes one field of the RTC time, we must bring all the
other fields up to date for the current second before calculating the
new RTC time.

Signed-off-by: Tim Deegan <tim@xen.org> 

diff -r d81f9832a082 xen/arch/x86/hvm/rtc.c
--- a/xen/arch/x86/hvm/rtc.c	Thu Jan 17 15:55:02 2013 +0000
+++ b/xen/arch/x86/hvm/rtc.c	Thu Jan 17 16:31:09 2013 +0000
@@ -399,10 +399,17 @@ static int rtc_ioport_write(void *opaque
     case RTC_DAY_OF_MONTH:
     case RTC_MONTH:
     case RTC_YEAR:
-        s->hw.cmos_data[s->hw.cmos_index] = data;
-        /* if in set mode, do not update the time */
-        if ( !(s->hw.cmos_data[RTC_REG_B] & RTC_SET) )
+        /* if in set mode, just write the register */
+        if ( (s->hw.cmos_data[RTC_REG_B] & RTC_SET) )
+            s->hw.cmos_data[s->hw.cmos_index] = data;
+        else
+        {
+            /* Fetch the current time and update just this field. */
+            s->current_tm = gmtime(get_localtime(d));
+            rtc_copy_date(s);
+            s->hw.cmos_data[s->hw.cmos_index] = data;
             rtc_set_time(s);
+        }
         alarm_timer_update(s);
         break;
     case RTC_REG_A:

* [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-17 16:41   ` [PATCH] " Tim Deegan
@ 2013-01-17 17:02     ` Jan Beulich
  2013-01-17 17:13       ` Tim Deegan
  2013-01-18  1:45     ` Zhang, Yang Z
  2013-01-18 11:17     ` Phil Evans
  2 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2013-01-17 17:02 UTC (permalink / raw)
  To: Phil Evans, Tim Deegan; +Cc: Yang Zhang, xen-devel

>>> On 17.01.13 at 17:41, Tim Deegan <tim@xen.org> wrote:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
>> code in 4.2 for their opinions.
> 
> In fact I think it's the code that handles CMOS writes.  The attached
> patch fixes the issue for me; can you try it on your system?

Looks plausible, so feel free to put my ack on it if it also helps
Phil.

> (Jan, this is a candidate for applying to 4.2 as well as unstable.)

I agree, but the code you touch here hasn't been changed for
years (and even the involved helper functions changed only
marginally between 4.1 and 4.2), so wouldn't that have been
a problem for much longer? In which case it ought to also go
into 4.1?

Jan

* Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-17 17:02     ` Jan Beulich
@ 2013-01-17 17:13       ` Tim Deegan
  0 siblings, 0 replies; 17+ messages in thread
From: Tim Deegan @ 2013-01-17 17:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Yang Zhang, Phil Evans, xen-devel

At 17:02 +0000 on 17 Jan (1358442159), Jan Beulich wrote:
> >>> On 17.01.13 at 17:41, Tim Deegan <tim@xen.org> wrote:
> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> >> It's possible that the code to calculate 'current CMOS time' for a VM is
> >> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
> >> code in 4.2 for their opinions.
> > 
> > In fact I think it's the code that handles CMOS writes.  The attached
> > patch fixes the issue for me; can you try it on your system?
> 
> Looks plausible, so feel free to put my ack on it if it also helps
> Phil.
> 
> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
> 
> I agree, but the code you touch here hasn't been changed for
> years (and even the involved helper functions changed only
> marginally between 4.1 and 4.2), so wouldn't that have been
> a problem for much longer? In which case it ought to also go
> into 4.1?

The bug was introduced in 24974:6deb0b626f3f, when the 1-second timer
that used to keep those registers up-to-date was removed; AFAICT that's
only in 4.2.
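
To illustrate the effect, here is a deliberately simplified toy model
in Python.  All names are hypothetical, the guest is assumed to have no
offset to begin with, and the sign and details differ in the real code;
it only shows why stale registers turn a single-field write into a
time error:

```python
import calendar
import time

FIELDS = ("year", "mon", "mday", "hour", "min", "sec")

def to_fields(epoch):
    """Split a UTC epoch time into RTC-like date/time fields."""
    t = time.gmtime(epoch)
    values = (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)
    return dict(zip(FIELDS, values))

def to_epoch(regs):
    """Combine the fields back into a UTC epoch time."""
    return calendar.timegm(tuple(regs[f] for f in FIELDS))

def write_field(regs, host_now, field, value, refresh_first):
    """Guest writes one field; return the resulting error versus host time."""
    if refresh_first:
        regs = to_fields(host_now)  # bring every field up to date first
    else:
        regs = dict(regs)           # registers as last updated, possibly stale
    regs[field] = value
    return to_epoch(regs) - host_now

t0 = calendar.timegm((2013, 1, 17, 16, 30, 15))
stale_regs = to_fields(t0)          # registers last refreshed at t0
t1 = t0 + 60                        # a minute passes with no refresh
sec_now = time.gmtime(t1).tm_sec    # guest writes the correct seconds value

err_stale = write_field(stale_regs, t1, "sec", sec_now, refresh_first=False)
err_fresh = write_field(stale_regs, t1, "sec", sec_now, refresh_first=True)
print("error without refresh:", err_stale, "with refresh:", err_fresh)
```

Without the refresh, the minutes field is a minute out of date, so the
write is off by the time elapsed since the registers were last updated.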

Tim.

* Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-17 16:41   ` [PATCH] " Tim Deegan
  2013-01-17 17:02     ` Jan Beulich
@ 2013-01-18  1:45     ` Zhang, Yang Z
  2013-01-18 11:17     ` Phil Evans
  2 siblings, 0 replies; 17+ messages in thread
From: Zhang, Yang Z @ 2013-01-18  1:45 UTC (permalink / raw)
  To: Tim Deegan, Phil Evans; +Cc: Jan Beulich, xen-devel

Tim Deegan wrote on 2013-01-18:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
>> code in 4.2 for their opinions.
> 
> In fact I think it's the code that handles CMOS writes.  The attached
> patch fixes the issue for me; can you try it on your system?
Right. The current CMOS time must be refreshed before writing.

Best regards,
Yang

* Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-17 16:41   ` [PATCH] " Tim Deegan
  2013-01-17 17:02     ` Jan Beulich
  2013-01-18  1:45     ` Zhang, Yang Z
@ 2013-01-18 11:17     ` Phil Evans
  2013-01-24 13:38       ` Pasi Kärkkäinen
  2 siblings, 1 reply; 17+ messages in thread
From: Phil Evans @ 2013-01-18 11:17 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Yang Zhang, Jan Beulich, xen-devel

On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
>> code in 4.2 for their opinions.
> 
> In fact I think it's the code that handles CMOS writes.  The attached
> patch fixes the issue for me; can you try it on your system?

I have tried the patch and it fixes the issue for me as well.

> (Jan, this is a candidate for applying to 4.2 as well as unstable.)
> 
> Cheers,
> 
> Tim.

* Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-18 11:17     ` Phil Evans
@ 2013-01-24 13:38       ` Pasi Kärkkäinen
  2013-01-24 13:56         ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-24 13:38 UTC (permalink / raw)
  To: Phil Evans; +Cc: Yang Zhang, Tim Deegan, Jan Beulich, xen-devel

On Fri, Jan 18, 2013 at 11:17:14AM +0000, Phil Evans wrote:
> On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> >> It's possible that the code to calculate 'current CMOS time' for a VM is
> >> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
> >> code in 4.2 for their opinions.
> > 
> > In fact I think it's the code that handles CMOS writes.  The attached
> > patch fixes the issue for me; can you try it on your system?
> 
> I have tried the patch and it fixes the issue for me as well.
> 
> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
> >

Hello,

Jan: Is this patch in your list to apply to xen-4.2-testing.hg as well?

-- Pasi

* Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
  2013-01-24 13:38       ` Pasi Kärkkäinen
@ 2013-01-24 13:56         ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2013-01-24 13:56 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: Yang Zhang, Tim Deegan, Phil Evans, xen-devel

>>> On 24.01.13 at 14:38, Pasi Kärkkäinen<pasik@iki.fi> wrote:
> On Fri, Jan 18, 2013 at 11:17:14AM +0000, Phil Evans wrote:
>> On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
>> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> >> It's possible that the code to calculate 'current CMOS time' for a VM is
>> >> buggy -- that was changed in 4.2.  Cc'ing the people who touched that
>> >> code in 4.2 for their opinions.
>> > 
>> > In fact I think it's the code that handles CMOS writes.  The attached
>> > patch fixes the issue for me; can you try it on your system?
>> 
>> I have tried the patch and it fixes the issue for me as well.
>> 
>> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
>> >
> 
> Jan: Is this patch in your list to apply to xen-4.2-testing.hg as well?

Yes, of course. I merely want this to go through our 4.2 based
trees before pushing it out, as there's no rush currently.

Jan

* Ever increasing time offset for HVM domain / Huge amounts of drift
@ 2013-01-10 18:27 Phil Evans
  0 siblings, 0 replies; 17+ messages in thread
From: Phil Evans @ 2013-01-10 18:27 UTC (permalink / raw)
  To: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2445 bytes --]

Hi,

I am currently running Xen 4.2.1 (this has also happened in 4.2.0 as well).  We have been having a major problem with sometimes huge amounts of clock drift in Windows VMs.  Sometimes the clock on a VM could suddenly jump by over a week (usually forwards, however time has been known to go backwards as well).

Now I don't profess to know the internals of Xen, however through my investigation I believe I have a degree of knowledge of what could be causing the problem.

The steps to reproduce this (for me at least), is to simply do a manual NTP sync on a Windows VM.  Upon monitoring the qemu-dm log file for the VM, I see similar to the following:

Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60

It seems to add the same number of seconds to the offset as has passed since the last sync.  The offset just keeps on increasing, eventually resulting in huge numbers equating to days.  Occasionally the offset may jump a bit and go down but the general trend is up.  Although this does not affect the VM immediately, at some point I am guessing it syncs itself with the CMOS clock (which is now a large number of seconds offset from the actual time), resulting in a huge jump in time.  A reboot is a guaranteed way to get the new, incorrect time.

Although I do not understand all of the underlying code, I presume the correct way this should work is it should be comparing the CMOS time that's just been set with the hardware clock on the physical machine, resulting in an offset between the two.  This would result in a generally stable number (ideally 0).  Obviously it is incorrect behaviour for the number to keep going up.  To my mind it looks like it may be somehow getting an inaccurate time from the system (in many cases a fixed time rather than an up-to-date current time).

Does anyone have any light they may be able to shed on this?  Is it possible it could be struggling to get an accurate time from the hardware?  I have checked on several occasions and both the system time and the BIOS clock are spot on.

Regards,
Phil.

[-- Attachment #1.2: Type: text/html, Size: 5270 bytes --]

Thread overview: 17+ messages
2013-01-14 13:37 Ever increasing time offset for HVM domain / Huge amounts of drift Phil Evans
2013-01-14 15:14 ` Pasi Kärkkäinen
2013-01-14 16:17   ` Phil Evans
2013-01-14 16:30     ` Pasi Kärkkäinen
2013-01-14 16:52       ` Phil Evans
2013-01-14 16:54         ` Pasi Kärkkäinen
2013-01-14 17:10           ` Phil Evans
2013-01-14 23:57         ` Dan Magenheimer
2013-01-17 13:55 ` Tim Deegan
2013-01-17 16:41   ` [PATCH] " Tim Deegan
2013-01-17 17:02     ` Jan Beulich
2013-01-17 17:13       ` Tim Deegan
2013-01-18  1:45     ` Zhang, Yang Z
2013-01-18 11:17     ` Phil Evans
2013-01-24 13:38       ` Pasi Kärkkäinen
2013-01-24 13:56         ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2013-01-10 18:27 Phil Evans