* Timekeeping on ARM guests/hosts
@ 2018-10-09 23:39 Miriam Zimmerman
  2018-10-10 10:01 ` Marc Zyngier
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-10-09 23:39 UTC (permalink / raw)
  To: kvmarm


Hi,

I'm working with an ARM device hosting an ARM guest. When the host is
suspended, guest time stops advancing and it doesn't get adjusted on resume.

For an x86 machine, the CONFIG_KVM_GUEST flag would enable paravirt for
time and fix this problem, but CONFIG_KVM_GUEST isn't available on ARM.

Is there a configuration option to enable paravirtualized timekeeping on
ARM? If not, how can I configure ARM guests to handle timekeeping properly?

Thanks,
Miriam

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


* Re: Timekeeping on ARM guests/hosts
  2018-10-09 23:39 Timekeeping on ARM guests/hosts Miriam Zimmerman
@ 2018-10-10 10:01 ` Marc Zyngier
  2018-10-10 18:38   ` Miriam Zimmerman
  0 siblings, 1 reply; 21+ messages in thread
From: Marc Zyngier @ 2018-10-10 10:01 UTC (permalink / raw)
  To: Miriam Zimmerman, kvmarm

Hi Miriam,

On 10/10/18 00:39, Miriam Zimmerman wrote:
> Hi,
> 
> I'm working with an ARM device hosting an ARM guest. When the host is 
> suspended, guest time stops advancing and it doesn't get adjusted on resume.

I know the feeling, my arm64 laptop gives me that kind of grief all the 
time... :-/

> For an x86 machine, the CONFIG_KVM_GUEST flag would enable paravirt for 
> time and fix this problem, but CONFIG_KVM_GUEST isn't available on ARM.
> 
> Is there a configuration option to enable paravirtualized timekeeping on 
> ARM? If not, how can I configure ARM guests to handle timekeeping properly?

PV time (or rather stolen time) is a work in progress at the moment, and 
Christoffer has his hands in that particular pie.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: Timekeeping on ARM guests/hosts
  2018-10-10 10:01 ` Marc Zyngier
@ 2018-10-10 18:38   ` Miriam Zimmerman
  2018-10-11  7:54     ` Marc Zyngier
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-10-10 18:38 UTC (permalink / raw)
  To: marc.zyngier; +Cc: kvmarm

(oops, sorry for lack of plaintext in the first email. must've
forgotten to click the button in my email client)

Until that happens, what's the best workaround? Just running an ntp
daemon in guest?


* Re: Timekeeping on ARM guests/hosts
  2018-10-10 18:38   ` Miriam Zimmerman
@ 2018-10-11  7:54     ` Marc Zyngier
  2018-10-11 15:21       ` Laszlo Ersek
  0 siblings, 1 reply; 21+ messages in thread
From: Marc Zyngier @ 2018-10-11  7:54 UTC (permalink / raw)
  To: Miriam Zimmerman; +Cc: kvmarm

Hi Miriam,

On Wed, 10 Oct 2018 19:38:47 +0100,
Miriam Zimmerman <mutexlox@google.com> wrote:
> 
> (oops, sorry for lack of plaintext in the first email. must've
> forgotten to click the button in my email client)
> 
> Until that happens, what's the best workaround? Just running an ntp
> daemon in guest?

Christoffer reminded me yesterday that stolen time accounting only
affects scheduling, and is not evaluated for

An NTP daemon may not be the best course of action, as the guest is
going to see a massive jump anyway, which most NTP implementations are
not designed to handle (they rightly assume that something else is
wrong). It would also mean that you'd have to run an NTP server
somewhere on the host, as you cannot always assume full connectivity.

A popular way to solve this seems to be using the QEMU guest agent,
but I must admit I never really investigated that side of the problem.

I'm quite curious about how this is done on x86 though. KVM_GUEST mostly
seems to give the guest a PV clocksource, which is not going to help in
terms of wall clock. Do you have any idea?

Thanks,

	M.


* Re: Timekeeping on ARM guests/hosts
  2018-10-11  7:54     ` Marc Zyngier
@ 2018-10-11 15:21       ` Laszlo Ersek
  2018-10-11 18:40         ` Miriam Zimmerman
  0 siblings, 1 reply; 21+ messages in thread
From: Laszlo Ersek @ 2018-10-11 15:21 UTC (permalink / raw)
  To: Marc Zyngier, Miriam Zimmerman; +Cc: kvmarm

On 10/11/18 09:54, Marc Zyngier wrote:
> A popular way to solve this seems to be using the QEMU guest agent,
> but I must admit I never really investigated that side of the problem.

The guest agent method is documented here, for example:

https://git.qemu.org/?p=qemu.git;a=blob;f=qga/qapi-schema.json;h=dfbc4a5e32bde4070f12497c23973c604accfa7d;hb=v3.0.0#l128

and IIRC it is exposed (for example) via "virsh domtime" to the libvirt
user (or to higher level mgmt tools).

I suspect though that the guest agent method might introduce the same
kind of jump to the guest clock.

> I'm quite curious of how this is done on x86 though. KVM_GUEST mostly
> seems to give the guest a PV clocksource, which is not going to help in
> terms of wall clock. Do you have any idea?

I've seen this question raised, specifically wrt. x86, with people
closing their laptops' lids, and their guests losing correct track of
time. AIUI, there is no easy answer. (I was surprised to see Miriam's
initial statement that CONFIG_KVM_GUEST had solved it.) Some references:

https://bugs.launchpad.net/qemu/+bug/1174654
https://bugzilla.redhat.com/show_bug.cgi?id=1352992
https://bugzilla.redhat.com/show_bug.cgi?id=1380893

I'll spare you the verbatim quoting of the emails that I produced back
then :) ; a summary of workarounds is:

* Before you suspend the host, suspend the guest first. This way the
guest will not be surprised when it sees the physical clock (= whatever
it thinks is a physical clock) jump. Another benefit is that, if the
host fails to resume for some reason, data loss on the VM disks should
be reasonably unlikely, because when the guest suspends, it will flush
its stuff first.

* Use "-rtc clock=vm" on the QEMU command line. (Equivalently, use
<timer name='rtc' track='guest'/> in the libvirt domain XML.) See the
QEMU manual, and the libvirt domain XML manual on these. Those settings
decouple the guest's RTC from the host's time, bringing both benefits
(no jumps in guest time) and drawbacks (the timelines diverge).

* Also, I've heard rumors that libvirtd might put a suspend inhibitor in
place (on the host) while some VMs are running. ("Suspend inhibitor" is
a SystemD term, I think.) Not sure how/if that works in practice; either
way it would solve the issue from a different perspective (namely, you
couldn't suspend the host).


Obviously I'm not trying to speak on this with any kind of "authority",
so take it FWIW. I happen to be a fan of the first option (manual guest
suspend).

Thanks,
Laszlo


* Re: Timekeeping on ARM guests/hosts
  2018-10-11 15:21       ` Laszlo Ersek
@ 2018-10-11 18:40         ` Miriam Zimmerman
  2018-10-31 16:41           ` Steven Price
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-10-11 18:40 UTC (permalink / raw)
  To: lersek; +Cc: marc.zyngier, kvmarm

On Thu, Oct 11, 2018 at 8:21 AM Laszlo Ersek <lersek@redhat.com> wrote:
>
> I've seen this question raised, specifically wrt. x86, with people
> closing their laptops' lids, and their guests losing correct track of
> time. AIUI, there is no easy answer. (I was surprised to see Miriam's
> initial statement that CONFIG_KVM_GUEST had solved it.) Some references:

Interesting; I haven't dug too much into the specifics of how the
timekeeping works, but I just did a quick experiment: I took two
laptops (one ARM and one x86) next to each other, ran "date" in VMs in
both, closed them for a few minutes, then reopened them and ran "date"
again. The x86 laptop had the correct time, whereas the ARM laptop
guest had (approximately) the same time as when I closed it.
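
To make an experiment like this a bit more precise than comparing
"date" output by eye, a small program run inside each guest could
record several clocks before and after the lid is closed. The following
is only a sketch: the struct and function names are made up for
illustration, and CLOCK_BOOTTIME is Linux-specific.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Values of three guest clocks, in nanoseconds (illustrative struct). */
struct clock_snapshot {
    int64_t realtime_ns;   /* wall clock, what "date" reports */
    int64_t monotonic_ns;  /* monotonic time, not settable */
    int64_t boottime_ns;   /* like monotonic, but also counts guest suspend */
};

/* Read one clock into *out; returns 0 on success, -1 on failure. */
int read_clock_ns(clockid_t id, int64_t *out)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    *out = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    return 0;
}

/* Fill a snapshot with the current value of each clock. */
int take_snapshot(struct clock_snapshot *s)
{
    if (read_clock_ns(CLOCK_REALTIME, &s->realtime_ns) != 0)
        return -1;
    if (read_clock_ns(CLOCK_MONOTONIC, &s->monotonic_ns) != 0)
        return -1;
    if (read_clock_ns(CLOCK_BOOTTIME, &s->boottime_ns) != 0)
        return -1;
    return 0;
}

/* Print how far each clock advanced between two snapshots. */
void print_delta(const struct clock_snapshot *before,
                 const struct clock_snapshot *after)
{
    printf("realtime  advanced %lld ns\n",
           (long long)(after->realtime_ns - before->realtime_ns));
    printf("monotonic advanced %lld ns\n",
           (long long)(after->monotonic_ns - before->monotonic_ns));
    printf("boottime  advanced %lld ns\n",
           (long long)(after->boottime_ns - before->boottime_ns));
}
```

Taking one snapshot before the host suspends and one after it resumes,
then calling print_delta(), shows which of the guest's clocks noticed
the gap.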

I'm guessing this behavior is implemented in either
arch/x86/kernel/kvmclock.c or arch/x86/kernel/pvclock.c, but I'll
confess that I've only skimmed those.

I'll investigate how this works on x86 a bit. My plan had been to
work around this by using a guest agent that receives the correct
wallclock time on resume and adjusts the VM's clock as appropriate, but
the suspend option seems like a pretty good idea.


* Re: Timekeeping on ARM guests/hosts
  2018-10-11 18:40         ` Miriam Zimmerman
@ 2018-10-31 16:41           ` Steven Price
  2018-10-31 18:49             ` Miriam Zimmerman
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Price @ 2018-10-31 16:41 UTC (permalink / raw)
  To: Miriam Zimmerman, lersek; +Cc: marc.zyngier, kvmarm

On 11/10/2018 19:40, Miriam Zimmerman wrote:
> Interesting; I haven't dug too much into the specifics of how the
> timekeeping works, but I just did a quick experiment: I took two
> laptops (one ARM and one x86) next to each other, ran "date" in VMs in
> both, closed them for a few minutes, then reopened them and ran "date"
> again. The x86 laptop had the correct time, whereas the ARM laptop
> guest had (approximately) the same time as when I closed it.

Interesting, since I see the opposite behaviour with kvmtool and its
pause/resume commands. On x86 (without CONFIG_KVM_GUEST), time (from
'date') freezes, whereas on Arm it skips ahead.

CONFIG_KVM_GUEST fixes this by providing the guest with information on
the host time and informing it when the guest is paused (see
MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.

On Arm, as far as I know, the guest's view of time is purely from the
virtual counter. Since nothing saves/restores this during the pause, the
counter continues to increment and the jump in time is visible to the guest.

Whether the guest sees time progress depends on what happens to that
counter during suspend. kvmtool pause/resume simply prevents the vCPU
threads from continuing, but the system counter is still running.

If you save the state of a VM to a file then the counter value is
saved/restored so the guest won't see any change in time.

The Arm ARM doesn't say a great deal about power saving modes so I
wouldn't be surprised if there's differing behaviour as to whether the
system clock is stopped during suspend modes. Indeed I wonder if the
clock can even go backwards during a suspend-to-disk/resume cycle? I
don't have hardware handy to test this.

I was thinking about changing the Arm behaviour to save/restore the
value of the system clock during host suspend which would at least make
the behaviour consistent.

Also related is that Arm doesn't have PV support for getting the clock
from the host. x86 has MSR_KVM_WALL_CLOCK_NEW to report the wall clock
time from the host.

Steve


* Re: Timekeeping on ARM guests/hosts
  2018-10-31 16:41           ` Steven Price
@ 2018-10-31 18:49             ` Miriam Zimmerman
  2018-11-02 14:34               ` Christoffer Dall
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-10-31 18:49 UTC (permalink / raw)
  To: steven.price; +Cc: marc.zyngier, lersek, kvmarm

On Wed, Oct 31, 2018 at 9:41 AM Steven Price <steven.price@arm.com> wrote:
>
> On 11/10/2018 19:40, Miriam Zimmerman wrote:
> > Interesting; I haven't dug too much into the specifics of how the
> > timekeeping works, but I just did a quick experiment: I took two
> > laptops (one ARM and one x86) next to each other, ran "date" in VMs in
> > both, closed them for a few minutes, then reopened them and ran "date"
> > again. The x86 laptop had the correct time, whereas the ARM laptop
> > guest had (approximately) the same time as when I closed it.
>
> Interesting, since I see the opposite behaviour with kvmtool and
> pause/resume commands. When running on x86 time (from 'date') freezes on
> x86 (without CONFIG_KVM_GUEST), but skips time on Arm.

I would expect that x86 time would freeze without CONFIG_KVM_GUEST,
but it's surprising to me that Arm catches up without it.

> CONFIG_KVM_GUEST fixes this by providing the guest with information on
> the host time and informs it when the guest is paused (see
> MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
>
> On Arm, as far as I know, the guest's view of time is purely from the
> virtual counter. Since nothing saves/restores this during the pause, the
> counter continues to increment and the jump in time is visible to the guest.

What is this virtual counter? How can one configure or enable it?

My understanding is that the guest kernel currently has roughly no
awareness that it's inside a VM, so it tracks time on the assumption
that it is running natively and is unaware that the host ever suspends.
(It just doesn't see any CPU time then.)

>
> Whether the guest sees time progress depends on what happens to that
> counter during suspend. kvmtool pause/resume simply prevents the vCPU
> threads from continuing, but the system counter is still running.
>
> If you save the state of a VM to a file then the counter value is
> saved/restored so the guest won't see any change in time.

I don't believe that we're explicitly saving any state to a file on
suspend right now, though I could be wrong; I haven't looked into that
codepath yet.

> The Arm ARM doesn't say a great deal about power saving modes so I
> wouldn't be surprised if there's differing behaviour as to whether the
> system clock is stopped during suspend modes. Indeed I wonder if the
> clock can even go backwards during a suspend-to-disk/resume cycle? I
> don't have hardware handy to test this.

The specific hardware I have in front of me is a Samsung Chromebook
Plus (board codename "kevin"), which I believe has an RK3399
processor. (More info at
https://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices)

>
> I was thinking about changing the Arm behaviour to save/restore the
> value of the system clock during host suspend which would at least make
> the behaviour consistent.

I assume during resume as well? :-)

I don't suppose you have any idea of what kernel version you might be
able to get this change into?

Thanks!
Miriam


* Re: Timekeeping on ARM guests/hosts
  2018-10-31 18:49             ` Miriam Zimmerman
@ 2018-11-02 14:34               ` Christoffer Dall
  2018-11-02 18:28                 ` Miriam Zimmerman
  2018-11-03  0:19                 ` Bijan Mottahedeh
  0 siblings, 2 replies; 21+ messages in thread
From: Christoffer Dall @ 2018-11-02 14:34 UTC (permalink / raw)
  To: Miriam Zimmerman; +Cc: marc.zyngier, lersek, kvmarm, steven.price

On Wed, Oct 31, 2018 at 11:49:05AM -0700, Miriam Zimmerman wrote:

[...]

> 
> > CONFIG_KVM_GUEST fixes this by providing the guest with information on
> > the host time and informs it when the guest is paused (see
> > MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
> >
> > On Arm, as far as I know, the guest's view of time is purely from the
> > virtual counter. Since nothing saves/restores this during the pause, the
> > counter continues to increment and the jump in time is visible to the guest.
> What is this virtual counter? How can one configure or enable it?

The 'virtual counter' is an architectural concept in the Arm
architecture, not a virtual (in the software VM sense) counter.

The Arm Generic Timer Architecture defines two counters:
 - The physical counter, accessed via CNTPCT_EL0
 - The virtual counter, accessed via CNTVCT_EL0

And a number of timers.  All timers are associated with the physical
counter, except the 'EL1 Virtual Timer' which uses the count in the
virtual counter for its comparison.

The virtual counter yields the value of the physical counter minus an
offset.

That offset is controlled by a hypervisor running at EL2 which can
program the offset value in CNTVOFF_EL2.
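
The relationship between the two counters can be written down as a few
lines of C. This is a plain arithmetic model, not code that touches the
real registers; the function names are illustrative, and the counter
frequency (which real hardware reports in CNTFRQ_EL0, a register not
mentioned above) is passed in as a parameter.

```c
#include <stdint.h>

/*
 * Model of a CNTVCT_EL0 read: the physical count minus the offset the
 * hypervisor programmed into CNTVOFF_EL2. Unsigned arithmetic wraps
 * modulo 2^64, matching a 64-bit counter.
 */
uint64_t virtual_count(uint64_t cntpct, uint64_t cntvoff)
{
    return cntpct - cntvoff;
}

/*
 * Convert counter ticks to nanoseconds for a counter running at
 * freq_hz. Splitting into whole seconds and a remainder avoids
 * overflowing 64 bits for frequencies up to a few GHz.
 */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t secs = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;

    return secs * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}
```

With an assumed 24 MHz counter, for example, an offset of 24000000
ticks makes the guest's virtual count lag the physical count by exactly
one second.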

The basic idea is that a VM can use the virtual timer/counter to keep
track of some notion of virtual time, and the physical timer/counter to
keep track of something that always relates to wall-clock time.

This all breaks quite horribly with migration where the physical counter
is not meaningful across physical systems, and where the counter
frequencies can vary.

But we can use the offset for the virtual counter to communicate some
other measure of time to VMs.

> 
> My understanding of what's going on now is that the guest kernel right
> now has roughly no awareness that it's inside of a VM, so it just
> tracks time using the assumption that it is native and is unaware that
> the host ever suspends. (It just doesn't see any CPU time then.)
> 
> >
> > Whether the guest sees time progress depends on what happens to that
> > counter during suspend. kvmtool pause/resume simply prevents the vCPU
> > threads from continuing, but the system counter is still running.
> >
> > If you save the state of a VM to a file then the counter value is
> > saved/restored so the guest won't see any change in time.
> I don't believe that we're explicitly saving any state to a file on
> suspend right now, though I could be wrong; I haven't looked into that
> codepath yet.

The key is whether the userspace program that controls the KVM VM
(kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
VM view of virtual time, and to restore that at a later time.

KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
represent the value written by userspace to the VM when the VM reads
CNTVCT_EL0.
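
As a rough sketch of what that looks like from userspace: strictly
speaking KVM_REG_ARM_TIMER_CNT is a register ID that is passed to the
KVM_GET_ONE_REG and KVM_SET_ONE_REG vCPU ioctls, not an ioctl of its
own.  The helper names below are hypothetical, and the register ID is
taken as a parameter so that the snippet also compiles on non-Arm hosts;
on arm64 the caller would pass KVM_REG_ARM_TIMER_CNT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Read one 64-bit register from a vCPU file descriptor.
 * Returns 0 on success, -1 on error (errno set by ioctl). */
int vcpu_get_reg64(int vcpu_fd, uint64_t reg_id, uint64_t *value)
{
    assert(value != NULL);
    struct kvm_one_reg reg = {
        .id   = reg_id,
        .addr = (uint64_t)(uintptr_t)value, /* KVM writes the result here */
    };
    return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}

/* Write one 64-bit register to a vCPU file descriptor.
 * Returns 0 on success, -1 on error (errno set by ioctl). */
int vcpu_set_reg64(int vcpu_fd, uint64_t reg_id, uint64_t value)
{
    struct kvm_one_reg reg = {
        .id   = reg_id,
        .addr = (uint64_t)(uintptr_t)&value, /* KVM reads the value from here */
    };
    return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
```

A VMM would call vcpu_get_reg64() with KVM_REG_ARM_TIMER_CNT when the VM
stops, keep the returned count, and call vcpu_set_reg64() with that same
count just before the VM runs again.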

> > The Arm ARM doesn't say a great deal about power saving modes so I
> > wouldn't be surprised if there's differing behaviour as to whether the
> > system clock is stopped during suspend modes. Indeed I wonder if the
> > clock can even go backwards during a suspend-to-disk/resume cycle? I
> > don't have hardware handy to test this.
> The specific hardware I have in front of me is a Samsung Chromebook
> Plus (board codename "kevin"), which I believe has an RK3399
> processor. (More info at
> https://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices)
> 

For the purpose of timekeeping in KVM, we need an architecturally
meaningful solution, not something specific to any device.


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-02 14:34               ` Christoffer Dall
@ 2018-11-02 18:28                 ` Miriam Zimmerman
  2018-11-02 21:23                   ` Miriam Zimmerman
  2018-11-03  0:19                 ` Bijan Mottahedeh
  1 sibling, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-11-02 18:28 UTC (permalink / raw)
  To: christoffer.dall; +Cc: marc.zyngier, lersek, kvmarm, steven.price

On Fri, Nov 2, 2018 at 7:34 AM Christoffer Dall
<christoffer.dall@arm.com> wrote:
>
> On Wed, Oct 31, 2018 at 11:49:05AM -0700, Miriam Zimmerman wrote:
>
> [...]
>
> >
> > > CONFIG_KVM_GUEST fixes this by providing the guest with information on
> > > the host time and informs it when the guest is paused (see
> > > MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
> > >
> > > On Arm, as far as I know, the guest's view of time is purely from the
> > > virtual counter. Since nothing saves/restores this during the pause, the
> > > counter continues to increment and the jump in time is visible to the guest.
> > What is this virtual counter? How can one configure or enable it?
>
> The 'virtual counter' is an architectural concept in the Arm
> architecture, not a virtual (in the software VM sense) counter.
>
> The Arm Generic Timer Architecture defines two counters:
>  - The physical counter, accessed via CNTPCT_EL0
>  - The virtual counter, accessed via CNTVCT_EL0
>
> And a number of timers.  All timers are associated with the physical
> counter, except the 'EL1 Virtual Timer' which uses the count in the
> virtual counter for its comparison.
>
> The virtual counter yields the value of the physical counter minus an
> offset.
>
> That offset is controlled by a hypervisor running at EL2 which can
> program the offset value in CNTVOFF_EL2.
>
> The basic idea is that a VM can use the virtual timer/counter to keep
> track of some notion of virtual time, and the physical timer/counter to
> keep track of something that always relates to wall-clock time.
>
> This all breaks quite horribly with migration where the physical counter
> is not meaningful across physical systems, and where the counter
> frequencies can vary.
>
> But we can use the offset for the virtual counter to communicate some
> other measure of time to VMs.

That makes sense. Thanks for the primer! I haven't done any ARM kernel
work before, so this is all great to know.

> > My understanding of what's going on now is that the guest kernel right
> > now has roughly no awareness that it's inside of a VM, so it just
> > tracks time using the assumption that it is native and is unaware that
> > the host ever suspends. (It just doesn't see any CPU time then.)
> >
> > >
> > > Whether the guest sees time progress depends on what happens to that
> > > counter during suspend. kvmtool pause/resume simply prevents the vCPU
> > > threads from continuing, but the system counter is still running.
> > >
> > > If you save the state of a VM to a file then the counter value is
> > > saved/restored so the guest won't see any change in time.
> > I don't believe that we're explicitly saving any state to a file on
> > suspend right now, though I could be wrong; I haven't looked into that
> > codepath yet.
>
> The key is whether the userspace program that controls the KVM VM
> (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> VM view of virtual time, and to restore that at a later time.
>
> KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
> represent the value written by userspace to the VM when the VM reads
> CNTVCT_EL0.
>

Ah hah. I haven't looked yet, but I bet this is the missing piece.
Thanks very much, Christoffer!

>
> Thanks,
>
>     Christoffer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-02 18:28                 ` Miriam Zimmerman
@ 2018-11-02 21:23                   ` Miriam Zimmerman
  2018-11-06  7:45                     ` Christoffer Dall
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-11-02 21:23 UTC (permalink / raw)
  To: christoffer.dall; +Cc: marc.zyngier, lersek, kvmarm, steven.price

In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
("target-arm: kvm: Differentiate registers based on write-back
levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
to save time. Under what circumstances should this be saved in order
to provide a consistent view of wall clock time (as given by `date` in
the VM)?  The commit refers to 'machine initialization or on vmload
operations', but I'm having difficulty figuring out what a vmload
operation is on ARM. Does this include resume-from-sleep/suspend?

On Fri, Nov 2, 2018 at 11:28 AM Miriam Zimmerman <mutexlox@google.com> wrote:
>
> On Fri, Nov 2, 2018 at 7:34 AM Christoffer Dall
> <christoffer.dall@arm.com> wrote:
> >
> > On Wed, Oct 31, 2018 at 11:49:05AM -0700, Miriam Zimmerman wrote:
> >
> > [...]
> >
> > >
> > > > CONFIG_KVM_GUEST fixes this by providing the guest with information on
> > > > the host time and informs it when the guest is paused (see
> > > > MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
> > > >
> > > > On Arm, as far as I know, the guest's view of time is purely from the
> > > > virtual counter. Since nothing saves/restores this during the pause, the
> > > > counter continues to increment and the jump in time is visible to the guest.
> > > What is this virtual counter? How can one configure or enable it?
> >
> > The 'virtual counter' is an architectural concept in the Arm
> > architecture, not a virtual (in the software VM sense) counter.
> >
> > The Arm Generic Timer Architecture defines two counters:
> >  - The physical counter, accessed via CNTPCT_EL0
> >  - The virtual counter, accessed via CNTVCT_EL0
> >
> > And a number of timers.  All timers are associated with the physical
> > counter, except the 'EL1 Virtual Timer' which uses the count in the
> > virtual counter for its comparison.
> >
> > The virtual counter yields the value of the physical counter minus an
> > offset.
> >
> > That offset is controlled by a hypervisor running at EL2 which can
> > program the offset value in CNTVOFF_EL2.
> >
> > The basic idea is that a VM can use the virtual timer/counter to keep
> > track of some notion of virtual time, and the physical timer/counter to
> > keep track of something that always relates to wall-clock time.
> >
> > This all breaks quite horribly with migration where the physical counter
> > is not meaningful across physical systems, and where the counter
> > frequencies can vary.
> >
> > But we can use the offset for the virtual counter to communicate some
> > other measure of time to VMs.
>
> That makes sense. Thanks for the primer! I haven't done any ARM kernel
> work before, so this is all great to know.
>
> > > My understanding of what's going on now is that the guest kernel right
> > > now has roughly no awareness that it's inside of a VM, so it just
> > > tracks time using the assumption that it is native and is unaware that
> > > the host ever suspends. (It just doesn't see any CPU time then.)
> > >
> > > >
> > > > Whether the guest sees time progress depends on what happens to that
> > > > counter during suspend. kvmtool pause/resume simply prevents the vCPU
> > > > threads from continuing, but the system counter is still running.
> > > >
> > > > If you save the state of a VM to a file then the counter value is
> > > > saved/restored so the guest won't see any change in time.
> > > I don't believe that we're explicitly saving any state to a file on
> > > suspend right now, though I could be wrong; I haven't looked into that
> > > codepath yet.
> >
> > The key is whether the userspace program that controls the KVM VM
> > (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> > VM view of virtual time, and to restore that at a later time.
> >
> > KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
> > represent the value written by userspace to the VM when the VM reads
> > CNTVCT_EL0.
> >
>
> Ah hah. I haven't looked yet, but I bet this is the missing piece.
> Thanks very much, Christoffer!
>
> >
> > Thanks,
> >
> >     Christoffer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-02 14:34               ` Christoffer Dall
  2018-11-02 18:28                 ` Miriam Zimmerman
@ 2018-11-03  0:19                 ` Bijan Mottahedeh
  2018-11-06  7:49                   ` Christoffer Dall
  1 sibling, 1 reply; 21+ messages in thread
From: Bijan Mottahedeh @ 2018-11-03  0:19 UTC (permalink / raw)
  To: Christoffer Dall, Miriam Zimmerman
  Cc: marc.zyngier, lersek, kvmarm, steven.price

On 11/2/2018 7:34 AM, Christoffer Dall wrote:
> The key is whether the userspace program that controls the KVM VM
> (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> VM view of virtual time, and to restore that at a later time.
>
> KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
> represent the value written by userspace to the VM when the VM reads
> CNTVCT_EL0.
>
>>> The Arm ARM doesn't say a great deal about power saving modes so I
>>> wouldn't be surprised if there's differing behaviour as to whether the
>>> system clock is stopped during suspend modes. Indeed I wonder if the
>>> clock can even go backwards during a suspend-to-disk/resume cycle? I
>>> don't have hardware handy to test this.
>> The specific hardware I have in front of me is a Samsung Chromebook
>> Plus (board codename "kevin"), which I believe has an RK3399
>> processor. (More info at
>> https://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices)
>>
> For the purpose of timekeeping in KVM, we need an architecturally
> meaningful solution, not something specific to any device.

I've been working on a QEMU patch to address a problem observed when an 
active guest is paused and resumed after a certain delay with virsh or 
the QEMU monitor.

A simple test to reproduce the problem executes one or more instances of 
the following command in the guest:

dd if=/dev/zero of=/dev/null &

and then pauses and resumes the guest after a certain delay:

virsh suspend <guest>        # pauses the guest
sleep 120
virsh resume <guest>

After the guest is resumed, there are soft lockup warning messages 
displayed on the console.

A comparison with x86 shows that hwclock and date values diverge after 
the above pause and resume sequence for x86 but remain the same for Arm.

The patch accumulates the total guest pause time in QEMU and adjusts the 
virtual offset counter accordingly with the KVM_REG_ARM_TIMER_CNT ioctl 
before the guest is resumed.  With the patch the time behavior is the 
same as x86 and the soft lockup messages go away.
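
The accounting idea can be sketched in plain C as below.  This is not
the patch itself, only a model of "accumulate pause time, then subtract
it from what the guest sees"; the struct and function names are
hypothetical, time is measured in counter ticks, and the guest's initial
offset is assumed to be zero.

```c
#include <assert.h>
#include <stdint.h>

struct pause_tracker {
    uint64_t paused_at;    /* host counter value when the guest was paused */
    uint64_t total_paused; /* ticks spent paused, summed over all pauses */
    int      is_paused;
};

/* Record the host counter value at the moment the guest is paused. */
void tracker_pause(struct pause_tracker *t, uint64_t host_now)
{
    assert(!t->is_paused);
    t->paused_at = host_now;
    t->is_paused = 1;
}

/* Add the length of the pause that just ended to the running total. */
void tracker_resume(struct pause_tracker *t, uint64_t host_now)
{
    assert(t->is_paused);
    t->total_paused += host_now - t->paused_at;
    t->is_paused = 0;
}

/* Count the guest should observe: host time minus all paused time. */
uint64_t tracker_guest_count(const struct pause_tracker *t, uint64_t host_now)
{
    return host_now - t->total_paused;
}
```

For a guest paused from tick 100 to tick 220, tracker_guest_count() at
tick 220 returns 100, so the guest continues from where it stopped.  A
real VMM would write that value with KVM_REG_ARM_TIMER_CNT before
resuming the vCPUs.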

I've tested the patch on an Ampere eMag server but I'm not sure how 
complete, generic, and backward compatible of a solution the patch is in 
terms of other Arm platforms.

Also, I'm not sure if and when this patch would be superseded by the 
proposal from your KVM Forum 2018 presentation:

Paravirtualized Time for Arm-based Systems
https://developer.arm.com/docs/den0057/a

Would it make sense to send the patch as an RFC for evaluation at this 
point or do you suggest any other considerations?

Thanks.

--bijan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-02 21:23                   ` Miriam Zimmerman
@ 2018-11-06  7:45                     ` Christoffer Dall
  2018-11-06 13:39                       ` Alex Bennée
  2018-11-06 18:37                       ` Miriam Zimmerman
  0 siblings, 2 replies; 21+ messages in thread
From: Christoffer Dall @ 2018-11-06  7:45 UTC (permalink / raw)
  To: Miriam Zimmerman; +Cc: marc.zyngier, lersek, steven.price, kvmarm

On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> ("target-arm: kvm: Differentiate registers based on write-back
> levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> to save time. Under what circumstances should this be saved in order
> to provide a consistent view of wall clock time (as given by `date` in
> the VM)? 

In general, and not specific to QEMU, I think that the virtual
counter value should stop counting when the entirety of the VM is not
running, for example when the host machine is suspended, or when the
entire VM is stopped/suspended, either as part of a suspend/resume
operation, debug operation, or as part of migration of some sort.

Supporting these timekeeping semantics is not something anyone has tried
up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
work.


> The commit refers to 'machine initialization or on vmload
> operations', but I'm having difficulty figuring out what a vmload
> operation is on ARM. Does this include resume-from-sleep/suspend?

I believe vmload is qemu-speak for loading in VM state from a stored
migration stream, but you'd have to ask the QEMU folks or study the code
more carefully to figure out when a vmload really happens.

Thanks,

    Christoffer

> On Fri, Nov 2, 2018 at 11:28 AM Miriam Zimmerman <mutexlox@google.com> wrote:
> >
> > On Fri, Nov 2, 2018 at 7:34 AM Christoffer Dall
> > <christoffer.dall@arm.com> wrote:
> > >
> > > On Wed, Oct 31, 2018 at 11:49:05AM -0700, Miriam Zimmerman wrote:
> > >
> > > [...]
> > >
> > > >
> > > > > CONFIG_KVM_GUEST fixes this by providing the guest with information on
> > > > > the host time and informs it when the guest is paused (see
> > > > > MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
> > > > >
> > > > > On Arm, as far as I know, the guest's view of time is purely from the
> > > > > virtual counter. Since nothing saves/restores this during the pause, the
> > > > > counter continues to increment and the jump in time is visible to the guest.
> > > > What is this virtual counter? How can one configure or enable it?
> > >
> > > The 'virtual counter' is an architectural concept in the Arm
> > > architecture, not a virtual (in the software VM sense) counter.
> > >
> > > The Arm Generic Timer Architecture defines two counters:
> > >  - The physical counter, accessed via CNTPCT_EL0
> > >  - The virtual counter, accessed via CNTVCT_EL0
> > >
> > > And a number of timers.  All timers are associated with the physical
> > > counter, except the 'EL1 Virtual Timer' which uses the count in the
> > > virtual counter for its comparison.
> > >
> > > The virtual counter yields the value of the physical counter minus an
> > > offset.
> > >
> > > That offset is controlled by a hypervisor running at EL2 which can
> > > program the offset value in CNTVOFF_EL2.
> > >
> > > The basic idea is that a VM can use the virtual timer/counter to keep
> > > track of some notion of virtual time, and the physical timer/counter to
> > > keep track of something that always relates to wall-clock time.
> > >
> > > This all breaks quite horribly with migration where the physical counter
> > > is not meaningful across physical systems, and where the counter
> > > frequencies can vary.
> > >
> > > But we can use the offset for the virtual counter to communicate some
> > > other measure of time to VMs.
> >
> > That makes sense. Thanks for the primer! I haven't done any ARM kernel
> > work before, so this is all great to know.
> >
> > > > My understanding of what's going on now is that the guest kernel right
> > > > now has roughly no awareness that it's inside of a VM, so it just
> > > > tracks time using the assumption that it is native and is unaware that
> > > > the host ever suspends. (It just doesn't see any CPU time then.)
> > > >
> > > > >
> > > > > Whether the guest sees time progress depends on what happens to that
> > > > > counter during suspend. kvmtool pause/resume simply prevents the vCPU
> > > > > threads from continuing, but the system counter is still running.
> > > > >
> > > > > If you save the state of a VM to a file then the counter value is
> > > > > saved/restored so the guest won't see any change in time.
> > > > I don't believe that we're explicitly saving any state to a file on
> > > > suspend right now, though I could be wrong; I haven't looked into that
> > > > codepath yet.
> > >
> > > The key is whether the userspace program that controls the KVM VM
> > > (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> > > VM view of virtual time, and to restore that at a later time.
> > >
> > > KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
> > > represent the value written by userspace to the VM when the VM reads
> > > CNTVCT_EL0.
> > >
> >
> > Ah hah. I haven't looked yet, but I bet this is the missing piece.
> > Thanks very much, Christoffer!
> >
> > >
> > > Thanks,
> > >
> > >     Christoffer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-03  0:19                 ` Bijan Mottahedeh
@ 2018-11-06  7:49                   ` Christoffer Dall
  0 siblings, 0 replies; 21+ messages in thread
From: Christoffer Dall @ 2018-11-06  7:49 UTC (permalink / raw)
  To: Bijan Mottahedeh
  Cc: marc.zyngier, steven.price, lersek, Miriam Zimmerman, kvmarm

On Fri, Nov 02, 2018 at 05:19:20PM -0700, Bijan Mottahedeh wrote:
> On 11/2/2018 7:34 AM, Christoffer Dall wrote:
> >The key is whether the userspace program that controls the KVM VM
> >(kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> >VM view of virtual time, and to restore that at a later time.
> >
> >KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
> >represent the value written by userspace to the VM when the VM reads
> >CNTVCT_EL0.
> >
> >>>The Arm ARM doesn't say a great deal about power saving modes so I
> >>>wouldn't be surprised if there's differing behaviour as to whether the
> >>>system clock is stopped during suspend modes. Indeed I wonder if the
> >>>clock can even go backwards during a suspend-to-disk/resume cycle? I
> >>>don't have hardware handy to test this.
> >>The specific hardware I have in front of me is a Samsung Chromebook
> >>Plus (board codename "kevin"), which I believe has an RK3399
> >>processor. (More info at
> >>https://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices)
> >>
> >For the purpose of timekeeping in KVM, we need an architecturally
> >meaningful solution, not something specific to any device.
> 
> I've been working on a QEMU patch to address a problem observed when an
> active guest is paused and resumed after a certain delay with virsh or the
> QEMU monitor.
> 
> A simple test to reproduce the problem executes one or more instances of the
> following command in the guest:
> 
> dd if=/dev/zero of=/dev/null &
> 
> and then pauses and resumes the guest after a certain delay:
> 
> virsh suspend <guest>        # pauses the guest
> sleep 120
> virsh resume <guest>
> 
> After the guest is resumed, there are soft lockup warning messages displayed
> on the console.
> 
> A comparison with x86 shows that hwclock and date values diverge after the
> above pause and resume sequence for x86 but remain the same for Arm.
> 
> The patch accumulates the total guest pause time in QEMU and adjusts the
> virtual offset counter accordingly with the KVM_REG_ARM_TIMER_CNT ioctl
> before the guest is resumed.  With the patch the time behavior is the same
> as x86 and the soft lockup messages go away.
> 
> I've tested the patch on an Ampere eMag server but I'm not sure how
> complete, generic, and backward compatible of a solution the patch is in
> terms of other Arm platforms.

Without having looked at the patch, that sounds to me like it would work
on all Arm implementations (ignoring hardware bugs around the
counter/time, if any should exist).

> 
> Also, I'm not sure if and when this patch would be superseded by the
> proposal from your KVM Forum 2018 presentation:
> 
> Paravirtualized Time for Arm-based Systems
> https://developer.arm.com/docs/den0057/a

I think that patch would be part of the work to support the semantics
in the PV Timer spec.  This is part of what userspace would have to do
to communicate 'Live Physical Time (LPT)' to the guest, by telling KVM
to preserve a previous value of LPT, which it in turn does by adjusting
the offset in CNTVOFF_EL2 (which effectively accounts for 'paused' time
- in PV Timer spec terms).

> 
> Would it make sense to send the patch as an RFC for evaluation at this point
> or do you suggest any other considerations?
> 

Definitely that would make sense.

We need additional work to account for suspended time, either via hooks
in userspace or via hooks directly in the arch timer in KVM, and all of
the PV features come on top.


Thanks,

    Christoffer

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-06  7:45                     ` Christoffer Dall
@ 2018-11-06 13:39                       ` Alex Bennée
  2018-11-06 18:37                       ` Miriam Zimmerman
  1 sibling, 0 replies; 21+ messages in thread
From: Alex Bennée @ 2018-11-06 13:39 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Miriam Zimmerman, marc.zyngier, steven.price, lersek, kvmarm


Christoffer Dall <christoffer.dall@arm.com> writes:

> On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
>> In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
>> ("target-arm: kvm: Differentiate registers based on write-back
>> levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
>> to save time. Under what circumstances should this be saved in order
>> to provide a consistent view of wall clock time (as given by `date` in
>> the VM)?
>
> In general, and not specific to QEMU, I think that the virtual
> counter value should stop counting when the entirety of the VM is not
> running, for example when the host machine is suspended, or when the
> entire VM is stopped/suspended, either as part of a suspend/resume
> operation, debug operation, or as part of migration of some sort.
>
> Supporting these timekeeping semantics is not something anyone has tried
> up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> work.
>
>
>> The commit refers to 'machine initialization or on vmload
>> operations', but I'm having difficulty figuring out what a vmload
>> operation is on ARM. Does this include resume-from-sleep/suspend?
>
> I believe vmload is qemu-speak for loading in VM state from a stored
> migration stream, but you'd have to ask the QEMU folks or study the code
> more carefully to figure out when a vmload really happens.

Generally QEMU doesn't care about the internal state of the guest unless
it has to. When it does it calls:

  cpu_synchronize_state

Which will then iterate through the registers and copy the kernel's view
of register state into QEMU. Sometimes for KVM specific things it will
call kvm_cpu_synchronize_state(cs) which is what cpu_synchronize_state
does when KVM is active.

These cases are:

  - handling debug events (gdbserver/KVM_SET_DEBUG)
  - system start/reset
  - migration (before saving the vm state)

If the VM stops running for whatever reason QEMU will just sit there and
wait until an event wakes it up (generally something IO related). Some
architectures do things differently - for example x86 often needs to
sync state so it can disassemble instructions to emulate mmio. However
for ARM most things can be handled without a full sync of the register
state.


>
> Thanks,
>
>     Christoffer
>
>> On Fri, Nov 2, 2018 at 11:28 AM Miriam Zimmerman <mutexlox@google.com> wrote:
>> >
>> > On Fri, Nov 2, 2018 at 7:34 AM Christoffer Dall
>> > <christoffer.dall@arm.com> wrote:
>> > >
>> > > On Wed, Oct 31, 2018 at 11:49:05AM -0700, Miriam Zimmerman wrote:
>> > >
>> > > [...]
>> > >
>> > > >
>> > > > > CONFIG_KVM_GUEST fixes this by providing the guest with information on
>> > > > > the host time and informs it when the guest is paused (see
>> > > > > MSR_KVM_SYSTEM_TIME_NEW). Arm doesn't (yet) have para-virtualised time.
>> > > > >
>> > > > > On Arm, as far as I know, the guest's view of time is purely from the
>> > > > > virtual counter. Since nothing saves/restores this during the pause, the
>> > > > > counter continues to increment and the jump in time is visible to the guest.
>> > > > What is this virtual counter? How can one configure or enable it?
>> > >
>> > > The 'virtual counter' is an architectural concept in the Arm
>> > > architecture, not a virtual (in the software VM sense) counter.
>> > >
>> > > The Arm Generic Timer Architecture defines two counters:
>> > >  - The physical counter, accessed via CNTPCT_EL0
>> > >  - The virtual counter, accessed via CNTVCT_EL0
>> > >
>> > > And a number of timers.  All timers are associated with the physical
>> > > counter, except the 'EL1 Virtual Timer' which uses the count in the
>> > > virtual counter for its comparison.
>> > >
>> > > The virtual counter yields the value of the physical counter minus an
>> > > offset.
>> > >
>> > > That offset is controlled by a hypervisor running at EL2 which can
>> > > program the offset value in CNTVOFF_EL2.
>> > >
>> > > The basic idea is that a VM can use the virtual timer/counter to keep
>> > > track of some notion of virtual time, and the physical timer/counter to
>> > > keep track of something that always relates to wall-clock time.
>> > >
>> > > This all breaks quite horribly with migration where the physical counter
>> > > is not meaningful across physical systems, and where the counter
>> > > frequencies can vary.
>> > >
>> > > But we can use the offset for the virtual counter to communicate some
>> > > other measure of time to VMs.
>> >
>> > That makes sense. Thanks for the primer! I haven't done any ARM kernel
>> > work before, so this is all great to know.
>> >
>> > > > My understanding of what's going on now is that the guest kernel right
>> > > > now has roughly no awareness that it's inside of a VM, so it just
>> > > > tracks time using the assumption that it is native and is unaware that
>> > > > the host ever suspends. (It just doesn't see any CPU time then.)
>> > > >
>> > > > >
>> > > > > Whether the guest sees time progress depends on what happens to that
>> > > > > counter during suspend. kvmtool pause/resume simply prevents the vCPU
>> > > > > threads from continuing, but the system counter is still running.
>> > > > >
>> > > > > If you save the state of a VM to a file then the counter value is
>> > > > > saved/restored so the guest won't see any change in time.
>> > > > I don't believe that we're explicitly saving any state to a file on
>> > > > suspend right now, though I could be wrong; I haven't looked into that
>> > > > codepath yet.
>> > >
>> > > The key is whether the userspace program that controls the KVM VM
>> > > (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
>> > > VM view of virtual time, and to restore that at a later time.
>> > >
>> > > KVM adjusts the CNTVOFF_EL2 for the VM on which the ioctl is executed to
>> > > represent the value written by userspace to the VM when the VM reads
>> > > CNTVCT_EL0.
>> > >
>> >
>> > Ah hah. I haven't looked yet, but I bet this is the missing piece.
>> > Thanks very much, Christoffer!
>> >
>> > >
>> > > Thanks,
>> > >
>> > >     Christoffer


--
Alex Bennée

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-06  7:45                     ` Christoffer Dall
  2018-11-06 13:39                       ` Alex Bennée
@ 2018-11-06 18:37                       ` Miriam Zimmerman
  2018-11-07  9:42                         ` Christoffer Dall
  1 sibling, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-11-06 18:37 UTC (permalink / raw)
  To: christoffer.dall; +Cc: marc.zyngier, lersek, steven.price, kvmarm

On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
<christoffer.dall@arm.com> wrote:
>
> On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> > In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> > ("target-arm: kvm: Differentiate registers based on write-back
> > levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> > to save time. Under what circumstances should this be saved in order
> > to provide a consistent view of wall clock time (as given by `date` in
> > the VM)?
>
> In general, and not specific to QEMU, I think that the virtual
> counter value should stop counting when the entirety of the VM is not
> running, for example when the host machine is suspended, or when the
> entire VM is stopped/suspended, either as part of a suspend/resume
> operation, debug operation, or as part of migration of some sort.
>
> Supporting these timekeeping semantics is not something anyone has tried
> up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> work.

Hrm, that's perplexing to me. I thought you said that in your tests,
going into S3 suspend on a host did *not* result in time drift on the
guest? That would suggest to me that there is code that correctly
handles it.

Upthread, you said:
> The key is whether the userspace program that controls the KVM VM
> (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> VM view of virtual time, and to restore that at a later time.
Is this a description of current behavior of any of these userspace
programs, or a description of how they might opt to address the
time-drift-on-suspend issue?

>
>
>
> > The commit refers to 'machine initialization or on vmload
> > operations', but I'm having difficulty figuring out what a vmload
> > operation is on ARM. Does this include resume-from-sleep/suspend?
>
> I believe vmload is qemu-speak for loading in VM state from a stored
> migration stream, but you'd have to ask the QEMU folks or study the code
> more carefully to figure out when a vmload really happens.
I see, okay. I misread the patch's commit log and thought you had written it.

Thanks,
Miriam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Timekeeping on ARM guests/hosts
  2018-11-06 18:37                       ` Miriam Zimmerman
@ 2018-11-07  9:42                         ` Christoffer Dall
  2018-11-07 18:22                           ` Miriam Zimmerman
  0 siblings, 1 reply; 21+ messages in thread
From: Christoffer Dall @ 2018-11-07  9:42 UTC (permalink / raw)
  To: Miriam Zimmerman; +Cc: marc.zyngier, lersek, steven.price, kvmarm

On Tue, Nov 06, 2018 at 10:37:21AM -0800, Miriam Zimmerman wrote:
> On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
> <christoffer.dall@arm.com> wrote:
> >
> > On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> > > In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> > > ("target-arm: kvm: Differentiate registers based on write-back
> > > levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> > > to save time. Under what circumstances should this be saved in order
> > > to provide a consistent view of wall clock time (as given by `date` in
> > > the VM)?
> >
> > In general, and not specific to QEMU, I think that the virtual
> > counter value should stop counting when the entirety of the VM is not
> > running, for example when the host machine is suspended, or when the
> > entire VM is stopped/suspended, either as part of a suspend/resume
> > operation, debug operation, or as part of migration of some sort.
> >
> > Supporting these timekeeping semantics is not something anyone has tried
> > up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> > work.
> 
> Hrm, that's perplexing to me. I thought you said that in your tests,
> going into S3 suspend on a host did *not* result in time drift on the
> guest? That would suggest to me that there is code that correctly
> handles it.

I don't believe I've said that.  I haven't actually tried that myself,
but I know anecdotally from others that time jumps on the guest when you
suspend the host, leading to warnings in a guest.

There must be some misunderstanding here.

> 
> Upthread, you said:
> > The key is whether the userspace program that controls the KVM VM
> > (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> > VM view of virtual time, and to restore that at a later time.
> Is this a description of current behavior of any of these userspace
> programs, or a description of how they might opt to address the
> time-drift-on-suspend issue?
> 

That is how they might choose, using the current KVM/Arm API, to adjust
for suspending the VM (different from suspending the host).
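
The save-and-restore idea described above can be sketched as a small
model. This is only an illustration under stated assumptions: the guest's
virtual counter is treated as the host's physical counter minus an offset,
and `struct vm_clock`, `vm_pause()` and `vm_resume()` are hypothetical
names, not part of KVM or of any VMM. A real VMM would presumably read and
write the counter through KVM's one-reg interface instead of keeping its
own offset; the model makes no KVM calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: virtual count = host physical count - offset. */
struct vm_clock {
	uint64_t offset;     /* physical minus virtual, in counter ticks */
	uint64_t saved_vcnt; /* virtual count captured when the VM was paused */
	bool paused;
};

/* What the guest would read as its virtual counter at host count host_pcnt. */
uint64_t vm_virtual_count(const struct vm_clock *vm, uint64_t host_pcnt)
{
	return vm->paused ? vm->saved_vcnt : host_pcnt - vm->offset;
}

/* Save the VM's view of virtual time when the VM stops running. */
void vm_pause(struct vm_clock *vm, uint64_t host_pcnt)
{
	vm->saved_vcnt = host_pcnt - vm->offset;
	vm->paused = true;
}

/* Restore the saved view: pick a new offset so counting continues from it. */
void vm_resume(struct vm_clock *vm, uint64_t host_pcnt)
{
	assert(vm->paused);
	vm->offset = host_pcnt - vm->saved_vcnt;
	vm->paused = false;
}
```

With an offset of 100, pausing at physical count 1000 and resuming at
5000 leaves the guest at virtual count 900 on both sides of the pause, so
the guest does not see the 4000-tick gap.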

> >
> >
> >
> > > The commit refers to 'machine initialization or on vmload
> > > operations', but I'm having difficulty figuring out what a vmload
> > > operation is on ARM. Does this include resume-from-sleep/suspend?
> >
> > I believe vmload is qemu-speak for loading in VM state from a stored
> > migration stream, but you'd have to ask the QEMU folks or study the code
> > more carefully to figure out when a vmload really happens.
> I see, okay. I misread the patch's commit log and thought you had written it.
> 

I did, but I'm (at best) a drive-by QEMU hacker, and I am not presently
in a position to contribute to QEMU.  There is also always the
possibility that my patch was wrong.  I think in this particular case,
though, I was trying to solve a different problem with that patch and
didn't really consider the problem of virtual time back then.

Hope this helps,

    Christoffer


* Re: Timekeeping on ARM guests/hosts
  2018-11-07  9:42                         ` Christoffer Dall
@ 2018-11-07 18:22                           ` Miriam Zimmerman
  2018-11-08 10:26                             ` Christoffer Dall
  0 siblings, 1 reply; 21+ messages in thread
From: Miriam Zimmerman @ 2018-11-07 18:22 UTC (permalink / raw)
  To: christoffer.dall; +Cc: marc.zyngier, lersek, steven.price, kvmarm

On Wed, Nov 7, 2018 at 1:42 AM Christoffer Dall
<christoffer.dall@arm.com> wrote:
>
> On Tue, Nov 06, 2018 at 10:37:21AM -0800, Miriam Zimmerman wrote:
> > On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
> > <christoffer.dall@arm.com> wrote:
> > >
> > > On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> > > > In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> > > > ("target-arm: kvm: Differentiate registers based on write-back
> > > > levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> > > > to save time. Under what circumstances should this be saved in order
> > > > to provide a consistent view of wall clock time (as given by `date` in
> > > > the VM)?
> > >
> > > In general, and not specific to QEMU, I think that the virtual
> > > counter value should stop counting when the entirety of the VM is not
> > > running, for example when the host machine is suspended, or when the
> > > entire VM is stopped/suspended, either as part of a suspend/resume
> > > operation, debug operation, or as part of migration of some sort.
> > >
> > > Supporting these timekeeping semantics is not something anyone has tried
> > > up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> > > work.
> >
> > Hrm, that's perplexing to me. I thought you said that in your tests,
> > going into S3 suspend on a host did *not* result in time drift on the
> > guest? That would suggest to me that there is code that correctly
> > handles it.
>
> I don't believe I've said that.  I haven't actually tried that myself,
> but I know anecdotally from others that time jumps on the guest when you
> suspend the host, leading to warnings in a guest.
>
> There must be some misunderstanding here.

Ah, indeed - Steven said that he tried this and saw time track
properly in-guest on ARM. I misremembered and thought that was you.

> >
> > Upthread, you said:
> > > The key is whether the userspace program that controls the KVM VM
> > > (kvmtool, QEMU, crosvm) uses the KVM_REG_ARM_TIMER_CNT ioctl to save the
> > > VM view of virtual time, and to restore that at a later time.
> > Is this a description of current behavior of any of these userspace
> > programs, or a description of how they might opt to address the
> > time-drift-on-suspend issue?
> >
>
> That is how they might choose, using the current KVM/Arm API, to adjust
> for suspending the VM (different from suspending the host).

I understand this now; thanks. I was somewhat confused when I started
diving into qemu and didn't see anything like what you described :-).


> > > > The commit refers to 'machine initialization or on vmload
> > > > operations', but I'm having difficulty figuring out what a vmload
> > > > operation is on ARM. Does this include resume-from-sleep/suspend?
> > >
> > > I believe vmload is qemu-speak for loading in VM state from a stored
> > > migration stream, but you'd have to ask the QEMU folks or study the code
> > > more carefully to figure out when a vmload really happens.
> > I see, okay. I misread the patch's commit log and thought you had written it.
> >
>
> I did, but I'm (at best) a drive-by QEMU hacker, and I am not presently
> in a position to contribute to QEMU.  There is also always the
> possibility that my patch was wrong.  I think in this particular case,
> though, I was trying to solve a different problem with that patch and
> didn't really consider the problem of virtual time back then.

Got it. Thanks, Christoffer!
> Hope this helps,
>
>     Christoffer


* Re: Timekeeping on ARM guests/hosts
  2018-11-07 18:22                           ` Miriam Zimmerman
@ 2018-11-08 10:26                             ` Christoffer Dall
  2018-11-08 16:34                               ` Steven Price
  0 siblings, 1 reply; 21+ messages in thread
From: Christoffer Dall @ 2018-11-08 10:26 UTC (permalink / raw)
  To: Miriam Zimmerman; +Cc: marc.zyngier, lersek, steven.price, kvmarm

On Wed, Nov 07, 2018 at 10:22:06AM -0800, Miriam Zimmerman wrote:
> On Wed, Nov 7, 2018 at 1:42 AM Christoffer Dall
> <christoffer.dall@arm.com> wrote:
> >
> > On Tue, Nov 06, 2018 at 10:37:21AM -0800, Miriam Zimmerman wrote:
> > > On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
> > > <christoffer.dall@arm.com> wrote:
> > > >
> > > > On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> > > > > In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> > > > > ("target-arm: kvm: Differentiate registers based on write-back
> > > > > levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> > > > > to save time. Under what circumstances should this be saved in order
> > > > > to provide a consistent view of wall clock time (as given by `date` in
> > > > > the VM)?
> > > >
> > > > In general, and not specific to QEMU, I think that the virtual
> > > > counter value should stop counting when the entirety of the VM is not
> > > > running, for example when the host machine is suspended, or when the
> > > > entire VM is stopped/suspended, either as part of a suspend/resume
> > > > operation, debug operation, or as part of migration of some sort.
> > > >
> > > > Supporting these timekeeping semantics is not something anyone has tried
> > > > up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> > > > work.
> > >
> > > Hrm, that's perplexing to me. I thought you said that in your tests,
> > > going into S3 suspend on a host did *not* result in time drift on the
> > > guest? That would suggest to me that there is code that correctly
> > > handles it.
> >
> > I don't believe I've said that.  I haven't actually tried that myself,
> > but I know anecdotally from others that time jumps on the guest when you
> > suspend the host, leading to warnings in a guest.
> >
> > There must be some misunderstanding here.
> 
> Ah, indeed - Steven said that he tried this and saw time track
> properly in-guest on ARM. I misremembered and thought that was you.

What Steven said was:

  "On Arm, as far as I know, the guest's view of time is purely from the
  virtual counter. Since nothing saves/restores this during the pause,
  the counter continues to increment and the jump in time is visible to
  the guest."

So here he means that time in the guest jumps, which is not what the
guest expects, and thus you see warnings and problems from the guest,
even though date/time may be reported correctly in the guest for the
same reason.

Adjusting virtual time should prevent the guest from getting confused
wrt. watchdogs and starved processes etc.

However, I'm still not entirely clear on how the guest will correctly
observe wall-clock if we adjust virtual time.  Should it use the
physical counter?  Does PV take care of this?  Does it receive a
notification that it must update its clock via NTP?

Steve, any insight?


Thanks,

    Christoffer


* Re: Timekeeping on ARM guests/hosts
  2018-11-08 10:26                             ` Christoffer Dall
@ 2018-11-08 16:34                               ` Steven Price
  2018-11-08 20:06                                 ` Christoffer Dall
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Price @ 2018-11-08 16:34 UTC (permalink / raw)
  To: Christoffer Dall, Miriam Zimmerman; +Cc: marc.zyngier, lersek, kvmarm

On 08/11/2018 10:26, Christoffer Dall wrote:
> On Wed, Nov 07, 2018 at 10:22:06AM -0800, Miriam Zimmerman wrote:
>> On Wed, Nov 7, 2018 at 1:42 AM Christoffer Dall
>> <christoffer.dall@arm.com> wrote:
>>>
>>> On Tue, Nov 06, 2018 at 10:37:21AM -0800, Miriam Zimmerman wrote:
>>>> On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
>>>> <christoffer.dall@arm.com> wrote:
>>>>>
>>>>> On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
>>>>>> In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
>>>>>> ("target-arm: kvm: Differentiate registers based on write-back
>>>>>> levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
>>>>>> to save time. Under what circumstances should this be saved in order
>>>>>> to provide a consistent view of wall clock time (as given by `date` in
>>>>>> the VM)?
>>>>>
>>>>> In general, and not specific to QEMU, I think that the virtual
>>>>> counter value should stop counting when the entirety of the VM is not
>>>>> running, for example when the host machine is suspended, or when the
>>>>> entire VM is stopped/suspended, either as part of a suspend/resume
>>>>> operation, debug operation, or as part of migration of some sort.
>>>>>
>>>>> Supporting these timekeeping semantics is not something anyone has tried
>>>>> up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
>>>>> work.
>>>>
>>>> Hrm, that's perplexing to me. I thought you said that in your tests,
>>>> going into S3 suspend on a host did *not* result in time drift on the
>>>> guest? That would suggest to me that there is code that correctly
>>>> handles it.
>>>
>>> I don't believe I've said that.  I haven't actually tried that myself,
>>> but I know anecdotally from others that time jumps on the guest when you
>>> suspend the host, leading to warnings in a guest.
>>>
>>> There must be some misunderstanding here.
>>
>> Ah, indeed - Steven said that he tried this and saw time track
>> properly in-guest on ARM. I misremembered and thought that was you.
> 
> What Steven said was:
> 
>   "On Arm, as far as I know, the guest's view of time is purely from the
>   virtual counter. Since nothing saves/restores this during the pause,
>   the counter continues to increment and the jump in time is visible to
>   the guest."
> 
> So here he means that time in the guest jumps, which is not what the
> guest expects, and thus you see warnings and problems from the guest,
> even though date/time may be reported correctly in the guest for the
> same reason.
> 
> Adjusting virtual time should prevent the guest from getting confused
> wrt. watchdogs and starved processes etc.

Good summary - that's at least what I meant to say :)

> However, I'm still not entirely clear on how the guest will correctly
> observe wall-clock if we adjust virtual time.  Should it use the
> physical counter?  Does PV take care of this?  Does it receive a
> notification that it must update its clock via NTP?
> 
> Steve, any insight?

PV time doesn't fix the guest observing wall-clock time. All it provides
the guest is "live physical time" - i.e. a good view of time when it is
executing, not general time.

There are two/three approaches I can see we could take:

1. Don't "fix" the fact that the virtual time runs when the guest is
paused, but instead implement the KVM_KVMCLOCK_CTRL ioctl for arm to
notify the kernel that time has jumped. This will silence the kernel's
watchdogs but user space will still see a big jump. It also does nothing
to fix drift caused by other events (e.g. the virtual machine being
saved to a file or the host machine being suspended/hibernated).

2. Stop virtual time when the guest isn't running and provide another
mechanism for the guest to get hold of wall clock time. E.g. x86 has
MSR_KVM_WALL_CLOCK_NEW which returns a structure with the actual wall
clock time of the host.

3. Assume the guest can synchronise with something external: i.e. NTP.
Sadly this doesn't work well in practice because NTP is built around a
reliable local clock that just needs a minor correction to the
frequency. NTP will take a while to notice a large jump in time and
(depending on configuration) can be reluctant to step time to correct it.

I haven't investigated how this actually works on x86 - it appears to be
some variant of 2 - but exactly how the MSR_KVM_WALL_CLOCK_NEW
functionality works I haven't understood yet.
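
As a rough illustration of approach 2, the sketch below assumes the host
publishes a small record pairing its wall-clock time with the guest's
virtual counter value at the same instant, and that the guest derives
wall-clock time from that record plus the ticks elapsed since. The record
layout and the names `struct pv_wall_clock` and `guest_wall_ns()` are
hypothetical; this is not a description of how MSR_KVM_WALL_CLOCK_NEW
works on x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record the host would publish to the guest. */
struct pv_wall_clock {
	uint64_t wall_ns; /* host wall-clock time when the record was written */
	uint64_t vcnt;    /* guest virtual counter value at that same moment */
};

/* Guest side: wall-clock time = published base + virtual ticks since then. */
uint64_t guest_wall_ns(const struct pv_wall_clock *wc, uint64_t vcnt_now,
		       uint64_t freq_hz)
{
	uint64_t ticks = vcnt_now - wc->vcnt;

	/* Convert whole seconds and the remainder separately to limit overflow. */
	return wc->wall_ns
		+ ticks / freq_hz * 1000000000ull
		+ ticks % freq_hz * 1000000000ull / freq_hz;
}
```

If the virtual counter is stopped while the VM is not running, the host
would rewrite the record on resume with the new wall-clock time and the
unchanged counter value. The guest's wall clock then steps forward by the
length of the pause, while anything timed against the virtual counter sees
no jump.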

Steve


* Re: Timekeeping on ARM guests/hosts
  2018-11-08 16:34                               ` Steven Price
@ 2018-11-08 20:06                                 ` Christoffer Dall
  0 siblings, 0 replies; 21+ messages in thread
From: Christoffer Dall @ 2018-11-08 20:06 UTC (permalink / raw)
  To: Steven Price; +Cc: marc.zyngier, lersek, Miriam Zimmerman, kvmarm

On Thu, Nov 08, 2018 at 04:34:08PM +0000, Steven Price wrote:
> On 08/11/2018 10:26, Christoffer Dall wrote:
> > On Wed, Nov 07, 2018 at 10:22:06AM -0800, Miriam Zimmerman wrote:
> >> On Wed, Nov 7, 2018 at 1:42 AM Christoffer Dall
> >> <christoffer.dall@arm.com> wrote:
> >>>
> >>> On Tue, Nov 06, 2018 at 10:37:21AM -0800, Miriam Zimmerman wrote:
> >>>> On Mon, Nov 5, 2018 at 11:45 PM Christoffer Dall
> >>>> <christoffer.dall@arm.com> wrote:
> >>>>>
> >>>>> On Fri, Nov 02, 2018 at 02:23:45PM -0700, Miriam Zimmerman wrote:
> >>>>>> In researching KVM_REG_ARM_TIMER_CNT, I discovered your commit 4b7a6bf
> >>>>>> ("target-arm: kvm: Differentiate registers based on write-back
> >>>>>> levels"), which seems to limit when the KVM_REG_ARM_TIMER_CNT is used
> >>>>>> to save time. Under what circumstances should this be saved in order
> >>>>>> to provide a consistent view of wall clock time (as given by `date` in
> >>>>>> the VM)?
> >>>>>
> >>>>> In general, and not specific to QEMU, I think that the virtual
> >>>>> counter value should stop counting when the entirety of the VM is not
> >>>>> running, for example when the host machine is suspended, or when the
> >>>>> entire VM is stopped/suspended, either as part of a suspend/resume
> >>>>> operation, debug operation, or as part of migration of some sort.
> >>>>>
> >>>>> Supporting these timekeeping semantics is not something anyone has tried
> >>>>> up until now with KVM/Arm, as far as I'm aware, and as such is 'new'
> >>>>> work.
> >>>>
> >>>> Hrm, that's perplexing to me. I thought you said that in your tests,
> >>>> going into S3 suspend on a host did *not* result in time drift on the
> >>>> guest? That would suggest to me that there is code that correctly
> >>>> handles it.
> >>>
> >>> I don't believe I've said that.  I haven't actually tried that myself,
> >>> but I know anecdotally from others that time jumps on the guest when you
> >>> suspend the host, leading to warnings in a guest.
> >>>
> >>> There must be some misunderstanding here.
> >>
> >> Ah, indeed - Steven said that he tried this and saw time track
> >> properly in-guest on ARM. I misremembered and thought that was you.
> > 
> > What Steven said was:
> > 
> >   "On Arm, as far as I know, the guest's view of time is purely from the
> >   virtual counter. Since nothing saves/restores this during the pause,
> >   the counter continues to increment and the jump in time is visible to
> >   the guest."
> > 
> > So here he means that time in the guest jumps, which is not what the
> > guest expects, and thus you see warnings and problems from the guest,
> > even though date/time may be reported correctly in the guest for the
> > same reason.
> > 
> > Adjusting virtual time should prevent the guest from getting confused
> > wrt. watchdogs and starved processes etc.
> 
> Good summary - that's at least what I meant to say :)
> 
> > However, I'm still not entirely clear on how the guest will correctly
> > observe wall-clock if we adjust virtual time.  Should it use the
> > physical counter?  Does PV take care of this?  Does it receive a
> > notification that it must update its clock via NTP?
> > 
> > Steve, any insight?
> 
> PV time doesn't fix the guest observing wall-clock time. All it provides
> the guest is "live physical time" - i.e. a good view of time when it is
> executing, not general time.
> 
> There are two/three approaches I can see we could take:
> 
> 1. Don't "fix" the fact that the virtual time runs when the guest is
> paused, but instead implement the KVM_KVMCLOCK_CTRL ioctl for arm to
> notify the kernel that time has jumped. This will silence the kernel's
> watchdogs but user space will still see a big jump. It also does nothing
> to fix drift caused by other events (e.g. the virtual machine being
> saved to a file or the host machine being suspended/hibernated).
> 
> 2. Stop virtual time when the guest isn't running and provide another
> mechanism for the guest to get hold of wall clock time. E.g. x86 has
> MSR_KVM_WALL_CLOCK_NEW which returns a structure with the actual wall
> clock time of the host.
> 
> 3. Assume the guest can synchronise with something external: i.e. NTP.
> Sadly this doesn't work well in practice because NTP is built around a
> reliable local clock that just needs a minor correction to the
> frequency. NTP will take a while to notice a large jump in time and
> (depending on configuration) can be reluctant to step time to correct it.
> 
> I haven't investigated how this actually works on x86 - it appears to be
> some variant of 2 - but exactly how the MSR_KVM_WALL_CLOCK_NEW
> functionality works I haven't understood yet.
> 

Is the fact that we can have an emulated RTC device in QEMU another
mechanism that we need to consider?

How about the physical counter, does that help (assuming we mandate
trap-and-emulate of the physical counter on migration)?  Could we have

  void wall_clock_time_requested_hook(void)
  {
  	static u64 old_diff;
  	u64 pcnt = read_sysreg(CNTPCT_EL0);
  	u64 vcnt = read_sysreg(CNTVCT_EL0);

  	/* A changed offset means virtual time was adjusted: resync from the RTC. */
  	if (pcnt - vcnt != old_diff)
  		rtc_pl031_update_now();
  	old_diff = pcnt - vcnt;
  }

See also the thread from Bijan looking at this from the QEMU side.


Thanks,

    Christoffer


end of thread, other threads:[~2018-11-08 20:06 UTC | newest]

Thread overview: 21+ messages
2018-10-09 23:39 Timekeeping on ARM guests/hosts Miriam Zimmerman
2018-10-10 10:01 ` Marc Zyngier
2018-10-10 18:38   ` Miriam Zimmerman
2018-10-11  7:54     ` Marc Zyngier
2018-10-11 15:21       ` Laszlo Ersek
2018-10-11 18:40         ` Miriam Zimmerman
2018-10-31 16:41           ` Steven Price
2018-10-31 18:49             ` Miriam Zimmerman
2018-11-02 14:34               ` Christoffer Dall
2018-11-02 18:28                 ` Miriam Zimmerman
2018-11-02 21:23                   ` Miriam Zimmerman
2018-11-06  7:45                     ` Christoffer Dall
2018-11-06 13:39                       ` Alex Bennée
2018-11-06 18:37                       ` Miriam Zimmerman
2018-11-07  9:42                         ` Christoffer Dall
2018-11-07 18:22                           ` Miriam Zimmerman
2018-11-08 10:26                             ` Christoffer Dall
2018-11-08 16:34                               ` Steven Price
2018-11-08 20:06                                 ` Christoffer Dall
2018-11-03  0:19                 ` Bijan Mottahedeh
2018-11-06  7:49                   ` Christoffer Dall
