xen-devel.lists.xenproject.org archive mirror
* HVMlite gains
@ 2016-03-15 21:14 Luis R. Rodriguez
  2016-03-15 21:29 ` Andrew Cooper
  2016-03-15 22:39 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 10+ messages in thread
From: Luis R. Rodriguez @ 2016-03-15 21:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Matt Fleming, Boris Ostrovsky, Borislav Petkov, Andy Lutomirski

While discussing HVMLite with a few people, a few questions have come
up. Since I only really understand a few possible gains with the
current design, I wanted to get clarification on a few where I simply
have no clue if we stand to gain from them, or if they are on the roadmap:

  a) Will context switches use the actual CR3 register?
  b) Will IOPL live in the actual FLAGS register?
  c) Will guest-usable CPU features show up in CPUID, and will
features that shouldn't be used *not* show up in CPUID? For
instance, if you currently boot Xen 4.4.0 with a new Linux dom0
on a CPU that supports MPX, what will happen? What about with HVMlite?
  d)  Will acking an interrupt use the standard APIC mechanism?  Do
any of the current Xen variants do that?
  e) Can timing use RDTSC?

 Luis

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: HVMlite gains
  2016-03-15 21:14 HVMlite gains Luis R. Rodriguez
@ 2016-03-15 21:29 ` Andrew Cooper
  2016-03-15 21:36   ` Andy Lutomirski
  2016-03-15 22:39 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 21:29 UTC (permalink / raw)
  To: Luis R. Rodriguez, xen-devel
  Cc: Matt Fleming, Boris Ostrovsky, Borislav Petkov, Andy Lutomirski

On 15/03/2016 21:14, Luis R. Rodriguez wrote:
> While discussing HVMLite with a few people, a few questions have come
> up. Since I only really understand a few possible gains with the
> current design, I wanted to get clarification on a few where I simply
> have no clue if we stand to gain from them, or if they are on the roadmap:
>
>   a) Will context switches use the actual CR3 register?

Yes.  HVMLite will make full use of the hardware virtualisation
extensions, so it gets its own pagetables and cr3.

>   b) Will IOPL live in the actual FLAGS register?

Yes

>   c) Will guest-usable CPU features show up in CPUID, and will
> features that shouldn't be used *not* show up in CPUID? For
> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
> on a CPU that supports MPX, what will happen? What about with HVMlite?

I am desperately trying to fix the existing broken-ness.  PV is too far
beyond repair when it comes to the OSXSAVE bit, but the rest is fine. 
See my "x86: Improvements to cpuid handling for guests" series, v3 of
which was posted today.

With that series in place, a guest should always see correct features
(give or take some "fun" with masking, but at least the xen_cpuid() path
will still be correct).

>   d)  Will acking an interrupt use the standard APIC mechanism?  Do
> any of the current Xen variants do that?

Emulated APIC will be enabled by default. The "PV event" path will be
available as an alternative.

If a user desperately wishes to avoid the emulated APIC, they have the
option to disable it in the domain config.

>   e) Can timing use RDTSC?

I don't understand this question in the context of the others.  RDTSC
has (as far as I can tell) always been advertised and available for
guest use.  RDTSCP is a different matter, and I have half-fixed that
brokenness; it should now work correctly in HVM guests.

~Andrew


* Re: HVMlite gains
  2016-03-15 21:29 ` Andrew Cooper
@ 2016-03-15 21:36   ` Andy Lutomirski
  2016-03-15 21:50     ` Boris Ostrovsky
  2016-03-15 21:50     ` Andrew Cooper
  0 siblings, 2 replies; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-15 21:36 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
	Boris Ostrovsky

On Tue, Mar 15, 2016 at 2:29 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 15/03/2016 21:14, Luis R. Rodriguez wrote:
>> While discussing HVMLite with a few people, a few questions have come
>> up. Since I only really understand a few possible gains with the
>> current design, I wanted to get clarification on a few where I simply
>> have no clue if we stand to gain from them, or if they are on the roadmap:
>>
>>   a) Will context switches use the actual CR3 register?
>
> Yes.  HVMLite will make full use of the hardware virtualisation
> extensions, so it gets its own pagetables and cr3.
>
>>   b) Will IOPL live in the actual FLAGS register?
>
> Yes
>
>>   c) Will guest-usable CPU features show up in CPUID, and will
>> features that shouldn't be used *not* show up in CPUID? For
>> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
>> on a CPU that supports MPX, what will happen? What about with HVMlite?
>
> I am desperately trying to fix the existing broken-ness.  PV is too far
> beyond repair when it comes to the OSXSAVE bit, but the rest is fine.
> See my "x86: Improvements to cpuid handling for guests" series, v3 of
> which was posted today.
>
> With that series in place, a guest should always see correct features
> (give or take some "fun" with masking, but at least the xen_cpuid() path
> will still be correct).
>
>>   d)  Will acking an interrupt use the standard APIC mechanism?  Do
>> any of the current Xen variants do that?
>
> Emulated APIC will be enabled by default. The "PV event" path will be
> available as an alternative.
>
> If a user desperately wishes to avoid the emulated APIC, they have the
> option to disable it in the domain config.
>
>>   e) Can timing use RDTSC?
>
> I don't understand this question in the context of the others.  RDTSC
> has (as far as I can tell) always been advertised and available for
> guest use.  RDTSCP is a different matter, and I have half-fixed that
> brokenness; it should now work correctly in HVM guests.
>

These questions mostly came from me, and they weren't necessarily
intended to make sense as a coherent whole :)  They were more of a
random collection of things I was wondering about to varying extents.

What I mean is:  if we point sched_clock at RDTSC and try to use the
regular TSC timesource in a guest, will it work reasonably well,
assuming that the underlying hardware supports it?  And, if the
underlying hardware doesn't support it (e.g. not constant / invariant
or no TSC offsetting available or similar), will the hypervisor tell
the guest this fact via CPUID so that the standard guest clocksource
code doesn't try to use a non-working TSC?
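
One way to see what a Linux guest actually concluded from CPUID is to
look at the flags it reports. The sketch below is only an illustration:
it assumes a Linux guest, where the constant_tsc and nonstop_tsc entries
in /proc/cpuinfo reflect the kernel's view of the TSC and the active
clocksource is exposed under /sys/devices/system/clocksource. The helper
names tsc_flags and read_text are made up for this example.

```python
from pathlib import Path

# Flag names as printed by Linux in /proc/cpuinfo on x86 (assumption:
# the guest is Linux; other guests expose this differently).
TSC_FLAGS = ("constant_tsc", "nonstop_tsc")


def tsc_flags(cpuinfo_text):
    """Report which TSC_FLAGS appear on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line at all: treat every flag as absent.
    return {flag: False for flag in TSC_FLAGS}


def read_text(path):
    """Read a file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    cpuinfo = read_text("/proc/cpuinfo") or ""
    print("TSC flags:", tsc_flags(cpuinfo))
    source = read_text(
        "/sys/devices/system/clocksource/clocksource0/current_clocksource"
    )
    print("clocksource:", (source or "unknown").strip())
```

If nonstop_tsc is missing in the guest while present on the host, the
hypervisor has most likely hidden the invariant TSC bit.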

--Andy


* Re: HVMlite gains
  2016-03-15 21:36   ` Andy Lutomirski
@ 2016-03-15 21:50     ` Boris Ostrovsky
  2016-03-15 21:50     ` Andrew Cooper
  1 sibling, 0 replies; 10+ messages in thread
From: Boris Ostrovsky @ 2016-03-15 21:50 UTC (permalink / raw)
  To: Andy Lutomirski, Andrew Cooper
  Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov

On 03/15/2016 05:36 PM, Andy Lutomirski wrote:
> On Tue, Mar 15, 2016 at 2:29 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:14, Luis R. Rodriguez wrote:
>>> While discussing HVMLite with a few people, a few questions have come
>>> up. Since I only really understand a few possible gains with the
>>> current design, I wanted to get clarification on a few where I simply
>>> have no clue if we stand to gain from them, or if they are on the roadmap:
>>>
>>>    a) Will context switches use the actual CR3 register?
>> Yes.  HVMLite will make full use of the hardware virtualisation
>> extensions, so it gets its own pagetables and cr3.
>>
>>>    b) Will IOPL live in the actual FLAGS register?
>> Yes
>>
>>>    c) Will guest-usable CPU features show up in CPUID, and will
>>> features that shouldn't be used *not* show up in CPUID? For
>>> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
>>> on a CPU that supports MPX, what will happen? What about with HVMlite?
>> I am desperately trying to fix the existing broken-ness.  PV is too far
>> beyond repair when it comes to the OSXSAVE bit, but the rest is fine.
>> See my "x86: Improvements to cpuid handling for guests" series, v3 of
>> which was posted today.
>>
>> With that series in place, a guest should always see correct features
>> (give or take some "fun" with masking, but at least the xen_cpuid() path
>> will still be correct).
>>
>>>    d)  Will acking an interrupt use the standard APIC mechanism?  Do
>>> any of the current Xen variants do that?
>> Emulated APIC will be enabled by default. The "PV event" path will be
>> available as an alternative.
>>
>> If a user desperately wishes to avoid the emulated APIC, they have the
>> option to disable it in the domain config.
>>
>>>    e) Can timing use RDTSC?
>> I don't understand this question in the context of the others.  RDTSC
>> has (as far as I can tell) always been advertised and available for
>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>> brokenness; it should now work correctly in HVM guests.
>>
> These questions mostly came from me, and they weren't necessarily
> intended to make sense as a coherent whole :)  They were more of a
> random collection of things I was wondering about to varying extents.
>
> What I mean is:  if we point sched_clock at RDTSC and try to use the
> regular TSC timesource in a guest, will it work reasonably well,
> assuming that the underlying hardware supports it?  And, if the
> underlying hardware doesn't support it (e.g. not constant / invariant
> or no TSC offsetting available or similar), will the hypervisor tell
> the guest this fact via CPUID so that the standard guest clocksource
> code doesn't try to use a non-working TSC?

The hypervisor typically clears the Invariant TSC bit because the guest can
migrate to a system with a different clock:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/domain.c;h=a6d721bd48176f51c0d9dfb57099c8b7f52220c2;hb=HEAD#l2612

There are options (in the guest config file) to have the hypervisor set this flag.

-boris



* Re: HVMlite gains
  2016-03-15 21:36   ` Andy Lutomirski
  2016-03-15 21:50     ` Boris Ostrovsky
@ 2016-03-15 21:50     ` Andrew Cooper
  2016-03-15 21:52       ` Andy Lutomirski
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 21:50 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
	Boris Ostrovsky

On 15/03/2016 21:36, Andy Lutomirski wrote:
>
>>>   e) Can timing use RDTSC?
>> I don't understand this question in the context of the others.  RDTSC
>> has (as far as I can tell) always been advertised and available for
>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>> brokenness; it should now work correctly in HVM guests.
>>
> These questions mostly came from me, and they weren't necessarily
> intended to make sense as a coherent whole :)  They were more of a
> random collection of things I was wondering about to varying extents.
>
> What I mean is:  if we point sched_clock at RDTSC and try to use the
> regular TSC timesource in a guest, will it work reasonably well,
> assuming that the underlying hardware supports it?  And, if the
> underlying hardware doesn't support it (e.g. not constant / invariant
> or no TSC offsetting available or similar), will the hypervisor tell
> the guest this fact via CPUID so that the standard guest clocksource
> code doesn't try to use a non-working TSC?

In principle yes, but it is rather more complicated than that.

By default, if you want a guest to be migrateable and you can't
guarantee that you will have hardware TSC scaling support on every
future destination, you cannot advertise the TSC as stable to the
guest.  We err on the side of caution and don't advertise invariance by
default.

In practice, if you are running on anything vaguely modern, the TSC will
be reliable between migrates.

What the migration protocol currently lacks is a mechanism to identify
"This VM was advertised invariant TSC at frequency $X when it was
booted".  There is nominally a "no migrate" flag which can be set, at
which point invariance will be advertised if the host is capable. 
However, there is no way for the toolstack to query this, so nothing in
the migrate code checks or acts upon it.

Windows have worked around this limitation with the Viridian spec,
whereby the hypervisor can provide the current TSC frequency, and
promises that it won't change until the next suspend/resume, at which
point the frequency will be resampled.
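
As a rough illustration of the frequency-based scheme described above
(not the Viridian interface itself, whose details are not given in this
thread), a guest-side clock could be modelled like this. The class
FrequencyClock, its methods, and the idea of passing in raw TSC readings
are all assumptions made for the example; downtime during suspend is
ignored.

```python
NSEC_PER_SEC = 1_000_000_000


class FrequencyClock:
    """Toy guest clock: ns = base_ns + (tsc - base_tsc) / tsc_hz."""

    def __init__(self, tsc_hz, tsc_now, base_ns=0):
        self.tsc_hz = tsc_hz      # frequency promised by the hypervisor
        self.base_tsc = tsc_now   # TSC value at the last (re)sample
        self.base_ns = base_ns    # guest time at the last (re)sample

    def now_ns(self, tsc_now):
        # Valid only while the promised frequency holds.
        delta = tsc_now - self.base_tsc
        return self.base_ns + delta * NSEC_PER_SEC // self.tsc_hz

    def resume(self, new_tsc_hz, tsc_at_suspend, tsc_now):
        # Account for time up to the suspend point at the old frequency,
        # then rebase on the new host's TSC and resampled frequency.
        self.base_ns = self.now_ns(tsc_at_suspend)
        self.base_tsc = tsc_now
        self.tsc_hz = new_tsc_hz
```

The point of the design is that the guest only has to revisit its
arithmetic at one well-defined event (resume), rather than at arbitrary
times.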

~Andrew


* Re: HVMlite gains
  2016-03-15 21:50     ` Andrew Cooper
@ 2016-03-15 21:52       ` Andy Lutomirski
  2016-03-15 22:05         ` Andrew Cooper
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-15 21:52 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
	Boris Ostrovsky

On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>
>>>>   e) Can timing use RDTSC?
>>> I don't understand this question in the context of the others.  RDTSC
>>> has (as far as I can tell) always been advertised and available for
>>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>>> brokenness; it should now work correctly in HVM guests.
>>>
>> These questions mostly came from me, and they weren't necessarily
>> intended to make sense as a coherent whole :)  They were more of a
>> random collection of things I was wondering about to varying extents.
>>
>> What I mean is:  if we point sched_clock at RDTSC and try to use the
>> regular TSC timesource in a guest, will it work reasonably well,
>> assuming that the underlying hardware supports it?  And, if the
>> underlying hardware doesn't support it (e.g. not constant / invariant
>> or no TSC offsetting available or similar), will the hypervisor tell
>> the guest this fact via CPUID so that the standard guest clocksource
>> code doesn't try to use a non-working TSC?
>
> In principle yes, but it is rather more complicated than that.
>
> By default, if you want a guest to be migrateable and you can't
> guarantee that you will have hardware TSC scaling support on every
> future destination, you cannot advertise the TSC as stable to the
> guest.  We err on the side of caution and don't advertise invariance by
> default.
>
> In practice, if you are running on anything vaguely modern, the TSC will
> be reliable between migrates.

By "reliable" do you mean monotonic and not horribly jumpy?  I thought
there was no shipping hardware with TSC scaling.

>
> What the migration protocol currently lacks is a mechanism to identify
> "This VM was advertised invariant TSC at frequency $X when it was
> booted".  There is nominally a "no migrate" flag which can be set, at
> which point invariance will be advertised if the host is capable.
> However, there is no way for the toolstack to query this, so nothing in
> the migrate code checks or acts upon it.
>
> Windows have worked around this limitation with the Viridian spec,
> whereby the hypervisor can provide the current TSC frequency, and
> promises that it won't change until the next suspend/resume, at which
> point the frequency will be resampled.
>

That's simpler and maybe even better than the pvclock design, at least
as implemented by KVM.  Sigh.

--Andy


* Re: HVMlite gains
  2016-03-15 21:52       ` Andy Lutomirski
@ 2016-03-15 22:05         ` Andrew Cooper
  2016-03-16  3:18           ` Andy Lutomirski
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 22:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
	Boris Ostrovsky

On 15/03/2016 21:52, Andy Lutomirski wrote:
> On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>>>>   e) Can timing use RDTSC?
>>>> I don't understand this question in the context of the others.  RDTSC
>>>> has (as far as I can tell) always been advertised and available for
>>>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>>>> brokenness; it should now work correctly in HVM guests.
>>>>
>>> These questions mostly came from me, and they weren't necessarily
>>> intended to make sense as a coherent whole :)  They were more of a
>>> random collection of things I was wondering about to varying extents.
>>>
>>> What I mean is:  if we point sched_clock at RDTSC and try to use the
>>> regular TSC timesource in a guest, will it work reasonably well,
>>> assuming that the underlying hardware supports it?  And, if the
>>> underlying hardware doesn't support it (e.g. not constant / invariant
>>> or no TSC offsetting available or similar), will the hypervisor tell
>>> the guest this fact via CPUID so that the standard guest clocksource
>>> code doesn't try to use a non-working TSC?
>> In principle yes, but it is rather more complicated than that.
>>
>> By default, if you want a guest to be migrateable and you can't
>> guarantee that you will have hardware TSC scaling support on every
>> future destination, you cannot advertise the TSC as stable to the
>> guest.  We err on the side of caution and don't advertise invariance by
>> default.
>>
>> In practice, if you are running on anything vaguely modern, the TSC will
>> be reliable between migrates.
> By "reliable" do you mean monotonic and not horribly jumpy?  I thought
> there was no shipping hardware with TSC scaling.

AMD have had TSC scaling for a long time (code added to Xen in 2011). 
Intel are the ones late to the party in this case.

There was a patch series from Joao around Christmas, "x86/time:
PVCLOCK_TSC_STABLE_BIT support", which identified several bugs with Xen's
TSC handling as visible in the PVCLK.  It would be nice to get those
bugs fixed.

>
>> What the migration protocol currently lacks is a mechanism to identify
>> "This VM was advertised invariant TSC at frequency $X when it was
>> booted".  There is nominally a "no migrate" flag which can be set, at
>> which point invariance will be advertised if the host is capable.
>> However, there is no way for the toolstack to query this, so nothing in
>> the migrate code checks or acts upon it.
>>
>> Windows have worked around this limitation with the Viridian spec,
>> whereby the hypervisor can provide the current TSC frequency, and
>> promises that it won't change until the next suspend/resume, at which
>> point the frequency will be resampled.
>>
> That's simpler and maybe even better than the pvclock design, at least
> as implemented by KVM.  Sigh.

Updates to that also need fixing.  PVCLOCK is a Xen ABI which was
borrowed by KVM then locally modified.

I believe the two are still compatible.
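
For reference, the pvclock conversion from a TSC reading to nanoseconds
can be sketched as below. The field names are assumed to match Xen's
public vcpu_time_info structure (version, tsc_timestamp, system_time,
tsc_to_system_mul, tsc_shift); TimeInfo and pvclock_ns are hypothetical
names, and the version-based retry loop that a real reader needs is
left out.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class TimeInfo:
    version: int            # odd while the hypervisor is updating the record
    tsc_timestamp: int      # TSC value when system_time was sampled
    system_time: int        # nanoseconds at tsc_timestamp
    tsc_to_system_mul: int  # 32.32 fixed-point ns-per-tick multiplier
    tsc_shift: int          # pre-scaling shift; negative means shift right


def pvclock_ns(info, tsc):
    """Convert a raw TSC reading to nanoseconds using one time record."""
    delta = (tsc - info.tsc_timestamp) & MASK64
    if info.tsc_shift >= 0:
        delta <<= info.tsc_shift
    else:
        delta >>= -info.tsc_shift
    # The multiplier is a 32.32 fixed-point fraction, hence the >> 32.
    return info.system_time + ((delta * info.tsc_to_system_mul) >> 32)
```

With a multiplier of 2**31 and a shift of 0, each tick counts as half a
nanosecond, which corresponds to a 2 GHz TSC.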

But yes - the Viridian way does appear substantially more sane.

~Andrew


* Re: HVMlite gains
  2016-03-15 21:14 HVMlite gains Luis R. Rodriguez
  2016-03-15 21:29 ` Andrew Cooper
@ 2016-03-15 22:39 ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-15 22:39 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
	Andy Lutomirski

On Tue, Mar 15, 2016 at 02:14:15PM -0700, Luis R. Rodriguez wrote:
> While discussing HVMLite with a few people, a few questions have come
> up. Since I only really understand a few possible gains with the
> current design, I wanted to get clarification on a few where I simply

Just think of baremetal without BIOS. Without PCI support (unless
needed).

> have no clue if we stand to gain from them, or if they are on the roadmap:
> 
>   a) Will context switches use the actual CR3 register?

Yes.
>   b) Will IOPL live in the actual FLAGS register?

Yes.
>   c) Will guest-usable CPU features show up in CPUID, and will
> features that shouldn't be used *not* show up in CPUID? For

Correct.
> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
> on a CPU that supports MPX, what will happen? What about with HVMlite?

It should boot normally, as either a PV dom0 guest or an HVMLite dom0
guest (whenever that is operational).


>   d)  Will acking an interrupt use the standard APIC mechanism?  Do
> any of the current Xen variants do that?

Yes and no. It is disabled right now, but it will be enabled as I want
to make sure we can use vAPIC in the guest.

You don't want to use the emulated APIC mechanism, as it incurs VMEXITs.

>   e) Can timing use RDTSC?

Yes (odd question? You could always use rdtsc). 
> 
>  Luis


* Re: HVMlite gains
  2016-03-15 22:05         ` Andrew Cooper
@ 2016-03-16  3:18           ` Andy Lutomirski
  2016-03-16 10:46             ` Andrew Cooper
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-16  3:18 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
	Luis R. Rodriguez

On Mar 15, 2016 3:05 PM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
> On 15/03/2016 21:52, Andy Lutomirski wrote:
> > On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
> > <andrew.cooper3@citrix.com> wrote:
> >> On 15/03/2016 21:36, Andy Lutomirski wrote:
> >>>>>   e) Can timing use RDTSC?
> >>>> I don't understand this question in the context of the others.  RDTSC
> >>>> has (as far as I can tell) always been advertised and available for
> >>>> guest use.  RDTSCP is a different matter, and I have half-fixed that
> >>>> brokenness; it should now work correctly in HVM guests.
> >>>>
> >>> These questions mostly came from me, and they weren't necessarily
> >>> intended to make sense as a coherent whole :)  They were more of a
> >>> random collection of things I was wondering about to varying extents.
> >>>
> >>> What I mean is:  if we point sched_clock at RDTSC and try to use the
> >>> regular TSC timesource in a guest, will it work reasonably well,
> >>> assuming that the underlying hardware supports it?  And, if the
> >>> underlying hardware doesn't support it (e.g. not constant / invariant
> >>> or no TSC offsetting available or similar), will the hypervisor tell
> >>> the guest this fact via CPUID so that the standard guest clocksource
> >>> code doesn't try to use a non-working TSC?
> >> In principle yes, but it is rather more complicated than that.
> >>
> >> By default, if you want a guest to be migrateable and you can't
> >> guarantee that you will have hardware TSC scaling support on every
> >> future destination, you cannot advertise the TSC as stable to the
> >> guest.  We err on the side of caution and don't advertise invariance by
> >> default.
> >>
> >> In practice, if you are running on anything vaguely modern, the TSC will
> >> be reliable between migrates.
> > By "reliable" do you mean monotonic and not horribly jumpy?  I thought
> > there was no shipping hardware with TSC scaling.
>
> AMD have had TSC scaling for a long time (code added to Xen in 2011).
> Intel are the ones late to the party in this case.
>
> There was a patch series from Joao around Christmas, "x86/time:
> PVCLOCK_TSC_STABLE_BIT support", which identified several bugs with Xen's
> TSC handling as visible in the PVCLK.  It would be nice to get those
> bugs fixed.
>
> >
> >> What the migration protocol currently lacks is a mechanism to identify
> >> "This VM was advertised invariant TSC at frequency $X when it was
> >> booted".  There is nominally a "no migrate" flag which can be set, at
> >> which point invariance will be advertised if the host is capable.
> >> However, there is no way for the toolstack to query this, so nothing in
> >> the migrate code checks or acts upon it.
> >>
> >> Windows have worked around this limitation with the Viridian spec,
> >> whereby the hypervisor can provide the current TSC frequency, and
> >> promises that it won't change until the next suspend/resume, at which
> >> point the frequency will be resampled.
> >>
> > That's simpler and maybe even better than the pvclock design, at least
> > as implemented by KVM.  Sigh.
>
> Updates to that also need fixing.  PVCLOCK is a Xen ABI which was
> borrowed by KVM then locally modified.
>
> I believe the two are still compatible.
>
> But yes - the Viridian way does appear substantially more sane.
>

Hmm. Is migration synchronous enough for that approach to be reliable?
 That is, when the TSC frequency changes, is there some way that the
guest is guaranteed to be notified before it starts screwing up its
timing calculations?

> ~Andrew


* Re: HVMlite gains
  2016-03-16  3:18           ` Andy Lutomirski
@ 2016-03-16 10:46             ` Andrew Cooper
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2016-03-16 10:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
	Luis R. Rodriguez

On 16/03/16 03:18, Andy Lutomirski wrote:
> On Mar 15, 2016 3:05 PM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:52, Andy Lutomirski wrote:
>>> On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
>>> <andrew.cooper3@citrix.com> wrote:
>>>> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>>>>>>   e) Can timing use RDTSC?
>>>>>> I don't understand this question in the context of the others.  RDTSC
>>>>>> has (as far as I can tell) always been advertised and available for
>>>>>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>>>>>> brokenness; it should now work correctly in HVM guests.
>>>>>>
>>>>> These questions mostly came from me, and they weren't necessarily
>>>>> intended to make sense as a coherent whole :)  They were more of a
>>>>> random collection of things I was wondering about to varying extents.
>>>>>
>>>>> What I mean is:  if we point sched_clock at RDTSC and try to use the
>>>>> regular TSC timesource in a guest, will it work reasonably well,
>>>>> assuming that the underlying hardware supports it?  And, if the
>>>>> underlying hardware doesn't support it (e.g. not constant / invariant
>>>>> or no TSC offsetting available or similar), will the hypervisor tell
>>>>> the guest this fact via CPUID so that the standard guest clocksource
>>>>> code doesn't try to use a non-working TSC?
>>>> In principle yes, but it is rather more complicated than that.
>>>>
>>>> By default, if you want a guest to be migrateable and you can't
>>>> guarantee that you will have hardware TSC scaling support on every
>>>> future destination, you cannot advertise the TSC as stable to the
>>>> guest.  We err on the side of caution and don't advertise invariance by
>>>> default.
>>>>
>>>> In practice, if you are running on anything vaguely modern, the TSC will
>>>> be reliable between migrates.
>>> By "reliable" do you mean monotonic and not horribly jumpy?  I thought
>>> there was no shipping hardware with TSC scaling.
>> AMD have had TSC scaling for a long time (code added to Xen in 2011).
>> Intel are the ones late to the party in this case.
>>
>> There was a patch series from Joao around Christmas, "x86/time:
>> PVCLOCK_TSC_STABLE_BIT support", which identified several bugs with Xen's
>> TSC handling as visible in the PVCLK.  It would be nice to get those
>> bugs fixed.
>>
>>>> What the migration protocol currently lacks is a mechanism to identify
>>>> "This VM was advertised invariant TSC at frequency $X when it was
>>>> booted".  There is nominally a "no migrate" flag which can be set, at
>>>> which point invariance will be advertised if the host is capable.
>>>> However, there is no way for the toolstack to query this, so nothing in
>>>> the migrate code checks or acts upon it.
>>>>
>>>> Windows have worked around this limitation with the Viridian spec,
>>>> whereby the hypervisor can provide the current TSC frequency, and
>>>> promises that it won't change until the next suspend/resume, at which
>>>> point the frequency will be resampled.
>>>>
>>> That's simpler and maybe even better than the pvclock design, at least
>>> as implemented by KVM.  Sigh.
>> Updates to that also need fixing.  PVCLOCK is a Xen ABI which was
>> borrowed by KVM then locally modified.
>>
>> I believe the two are still compatible.
>>
>> But yes - the Viridian way does appear substantially more sane.
>>
> Hmm. Is migration synchronous enough for that approach to be reliable?
>  That is, when the TSC frequency changes, is there some way that the
> guest is guaranteed to be notified before it starts screwing up its
> timing calculations?

For VMs which do not have any Xen PV drivers, it is possible to migrate
them without any cooperation at all.  In this case, there is no
practical way to indicate that the TSC frequency has changed.

For VMs which do have PV drivers, migration requires guest cooperation,
or memory corruption will occur (pre-existing shared mappings can't have
writes tracked on them, so the guest driver is required to replay its
control ring on the destination side).  In this case, the guest always
passes through xen_suspend().  HYPERVISOR_suspend() is called on the
source side, and returns on the destination.
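
The cooperative case can be pictured with a toy model, assuming a PV
frontend that remembers its in-flight requests. Nothing here is real
driver code: Frontend, migrate, and the hypervisor_suspend callback
(standing in for HYPERVISOR_suspend(), which is entered on the source
and returns on the destination) are hypothetical.

```python
class Frontend:
    """Toy PV frontend that tracks requests the backend has not completed."""

    def __init__(self):
        self.ring = []        # requests visible to the current backend
        self.in_flight = {}   # request id -> payload, kept until completion
        self.next_id = 0

    def submit(self, payload):
        req_id = self.next_id
        self.next_id += 1
        self.in_flight[req_id] = payload
        self.ring.append((req_id, payload))
        return req_id

    def complete(self, req_id):
        # The backend finished this request; it no longer needs replaying.
        del self.in_flight[req_id]
        self.ring = [entry for entry in self.ring if entry[0] != req_id]

    def replay(self):
        # The destination starts with a fresh ring; re-queue what is pending.
        self.ring = sorted(self.in_flight.items())


def migrate(frontend, hypervisor_suspend):
    """Model the guest side of a cooperative migration."""
    hypervisor_suspend()   # entered on the source, returns on the destination
    frontend.replay()
```

Because the guest is guaranteed to run code at this point, it is also a
natural place to pick up a changed TSC frequency.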

~Andrew


