* HVMlite gains
@ 2016-03-15 21:14 Luis R. Rodriguez
2016-03-15 21:29 ` Andrew Cooper
2016-03-15 22:39 ` Konrad Rzeszutek Wilk
0 siblings, 2 replies; 10+ messages in thread
From: Luis R. Rodriguez @ 2016-03-15 21:14 UTC (permalink / raw)
To: xen-devel; +Cc: Matt Fleming, Boris Ostrovsky, Borislav Petkov, Andy Lutomirski
While discussing HVMLite with a few people a few questions have come
up. Since I only really understand a few possible gains with the
current design I wanted to get clarification on a few which I simply
have no clue if we stand to gain from them, or if it's on the roadmap:
a) Will context switches use the actual CR3 register?
b) Will IOPL live in the actual FLAGS register?
c) Will guest-usable CPU features show up in CPUID, and will
features that shouldn't be used *not* show up in CPUID? For
instance, if you currently boot Xen 4.4.0 with a new Linux dom0
on a CPU that supports MPX, what will happen? What about with HVMlite?
d) Will acking an interrupt use the standard APIC mechanism? Do
any of the current Xen variants do that?
e) Can timing use RDTSC?
Luis
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* Re: HVMlite gains
2016-03-15 21:14 HVMlite gains Luis R. Rodriguez
@ 2016-03-15 21:29 ` Andrew Cooper
2016-03-15 21:36 ` Andy Lutomirski
2016-03-15 22:39 ` Konrad Rzeszutek Wilk
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 21:29 UTC (permalink / raw)
To: Luis R. Rodriguez, xen-devel
Cc: Matt Fleming, Boris Ostrovsky, Borislav Petkov, Andy Lutomirski
On 15/03/2016 21:14, Luis R. Rodriguez wrote:
> While discussing HVMLite with a few people a few questions have come
> up. Since I only really understand a few possible gains with the
> current design I wanted to get clarification on a few which I simply
> have no clue if we stand to gain from them, or if it's on the roadmap:
>
> a) Will context switches use the actual CR3 register?
Yes. HVMLite will make full use of the hardware virtualisation
extensions, so it gets its own pagetables and cr3.
> b) Will IOPL live in the actual FLAGS register?
Yes
> c) Will guest-usable CPU features show up in CPUID, and will
> features that shouldn't be used *not* show up in CPUID? For
> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
> on a CPU that supports MPX, what will happen? What about with HVMlite?
I am desperately trying to fix the existing broken-ness. PV is too far
beyond repair when it comes to the OSXSAVE bit, but the rest is fine.
See my "x86: Improvements to cpuid handling for guests" series, v3 of
which was posted today.
With that series in place, a guest should always see correct features
(give or take some "fun" with masking, but at least the xen_cpuid() path
will still be correct).
> d) Will acking an interrupt use the standard APIC mechanism? Do
> any of the current Xen variants do that?
Emulated APIC will be enabled by default. The "PV event" path will be
available as an alternative.
If a user desperately wishes to avoid the emulated APIC, they have the
option to disable it in the domain config.
> e) Can timing use RDTSC?
I don't understand this question in the context of the others. RDTSC
has (as far as I can tell) always been advertised and available for
guest use. RDTSCP is a different matter, and I have half-fixed that
brokenness; it should now work correctly in HVM guests.
~Andrew
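
As an illustration of point (c) above, here is a minimal sketch of how a guest might decide from CPUID alone whether MPX is usable. The bit positions are the architectural ones (CPUID.1:ECX[26] for XSAVE, CPUID.1:ECX[27] for OSXSAVE, CPUID.(EAX=7,ECX=0):EBX[14] for MPX). The struct and helper names are hypothetical and are not taken from Xen or Linux, and a real kernel checks more conditions than this (for example the XCR0 state components).

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values returned by one CPUID invocation. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Architectural bit positions. */
#define CPUID_1_ECX_XSAVE   (1u << 26) /* CPUID.1:ECX[26] */
#define CPUID_1_ECX_OSXSAVE (1u << 27) /* CPUID.1:ECX[27], mirrors CR4.OSXSAVE */
#define CPUID_7_0_EBX_MPX   (1u << 14) /* CPUID.(EAX=7,ECX=0):EBX[14] */

/* True if the OS has already enabled XSAVE (CR4.OSXSAVE is set). */
static bool cpuid_has_osxsave(const struct cpuid_regs *leaf1)
{
    return (leaf1->ecx & CPUID_1_ECX_OSXSAVE) != 0;
}

/*
 * Hypothetical helper: MPX state is managed through XSAVE, so a guest
 * kernel should only consider MPX when both features are advertised.
 */
static bool guest_may_use_mpx(const struct cpuid_regs *leaf1,
                              const struct cpuid_regs *leaf7)
{
    return (leaf7->ebx & CPUID_7_0_EBX_MPX) != 0 &&
           (leaf1->ecx & CPUID_1_ECX_XSAVE) != 0;
}
```

The point of the question is that a guest should be able to rely on exactly this kind of check: if the hypervisor cannot support a feature for the guest, the corresponding bit should read as zero.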
* Re: HVMlite gains
2016-03-15 21:29 ` Andrew Cooper
@ 2016-03-15 21:36 ` Andy Lutomirski
2016-03-15 21:50 ` Boris Ostrovsky
2016-03-15 21:50 ` Andrew Cooper
0 siblings, 2 replies; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-15 21:36 UTC (permalink / raw)
To: Andrew Cooper
Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
Boris Ostrovsky
On Tue, Mar 15, 2016 at 2:29 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 15/03/2016 21:14, Luis R. Rodriguez wrote:
>> While discussing HVMLite with a few people a few questions have come
>> up. Since I only really understand a few possible gains with the
>> current design I wanted to get clarification on a few which I simply
>> have no clue if we stand to gain from them, or if it's on the roadmap:
>>
>> a) Will context switches use the actual CR3 register?
>
> Yes. HVMLite will make full use of the hardware virtualisation
> extensions, so it gets its own pagetables and cr3.
>
>> b) Will IOPL live in the actual FLAGS register?
>
> Yes
>
>> c) Will guest-usable CPU features show up in CPUID, and will
>> features that shouldn't be used *not* show up in CPUID? For
>> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
>> on a CPU that supports MPX, what will happen? What about with HVMlite?
>
> I am desperately trying to fix the existing broken-ness. PV is too far
> beyond repair when it comes to the OSXSAVE bit, but the rest is fine.
> See my "x86: Improvements to cpuid handling for guests" series, v3 of
> which was posted today.
>
> With that series in place, a guest should always see correct features
> (give or take some "fun" with masking, but at least the xen_cpuid() path
> will still be correct).
>
>> d) Will acking an interrupt use the standard APIC mechanism? Do
>> any of the current Xen variants do that?
>
> Emulated APIC will be enabled by default. The "PV event" path will be
> available as an alternative.
>
> If a user desperately wishes to avoid the emulated APIC, they have the
> option to disable it in the domain config.
>
>> e) Can timing use RDTSC?
>
> I don't understand this question in the context of the others. RDTSC
> has (as far as I can tell) always been advertised and available for
> guest use. RDTSCP is a different matter, and I have half-fixed that
> brokenness; it should now work correctly in HVM guests.
>
These questions mostly came from me, and they weren't necessarily
intended to make sense as a coherent whole :) They were more of a
random collection of things I was wondering about to varying extents.
What I mean is: if we point sched_clock at RDTSC and try to use the
regular TSC timesource in a guest, will it work reasonably well,
assuming that the underlying hardware supports it? And, if the
underlying hardware doesn't support it (e.g. not constant / invariant
or no TSC offsetting available or similar), will the hypervisor tell
the guest this fact via CPUID so that the standard guest clocksource
code doesn't try to use a non-working TSC?
--Andy
* Re: HVMlite gains
2016-03-15 21:36 ` Andy Lutomirski
@ 2016-03-15 21:50 ` Boris Ostrovsky
2016-03-15 21:50 ` Andrew Cooper
1 sibling, 0 replies; 10+ messages in thread
From: Boris Ostrovsky @ 2016-03-15 21:50 UTC (permalink / raw)
To: Andy Lutomirski, Andrew Cooper
Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov
On 03/15/2016 05:36 PM, Andy Lutomirski wrote:
> On Tue, Mar 15, 2016 at 2:29 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:14, Luis R. Rodriguez wrote:
>>> While discussing HVMLite with a few people a few questions have come
>>> up. Since I only really understand a few possible gains with the
>>> current design I wanted to get clarification on a few which I simply
>>> have no clue if we stand to gain from them, or if it's on the roadmap:
>>>
>>> a) Will context switches use the actual CR3 register?
>> Yes. HVMLite will make full use of the hardware virtualisation
>> extensions, so it gets its own pagetables and cr3.
>>
>>> b) Will IOPL live in the actual FLAGS register?
>> Yes
>>
>>> c) Will guest-usable CPU features show up in CPUID, and will
>>> features that shouldn't be used *not* show up in CPUID? For
>>> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
>>> on a CPU that supports MPX, what will happen? What about with HVMlite?
>> I am desperately trying to fix the existing broken-ness. PV is too far
>> beyond repair when it comes to the OSXSAVE bit, but the rest is fine.
>> See my "x86: Improvements to cpuid handling for guests" series, v3 of
>> which was posted today.
>>
>> With that series in place, a guest should always see correct features
>> (give or take some "fun" with masking, but at least the xen_cpuid() path
>> will still be correct).
>>
>>> d) Will acking an interrupt use the standard APIC mechanism? Do
>>> any of the current Xen variants do that?
>> Emulated APIC will be enabled by default. The "PV event" path will be
>> available as an alternative.
>>
>> If a user desperately wishes to avoid the emulated APIC, they have the
>> option to disable it in the domain config.
>>
>>> e) Can timing use RDTSC?
>> I don't understand this question in the context of the others. RDTSC
>> has (as far as I can tell) always been advertised and available for
>> guest use. RDTSCP is a different matter, and I have half-fixed that
>> brokenness; it should now work correctly in HVM guests.
>>
> These questions mostly came from me, and they weren't necessarily
> intended to make sense as a coherent whole :) They were more of a
> random collection of things I was wondering about to varying extents.
>
> What I mean is: if we point sched_clock at RDTSC and try to use the
> regular TSC timesource in a guest, will it work reasonably well,
> assuming that the underlying hardware supports it? And, if the
> underlying hardware doesn't support it (e.g. not constant / invariant
> or no TSC offsetting available or similar), will the hypervisor tell
> the guest this fact via CPUID so that the standard guest clocksource
> code doesn't try to use a non-working TSC?
The hypervisor typically clears the TSC Invariant bit because the guest
can migrate to a system with a different clock:
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/domain.c;h=a6d721bd48176f51c0d9dfb57099c8b7f52220c2;hb=HEAD#l2612
There are options (in the guest config file) to have the hypervisor set this flag.
-boris
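
The masking described above can be sketched as follows. The Invariant TSC flag is architecturally CPUID.80000007H:EDX[8]. The policy function and its `may_migrate` parameter are hypothetical and only illustrate the idea; they are not Xen's actual code, which lives at the link above.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.80000007H:EDX[8] is the architectural "Invariant TSC" flag. */
#define CPUID_EXT7_EDX_INVARIANT_TSC (1u << 8)

/*
 * Hypothetical hypervisor-side policy: compute the EDX value shown to
 * the guest for leaf 0x80000007. A guest that may migrate to a host
 * with a different clock does not get the invariance promise.
 */
static uint32_t guest_leaf_80000007_edx(uint32_t host_edx, bool may_migrate)
{
    if (may_migrate)
        return host_edx & ~CPUID_EXT7_EDX_INVARIANT_TSC;
    return host_edx;
}

/* Guest side: only treat the TSC as a stable clocksource if the flag is visible. */
static bool guest_sees_invariant_tsc(uint32_t guest_edx)
{
    return (guest_edx & CPUID_EXT7_EDX_INVARIANT_TSC) != 0;
}
```

With this shape, the standard guest clocksource code needs no Xen-specific knowledge: it simply does not see the flag and falls back to another time source.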
* Re: HVMlite gains
2016-03-15 21:36 ` Andy Lutomirski
2016-03-15 21:50 ` Boris Ostrovsky
@ 2016-03-15 21:50 ` Andrew Cooper
2016-03-15 21:52 ` Andy Lutomirski
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 21:50 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
Boris Ostrovsky
On 15/03/2016 21:36, Andy Lutomirski wrote:
>
>>> e) Can timing use RDTSC?
>> I don't understand this question in the context of the others. RDTSC
>> has (as far as I can tell) always been advertised and available for
>> guest use. RDTSCP is a different matter, and I have half-fixed that
>> brokenness; it should now work correctly in HVM guests.
>>
> These questions mostly came from me, and they weren't necessarily
> intended to make sense as a coherent whole :) They were more of a
> random collection of things I was wondering about to varying extents.
>
> What I mean is: if we point sched_clock at RDTSC and try to use the
> regular TSC timesource in a guest, will it work reasonably well,
> assuming that the underlying hardware supports it? And, if the
> underlying hardware doesn't support it (e.g. not constant / invariant
> or no TSC offsetting available or similar), will the hypervisor tell
> the guest this fact via CPUID so that the standard guest clocksource
> code doesn't try to use a non-working TSC?
In principle yes, but it is rather more complicated than that.
By default, if you want a guest to be migrateable and you can't
guarantee that you will have hardware TSC scaling support on every
future destination, you cannot advertise the TSC as stable to the
guest. We err on the side of caution and don't advertise invariance by
default.
In practice, if you are running on anything vaguely modern, the TSC will
be reliable between migrates.
What the migration protocol currently lacks is a mechanism to identify
"This VM was advertised invariant TSC at frequency $X when it was
booted". There is nominally a "no migrate" flag which can be set, at
which point invariance will be advertised if the host is capable.
However, there is no way for the toolstack to query this, so nothing in
the migrate code checks or acts upon it.
Windows have worked around this limitation with the Viridian spec,
whereby the hypervisor can provide the current TSC frequency, and
promises that it won't change until the next suspend/resume, at which
point the frequency will be resampled.
~Andrew
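
The Viridian-style contract described above (a frequency that is fixed until the next suspend/resume, then resampled) could look roughly like this on the guest side. This is a sketch under assumptions: the struct, field and function names are invented for illustration, and how the frequency is actually obtained from the hypervisor is left out entirely.

```c
#include <stdint.h>

/* Hypothetical guest-side clock state; names are illustrative only. */
struct tsc_clock {
    uint64_t freq_hz;  /* TSC frequency reported by the hypervisor */
    uint64_t base_tsc; /* TSC value sampled when freq_hz was read */
    uint64_t base_ns;  /* guest time (ns) at that moment */
};

/*
 * Convert a raw TSC reading to nanoseconds. The delta is split into
 * whole seconds and a remainder so that the multiplication by 1e9
 * stays within 64 bits for frequencies below roughly 18 GHz.
 */
static uint64_t tsc_clock_now_ns(const struct tsc_clock *c, uint64_t tsc)
{
    uint64_t delta = tsc - c->base_tsc;
    uint64_t secs = delta / c->freq_hz;
    uint64_t rem = delta % c->freq_hz;

    return c->base_ns + secs * 1000000000ull +
           rem * 1000000000ull / c->freq_hz;
}

/*
 * Called on resume: carry guest time forward up to the TSC value seen
 * at suspend, then adopt the new host's TSC base and frequency.
 */
static void tsc_clock_resample(struct tsc_clock *c, uint64_t tsc_at_suspend,
                               uint64_t tsc_at_resume, uint64_t new_freq_hz)
{
    c->base_ns = tsc_clock_now_ns(c, tsc_at_suspend);
    c->base_tsc = tsc_at_resume;
    c->freq_hz = new_freq_hz;
}
```

The design only works if the guest is guaranteed to run the resample step before it reads the clock again on the new host.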
* Re: HVMlite gains
2016-03-15 21:50 ` Andrew Cooper
@ 2016-03-15 21:52 ` Andy Lutomirski
2016-03-15 22:05 ` Andrew Cooper
0 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-15 21:52 UTC (permalink / raw)
To: Andrew Cooper
Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
Boris Ostrovsky
On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>
>>>> e) Can timing use RDTSC?
>>> I don't understand this question in the context of the others. RDTSC
>>> has (as far as I can tell) always been advertised and available for
>>> guest use. RDTSCP is a different matter, and I have half-fixed that
>>> brokenness; it should now work correctly in HVM guests.
>>>
>> These questions mostly came from me, and they weren't necessarily
>> intended to make sense as a coherent whole :) They were more of a
>> random collection of things I was wondering about to varying extents.
>>
>> What I mean is: if we point sched_clock at RDTSC and try to use the
>> regular TSC timesource in a guest, will it work reasonably well,
>> assuming that the underlying hardware supports it? And, if the
>> underlying hardware doesn't support it (e.g. not constant / invariant
>> or no TSC offsetting available or similar), will the hypervisor tell
>> the guest this fact via CPUID so that the standard guest clocksource
>> code doesn't try to use a non-working TSC?
>
> In principle yes, but it is rather more complicated than that.
>
> By default, if you want a guest to be migrateable and you can't
> guarantee that you will have hardware TSC scaling support on every
> future destination, you cannot advertise the TSC as stable to the
> guest. We err on the side of caution and don't advertise invariance by
> default.
>
> In practice, if you are running on anything vaguely modern, the TSC will
> be reliable between migrates.
By "reliable" do you mean monotonic and not horribly jumpy? I thought
there was no shipping hardware with TSC scaling.
>
> What the migration protocol currently lacks is a mechanism to identify
> "This VM was advertised invariant TSC at frequency $X when it was
> booted". There is nominally a "no migrate" flag which can be set, at
> which point invariance will be advertised if the host is capable.
> However, there is no way for the toolstack to query this, so nothing in
> the migrate code checks or acts upon it.
>
> Windows have worked around this limitation with the Viridian spec,
> whereby the hypervisor can provide the current TSC frequency, and
> promises that it won't change until the next suspend/resume, at which
> point the frequency will be resampled.
>
That's simpler and maybe even better than the pvclock design, at least
as implemented by KVM. Sigh.
--Andy
* Re: HVMlite gains
2016-03-15 21:52 ` Andy Lutomirski
@ 2016-03-15 22:05 ` Andrew Cooper
2016-03-16 3:18 ` Andy Lutomirski
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2016-03-15 22:05 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Matt Fleming, Luis R. Rodriguez, xen-devel, Borislav Petkov,
Boris Ostrovsky
On 15/03/2016 21:52, Andy Lutomirski wrote:
> On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>>>> e) Can timing use RDTSC?
>>>> I don't understand this question in the context of the others. RDTSC
>>>> has (as far as I can tell) always been advertised and available for
>>>> guest use. RDTSCP is a different matter, and I have half-fixed that
>>>> brokenness; it should now work correctly in HVM guests.
>>>>
>>> These questions mostly came from me, and they weren't necessarily
>>> intended to make sense as a coherent whole :) They were more of a
>>> random collection of things I was wondering about to varying extents.
>>>
>>> What I mean is: if we point sched_clock at RDTSC and try to use the
>>> regular TSC timesource in a guest, will it work reasonably well,
>>> assuming that the underlying hardware supports it? And, if the
>>> underlying hardware doesn't support it (e.g. not constant / invariant
>>> or no TSC offsetting available or similar), will the hypervisor tell
>>> the guest this fact via CPUID so that the standard guest clocksource
>>> code doesn't try to use a non-working TSC?
>> In principle yes, but it is rather more complicated than that.
>>
>> By default, if you want a guest to be migrateable and you can't
>> guarantee that you will have hardware TSC scaling support on every
>> future destination, you cannot advertise the TSC as stable to the
>> guest. We err on the side of caution and don't advertise invariance by
>> default.
>>
>> In practice, if you are running on anything vaguely modern, the TSC will
>> be reliable between migrates.
> By "reliable" do you mean monotonic and not horribly jumpy? I thought
> there was no shipping hardware with TSC scaling.
AMD have had TSC scaling for a long time (code added to Xen in 2011).
Intel are the ones late to the party in this case.
There was a patch series from Joao around about Christmas "x86/time:
PVCLOCK_TSC_STABLE_BIT support" which identified several bugs with Xen's
TSC handling as visible in the PVCLK. It would be nice to get those
bugs fixed.
>
>> What the migration protocol currently lacks is a mechanism to identify
>> "This VM was advertised invariant TSC at frequency $X when it was
>> booted". There is nominally a "no migrate" flag which can be set, at
>> which point invariance will be advertised if the host is capable.
>> However, there is no way for the toolstack to query this, so nothing in
>> the migrate code checks or acts upon it.
>>
>> Windows have worked around this limitation with the Viridian spec,
>> whereby the hypervisor can provide the current TSC frequency, and
>> promises that it won't change until the next suspend/resume, at which
>> point the frequency will be resampled.
>>
> That's simpler and maybe even better than the pvclock design, at least
> as implemented by KVM. Sigh.
Updates to that also need fixing. PVCLOCK is a Xen ABI which was
borrowed by KVM then locally modified.
I believe the two are still compatible.
But yes - the Viridian way does appear substantially more sane.
~Andrew
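
For comparison with the Viridian approach, here is a sketch of the arithmetic a pvclock reader performs. The record layout is modelled on the per-vCPU time record used by Linux guests; treat the exact field names as an assumption. The helper names are hypothetical, and a real reader also needs memory barriers around the version checks, which are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout modelled on the pvclock per-vCPU time record (assumed names). */
struct pvclock_time_info {
    uint32_t version;           /* odd while the hypervisor is updating */
    uint32_t pad0;
    uint64_t tsc_timestamp;     /* TSC value at the last update */
    uint64_t system_time;       /* nanoseconds at the last update */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
    uint8_t  flags;
    uint8_t  pad[2];
};

/* A snapshot is usable only if the version was even and did not change. */
static bool pvclock_version_stable(uint32_t before, uint32_t after)
{
    return before == after && (before & 1u) == 0;
}

/* Scale a TSC delta to nanoseconds: (shifted delta * mul) >> 32. */
static uint64_t pvclock_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    /* Split the multiply so no 128-bit intermediate is needed. */
    return (delta >> 32) * mul + (((delta & 0xffffffffull) * mul) >> 32);
}

/* Compute guest time in nanoseconds from a consistent snapshot. */
static uint64_t pvclock_ns(const struct pvclock_time_info *ti, uint64_t tsc)
{
    uint64_t delta = tsc - ti->tsc_timestamp;

    return ti->system_time +
           pvclock_scale_delta(delta, ti->tsc_to_system_mul, ti->tsc_shift);
}
```

Every read has to go through the version check and the scaling step, whereas the Viridian scheme hands the guest a single frequency that stays fixed between suspend/resume events.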
* Re: HVMlite gains
2016-03-15 21:14 HVMlite gains Luis R. Rodriguez
2016-03-15 21:29 ` Andrew Cooper
@ 2016-03-15 22:39 ` Konrad Rzeszutek Wilk
1 sibling, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-03-15 22:39 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
Andy Lutomirski
On Tue, Mar 15, 2016 at 02:14:15PM -0700, Luis R. Rodriguez wrote:
> While discussing HVMLite with a few people a few questions have come
> up. Since I only really understand a few possible gains with the
> current design I wanted to get clarification on a few which I simply
Just think of baremetal without BIOS. Without PCI support (unless
needed).
> have no clue if we stand to gain from them, or if it's on the roadmap:
>
> a) Will context switches use the actual CR3 register?
Yes.
> b) Will IOPL live in the actual FLAGS register?
Yes.
> c) Will guest-usable CPU features show up in CPUID, and will
> features that shouldn't be used *not* show up in CPUID? For
Correct.
> instance, if you currently boot Xen 4.4.0 with a new Linux dom0
> on a CPU that supports MPX, what will happen? What about with HVMlite?
It should boot normally, either as a PV dom0 guest or as an HVMLite
dom0 guest (whenever that is operational).
> d) Will acking an interrupt use the standard APIC mechanism? Do
> any of the current Xen variants do that?
Yes and no. It is disabled right now but it will be enabled as I want
to make sure we can use vAPIC in the guest.
You don't want to use the emulated APIC mechanism, as it incurs VMEXITs.
> e) Can timing use RDTSC?
Yes (odd question? You could always use rdtsc).
>
> Luis
* Re: HVMlite gains
2016-03-15 22:05 ` Andrew Cooper
@ 2016-03-16 3:18 ` Andy Lutomirski
2016-03-16 10:46 ` Andrew Cooper
0 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-03-16 3:18 UTC (permalink / raw)
To: Andrew Cooper
Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
Luis R. Rodriguez
On Mar 15, 2016 3:05 PM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
> On 15/03/2016 21:52, Andy Lutomirski wrote:
> > On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
> > <andrew.cooper3@citrix.com> wrote:
> >> On 15/03/2016 21:36, Andy Lutomirski wrote:
> >>>>> e) Can timing use RDTSC?
> >>>> I don't understand this question in the context of the others. RDTSC
> >>>> has (as far as I can tell) always been advertised and available for
> >>>> guest use. RDTSCP is a different matter, and I have half-fixed that
> >>>> brokenness; it should now work correctly in HVM guests.
> >>>>
> >>> These questions mostly came from me, and they weren't necessarily
> >>> intended to make sense as a coherent whole :) They were more of a
> >>> random collection of things I was wondering about to varying extents.
> >>>
> >>> What I mean is: if we point sched_clock at RDTSC and try to use the
> >>> regular TSC timesource in a guest, will it work reasonably well,
> >>> assuming that the underlying hardware supports it? And, if the
> >>> underlying hardware doesn't support it (e.g. not constant / invariant
> >>> or no TSC offsetting available or similar), will the hypervisor tell
> >>> the guest this fact via CPUID so that the standard guest clocksource
> >>> code doesn't try to use a non-working TSC?
> >> In principle yes, but it is rather more complicated than that.
> >>
> >> By default, if you want a guest to be migrateable and you can't
> >> guarantee that you will have hardware TSC scaling support on every
> >> future destination, you cannot advertise the TSC as stable to the
> >> guest. We err on the side of caution and don't advertise invariance by
> >> default.
> >>
> >> In practice, if you are running on anything vaguely modern, the TSC will
> >> be reliable between migrates.
> > By "reliable" do you mean monotonic and not horribly jumpy? I thought
> > there was no shipping hardware with TSC scaling.
>
> AMD have had TSC scaling for a long time (code added to Xen in 2011).
> Intel are the ones late to the party in this case.
>
> There was a patch series from Joao around about Christmas "x86/time:
> PVCLOCK_TSC_STABLE_BIT support" which identified several bugs with Xen's
> TSC handling as visible in the PVCLK. It would be nice to get those
> bugs fixed.
>
> >
> >> What the migration protocol currently lacks is a mechanism to identify
> >> "This VM was advertised invariant TSC at frequency $X when it was
> >> booted". There is nominally a "no migrate" flag which can be set, at
> >> which point invariance will be advertised if the host is capable.
> >> However, there is no way for the toolstack to query this, so nothing in
> >> the migrate code checks or acts upon it.
> >>
> >> Windows have worked around this limitation with the Viridian spec,
> >> whereby the hypervisor can provide the current TSC frequency, and
> >> promises that it won't change until the next suspend/resume, at which
> >> point the frequency will be resampled.
> >>
> > That's simpler and maybe even better than the pvclock design, at least
> > as implemented by KVM. Sigh.
>
> Updates to that also need fixing. PVCLOCK is a Xen ABI which was
> borrowed by KVM then locally modified.
>
> I believe the two are still compatible.
>
> But yes - the Viridian way does appear substantially more sane.
>
Hmm. Is migration synchronous enough for that approach to be reliable?
That is, when the TSC frequency changes, is there some way that the
guest is guaranteed to be notified before it starts screwing up its
timing calculations?
> ~Andrew
* Re: HVMlite gains
2016-03-16 3:18 ` Andy Lutomirski
@ 2016-03-16 10:46 ` Andrew Cooper
0 siblings, 0 replies; 10+ messages in thread
From: Andrew Cooper @ 2016-03-16 10:46 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Matt Fleming, Boris Ostrovsky, xen-devel, Borislav Petkov,
Luis R. Rodriguez
On 16/03/16 03:18, Andy Lutomirski wrote:
> On Mar 15, 2016 3:05 PM, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>> On 15/03/2016 21:52, Andy Lutomirski wrote:
>>> On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
>>> <andrew.cooper3@citrix.com> wrote:
>>>> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>>>>>> e) Can timing use RDTSC?
>>>>>> I don't understand this question in the context of the others. RDTSC
>>>>>> has (as far as I can tell) always been advertised and available for
>>>>>> guest use. RDTSCP is a different matter, and I have half-fixed that
>>>>>> brokenness; it should now work correctly in HVM guests.
>>>>>>
>>>>> These questions mostly came from me, and they weren't necessarily
>>>>> intended to make sense as a coherent whole :) They were more of a
>>>>> random collection of things I was wondering about to varying extents.
>>>>>
>>>>> What I mean is: if we point sched_clock at RDTSC and try to use the
>>>>> regular TSC timesource in a guest, will it work reasonably well,
>>>>> assuming that the underlying hardware supports it? And, if the
>>>>> underlying hardware doesn't support it (e.g. not constant / invariant
>>>>> or no TSC offsetting available or similar), will the hypervisor tell
>>>>> the guest this fact via CPUID so that the standard guest clocksource
>>>>> code doesn't try to use a non-working TSC?
>>>> In principle yes, but it is rather more complicated than that.
>>>>
>>>> By default, if you want a guest to be migrateable and you can't
>>>> guarantee that you will have hardware TSC scaling support on every
>>>> future destination, you cannot advertise the TSC as stable to the
>>>> guest. We err on the side of caution and don't advertise invariance by
>>>> default.
>>>>
>>>> In practice, if you are running on anything vaguely modern, the TSC will
>>>> be reliable between migrates.
>>> By "reliable" do you mean monotonic and not horribly jumpy? I thought
>>> there was no shipping hardware with TSC scaling.
>> AMD have had TSC scaling for a long time (code added to Xen in 2011).
>> Intel are the ones late to the party in this case.
>>
>> There was a patch series from Joao around about Christmas "x86/time:
>> PVCLOCK_TSC_STABLE_BIT support" which identified several bugs with Xen's
>> TSC handling as visible in the PVCLK. It would be nice to get those
>> bugs fixed.
>>
>>>> What the migration protocol currently lacks is a mechanism to identify
>>>> "This VM was advertised invariant TSC at frequency $X when it was
>>>> booted". There is nominally a "no migrate" flag which can be set, at
>>>> which point invariance will be advertised if the host is capable.
>>>> However, there is no way for the toolstack to query this, so nothing in
>>>> the migrate code checks or acts upon it.
>>>>
>>>> Windows have worked around this limitation with the Viridian spec,
>>>> whereby the hypervisor can provide the current TSC frequency, and
>>>> promises that it won't change until the next suspend/resume, at which
>>>> point the frequency will be resampled.
>>>>
>>> That's simpler and maybe even better than the pvclock design, at least
>>> as implemented by KVM. Sigh.
>> Updates to that also need fixing. PVCLOCK is a Xen ABI which was
>> borrowed by KVM then locally modified.
>>
>> I believe the two are still compatible.
>>
>> But yes - the Viridian way does appear substantially more sane.
>>
> Hmm. Is migration synchronous enough for that approach to be reliable?
> That is, when the TSC frequency changes, is there some way that the
> guest is guaranteed to be notified before it starts screwing up its
> timing calculations?
For VMs which do not have any Xen PV drivers, it is possible to migrate
them without any cooperation at all. In this case, there is no
practical way to indicate that the TSC frequency has changed.
For VMs which do have PV drivers, migration requires guest cooperation,
or memory corruption will occur (pre-existing shared mappings can't have
writes tracked on them, so the guest driver is required to replay its
control ring on the destination side). In this case, the guest always
passes through xen_suspend(). HYPERVISOR_suspend() is called on the
source side, and returns on the destination.
~Andrew
Thread overview: 10+ messages
2016-03-15 21:14 HVMlite gains Luis R. Rodriguez
2016-03-15 21:29 ` Andrew Cooper
2016-03-15 21:36 ` Andy Lutomirski
2016-03-15 21:50 ` Boris Ostrovsky
2016-03-15 21:50 ` Andrew Cooper
2016-03-15 21:52 ` Andy Lutomirski
2016-03-15 22:05 ` Andrew Cooper
2016-03-16 3:18 ` Andy Lutomirski
2016-03-16 10:46 ` Andrew Cooper
2016-03-15 22:39 ` Konrad Rzeszutek Wilk