All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFH]: AMD CR intercept for lmsw/clts
@ 2014-08-05  1:33 Mukesh Rathor
  2014-08-05  7:46 ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Mukesh Rathor @ 2014-08-05  1:33 UTC (permalink / raw)
  To: Xen-devel
  Cc: boris.ostrovsky, Aravind.Gopalakrishnan, suravee.suthikulpanit,
	Jan Beulich

Hi,

On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
can't be handled via svm_vmexit_do_cr_access and are emulated thru
handle_mmio() which is a problem for pvh because of:

handle_mmio():
..
    ASSERT(!is_pvh_vcpu(curr));

AMD CR intercepts in svm.c:
    case VMEXIT_CR0_READ ... VMEXIT_CR15_READ:
    case VMEXIT_CR0_WRITE ... VMEXIT_CR15_WRITE:
        if ( cpu_has_svm_decode && (vmcb->exitinfo1 & (1ULL << 63)) )
            svm_vmexit_do_cr_access(vmcb, regs);
        else if ( !handle_mmio() )     <==========
            hvm_inject_hw_exception(TRAP_gp_fault, 0);
    break;

Soooo, this leaves no choice but to make the ASSERT conditional
for intel only, and let handle_mmio go thru x86_emulate and let
x86_emulate fail for anything other than lmsw/clts? I was thinking
something like:

x86_emulate()
  int fail_pvh_emul = 1;
  ...
  case lmsw/clts:
     .....
     fail_pvh_emul = 0;

then
 done:
     if (fail_pvh_emul)
         rc = X86EMUL_UNHANDLEABLE;
     return rc;

Or, should I just create a new function for clts/lmsw and call it
directly from vmexit switch itself?

Can't think of any other clever way to do this... thoughts/suggestions?

thanks,
Mukesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05  1:33 [RFH]: AMD CR intercept for lmsw/clts Mukesh Rathor
@ 2014-08-05  7:46 ` Jan Beulich
  2014-08-05 11:16   ` Andrew Cooper
  2014-08-05 22:22   ` Mukesh Rathor
  0 siblings, 2 replies; 11+ messages in thread
From: Jan Beulich @ 2014-08-05  7:46 UTC (permalink / raw)
  To: Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

>>> On 05.08.14 at 03:33, <mukesh.rathor@oracle.com> wrote:
> Hi,
> 
> On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
> can't be handled via svm_vmexit_do_cr_access and are emulated thru
> handle_mmio() which is a problem for pvh because of:
> 
> handle_mmio():
> ..
>     ASSERT(!is_pvh_vcpu(curr));
> 
> AMD CR intercepts in svm.c:
>     case VMEXIT_CR0_READ ... VMEXIT_CR15_READ:
>     case VMEXIT_CR0_WRITE ... VMEXIT_CR15_WRITE:
>         if ( cpu_has_svm_decode && (vmcb->exitinfo1 & (1ULL << 63)) )
>             svm_vmexit_do_cr_access(vmcb, regs);
>         else if ( !handle_mmio() )     <==========
>             hvm_inject_hw_exception(TRAP_gp_fault, 0);
>     break;
> 
> Soooo, this leaves no choice but to make the ASSERT conditional
> for intel only, and let handle_mmio go thru x86_emulate and let
> x86_emulate fail for anything other than lmsw/clts? I was thinking
> something like:
> 
> x86_emulate()
>   int fail_pvh_emul = 1;
>   ...
>   case lmsw/clts:
>      .....
>      fail_pvh_emul = 0;
> 
> then
>  done:
>      if (fail_pvh_emul)
>          rc = X86EMUL_UNHANDLEABLE;
>      return rc;

I'd strongly recommend against adding any such to x86_emulate():
There's nothing precluding the emulator to be used for PVH guests.

> Or, should I just create a new function for clts/lmsw and call it
> directly from vmexit switch itself?

I'd prefer this - it seems pretty ugly to me that handle_mmio()/
x86_emulate() gets used for this purpose - but am not certain this
will actually work out nicely for other than CLTS: All the
instructions currently handled specially are ones with fixed
operands, and only CLTS fits that.

You'll btw have the same problem with SMSW and DRx accesses,
string I/O instructions, as well as (on older CPUs) with moves to/from
CRx and INVLPG.

In the case this doesn't turn out reasonable, rather than
manipulating handle_mmio() any further, I'd suggest to investigate
bypassing that function in favor of a more direct access to the x86
emulator. After all you're not after any MMIO emulation here, just
bare instructions (many of which without memory operands at all).

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05  7:46 ` Jan Beulich
@ 2014-08-05 11:16   ` Andrew Cooper
  2014-08-05 12:11     ` Jan Beulich
  2014-08-05 22:22   ` Mukesh Rathor
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2014-08-05 11:16 UTC (permalink / raw)
  To: Jan Beulich, Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

On 05/08/2014 08:46, Jan Beulich wrote:
>>>> On 05.08.14 at 03:33, <mukesh.rathor@oracle.com> wrote:
>> Hi,
>>
>> On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
>> can't be handled via svm_vmexit_do_cr_access and are emulated thru
>> handle_mmio() which is a problem for pvh because of:
>>
>> handle_mmio():
>> ..
>>     ASSERT(!is_pvh_vcpu(curr));
>>
>> AMD CR intercepts in svm.c:
>>     case VMEXIT_CR0_READ ... VMEXIT_CR15_READ:
>>     case VMEXIT_CR0_WRITE ... VMEXIT_CR15_WRITE:
>>         if ( cpu_has_svm_decode && (vmcb->exitinfo1 & (1ULL << 63)) )
>>             svm_vmexit_do_cr_access(vmcb, regs);
>>         else if ( !handle_mmio() )     <==========
>>             hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>     break;
>>
>> Soooo, this leaves no choice but to make the ASSERT conditional
>> for intel only, and let handle_mmio go thru x86_emulate and let
>> x86_emulate fail for anything other than lmsw/clts? I was thinking
>> something like:
>>
>> x86_emulate()
>>   int fail_pvh_emul = 1;
>>   ...
>>   case lmsw/clts:
>>      .....
>>      fail_pvh_emul = 0;
>>
>> then
>>  done:
>>      if (fail_pvh_emul)
>>          rc = X86EMUL_UNHANDLEABLE;
>>      return rc;
> I'd strongly recommend against adding any such to x86_emulate():
> There's nothing precluding the emulator to be used for PVH guests.
>
>> Or, should I just create a new function for clts/lmsw and call it
>> directly from vmexit switch itself?
> I'd prefer this - it seems pretty ugly to me that handle_mmio()/
> x86_emulate() gets used for this purpose - but am not certain this
> will actually work out nicely for other than CLTS: All the
> instructions currently handled specially are ones with fixed
> operands, and only CLTS fits that.
>
> You'll btw have the same problem with SMSW and DRx accesses,
> string I/O instructions, as well as (on older CPUs) with moves to/from
> CRx and INVLPG.

SMSW certainly, but what is wrong with the others?  For those, the exit
info appears to give fully decoded information.

> In the case this doesn't turn out reasonable, rather than
> manipulating handle_mmio() any further, I'd suggest to investigate
> bypassing that function in favor of a more direct access to the x86
> emulator. After all you're not after any MMIO emulation here, just
> bare instructions (many of which without memory operands at all).

In the AMD System manual (vol 2), there are special cases listed for
lmsw, clts and smsw in section 15.8.1.  In each case, bit 63 of
exitinfo1 will be clear (which is checked using magic numbers in the
code).  It would appear that under these circumstances, the only
information provided is "it wasn't a mov instruction".  As a result, I
think x86_emulate() is the best way go to, rather than hacking something
up locally, although I would agree that handle_mmio() is not really
appropriate.

~Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05 11:16   ` Andrew Cooper
@ 2014-08-05 12:11     ` Jan Beulich
  2014-08-05 13:00       ` Andrew Cooper
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2014-08-05 12:11 UTC (permalink / raw)
  To: Andrew Cooper, Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
> On 05/08/2014 08:46, Jan Beulich wrote:
>> I'd prefer this - it seems pretty ugly to me that handle_mmio()/
>> x86_emulate() gets used for this purpose - but am not certain this
>> will actually work out nicely for other than CLTS: All the
>> instructions currently handled specially are ones with fixed
>> operands, and only CLTS fits that.
>>
>> You'll btw have the same problem with SMSW and DRx accesses,
>> string I/O instructions, as well as (on older CPUs) with moves to/from
>> CRx and INVLPG.
> 
> SMSW certainly, but what is wrong with the others?  For those, the exit
> info appears to give fully decoded information.

All the decoding info is conditional upon decoding assists being
available. And for string I/O handle_mmio() is being used kind of
unconditionally.

>> In the case this doesn't turn out reasonable, rather than
>> manipulating handle_mmio() any further, I'd suggest to investigate
>> bypassing that function in favor of a more direct access to the x86
>> emulator. After all you're not after any MMIO emulation here, just
>> bare instructions (many of which without memory operands at all).
> 
> In the AMD System manual (vol 2), there are special cases listed for
> lmsw, clts and smsw in section 15.8.1.  In each case, bit 63 of
> exitinfo1 will be clear (which is checked using magic numbers in the
> code).  It would appear that under these circumstances, the only
> information provided is "it wasn't a mov instruction".

Right, except that at least for CLTS (not having any operands) the
light weight decoding could still be used. I'd even think decoding the
CRx/DRx moves would be okay as they only permit register
operands. But with LMSW and SMSW having memory operands it
would indeed start to become like re-implementing x86_emulate().
Question is whether one couldn't simply forbid PVH guests to use
them, as moves to/from CR0 provide all the needed functionality.

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05 12:11     ` Jan Beulich
@ 2014-08-05 13:00       ` Andrew Cooper
  2014-08-05 13:15         ` Jan Beulich
  2014-08-05 22:30         ` Mukesh Rathor
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2014-08-05 13:00 UTC (permalink / raw)
  To: Jan Beulich, Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

On 05/08/2014 13:11, Jan Beulich wrote:
>>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
>> On 05/08/2014 08:46, Jan Beulich wrote:
>>> I'd prefer this - it seems pretty ugly to me that handle_mmio()/
>>> x86_emulate() gets used for this purpose - but am not certain this
>>> will actually work out nicely for other than CLTS: All the
>>> instructions currently handled specially are ones with fixed
>>> operands, and only CLTS fits that.
>>>
>>> You'll btw have the same problem with SMSW and DRx accesses,
>>> string I/O instructions, as well as (on older CPUs) with moves to/from
>>> CRx and INVLPG.
>> SMSW certainly, but what is wrong with the others?  For those, the exit
>> info appears to give fully decoded information.
> All the decoding info is conditional upon decoding assists being
> available. And for string I/O handle_mmio() is being used kind of
> unconditionally.

The decode assists apply strictly to mov CRx/DRx, INTn, INVLPG and
(nested) #PF.  In contrast, the IOIO intercept appears to
unconditionally fill the decode information in exitinfo1, so I believe
the current code is correct.

>
>>> In the case this doesn't turn out reasonable, rather than
>>> manipulating handle_mmio() any further, I'd suggest to investigate
>>> bypassing that function in favor of a more direct access to the x86
>>> emulator. After all you're not after any MMIO emulation here, just
>>> bare instructions (many of which without memory operands at all).
>> In the AMD System manual (vol 2), there are special cases listed for
>> lmsw, clts and smsw in section 15.8.1.  In each case, bit 63 of
>> exitinfo1 will be clear (which is checked using magic numbers in the
>> code).  It would appear that under these circumstances, the only
>> information provided is "it wasn't a mov instruction".
> Right, except that at least for CLTS (not having any operands) the
> light weight decoding could still be used. I'd even think decoding the
> CRx/DRx moves would be okay as they only permit register
> operands. But with LMSW and SMSW having memory operands it
> would indeed start to become like re-implementing x86_emulate().
> Question is whether one couldn't simply forbid PVH guests to use
> them, as moves to/from CR0 provide all the needed functionality.

I think that is a very dangerous route to take.

Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM.  Restricting the use of
certain instructions would certainly move PVH into the latter category.

Fundamentally, the end goal of PVH needs deciding ASAP, and documenting,
to help guide decisions like this.

~Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05 13:00       ` Andrew Cooper
@ 2014-08-05 13:15         ` Jan Beulich
  2014-08-05 22:30         ` Mukesh Rathor
  1 sibling, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2014-08-05 13:15 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

>>> On 05.08.14 at 15:00, <andrew.cooper3@citrix.com> wrote:
> On 05/08/2014 13:11, Jan Beulich wrote:
>>>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
>>> On 05/08/2014 08:46, Jan Beulich wrote:
>>>> I'd prefer this - it seems pretty ugly to me that handle_mmio()/
>>>> x86_emulate() gets used for this purpose - but am not certain this
>>>> will actually work out nicely for other than CLTS: All the
>>>> instructions currently handled specially are ones with fixed
>>>> operands, and only CLTS fits that.
>>>>
>>>> You'll btw have the same problem with SMSW and DRx accesses,
>>>> string I/O instructions, as well as (on older CPUs) with moves to/from
>>>> CRx and INVLPG.
>>> SMSW certainly, but what is wrong with the others?  For those, the exit
>>> info appears to give fully decoded information.
>> All the decoding info is conditional upon decoding assists being
>> available. And for string I/O handle_mmio() is being used kind of
>> unconditionally.
> 
> The decode assists apply strictly to mov CRx/DRx, INTn, INVLPG and
> (nested) #PF.  In contrast, the IOIO intercept appears to
> unconditionally fill the decode information in exitinfo1, so I believe
> the current code is correct.

And I didn't say anything to the contrary. All I said was that there is
another use of handle_mmio() lurking.

>>>> In the case this doesn't turn out reasonable, rather than
>>>> manipulating handle_mmio() any further, I'd suggest to investigate
>>>> bypassing that function in favor of a more direct access to the x86
>>>> emulator. After all you're not after any MMIO emulation here, just
>>>> bare instructions (many of which without memory operands at all).
>>> In the AMD System manual (vol 2), there are special cases listed for
>>> lmsw, clts and smsw in section 15.8.1.  In each case, bit 63 of
>>> exitinfo1 will be clear (which is checked using magic numbers in the
>>> code).  It would appear that under these circumstances, the only
>>> information provided is "it wasn't a mov instruction".
>> Right, except that at least for CLTS (not having any operands) the
>> light weight decoding could still be used. I'd even think decoding the
>> CRx/DRx moves would be okay as they only permit register
>> operands. But with LMSW and SMSW having memory operands it
>> would indeed start to become like re-implementing x86_emulate().
>> Question is whether one couldn't simply forbid PVH guests to use
>> them, as moves to/from CR0 provide all the needed functionality.
> 
> I think that is a very dangerous route to take.
> 
> Despite the current limitations, I firmly believe that PVH should be HVM
> - device model, rather than PV + VMX/SVM.  Restricting the use of
> certain instructions would certainly move PVH into the latter category.

Generally I would agree. For those two specific instructions though,
I don't think any reasonably modern OS would be using them
anyway outside of 16-bit boot code. After adding to this the fact
that PVH, just like PV, implies a modified OS, putting on such a
restriction doesn't seem very harmful to me.

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05  7:46 ` Jan Beulich
  2014-08-05 11:16   ` Andrew Cooper
@ 2014-08-05 22:22   ` Mukesh Rathor
  1 sibling, 0 replies; 11+ messages in thread
From: Mukesh Rathor @ 2014-08-05 22:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit

On Tue, 05 Aug 2014 08:46:20 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 05.08.14 at 03:33, <mukesh.rathor@oracle.com> wrote:
> > Hi,
> > 
> > On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
> > can't be handled via svm_vmexit_do_cr_access and are emulated thru
> > handle_mmio() which is a problem for pvh because of:
> > 
.....

> > Or, should I just create a new function for clts/lmsw and call it
> > directly from vmexit switch itself?
> 
> I'd prefer this - it seems pretty ugly to me that handle_mmio()/
> x86_emulate() gets used for this purpose - but am not certain this
> will actually work out nicely for other than CLTS: All the
> instructions currently handled specially are ones with fixed
> operands, and only CLTS fits that.
> 
> You'll btw have the same problem with SMSW and DRx accesses,
> string I/O instructions, as well as (on older CPUs) with moves to/from
> CRx and INVLPG.

I see, I'd be duplicating from x86_emulate more than just couple cases, so
I think the best thing would be to just call x86_emulate directly, which
is what should've been done in the first place for the CR intercepts. 

BTW, as regards INVLPG, I'm *requiring* svm decode for pvh.

thanks,
Mukesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05 13:00       ` Andrew Cooper
  2014-08-05 13:15         ` Jan Beulich
@ 2014-08-05 22:30         ` Mukesh Rathor
  2014-08-06  9:34           ` Andrew Cooper
  1 sibling, 1 reply; 11+ messages in thread
From: Mukesh Rathor @ 2014-08-05 22:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit, Jan Beulich

On Tue, 05 Aug 2014 14:00:25 +0100
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> On 05/08/2014 13:11, Jan Beulich wrote:
> >>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
> >> On 05/08/2014 08:46, Jan Beulich wrote:
...

> Despite the current limitations, I firmly believe that PVH should be
> HVM
> - device model, rather than PV + VMX/SVM.  

I think that might be a dangerous route to take, classifying upfront
whether it's that way or the other. Eg, if we say it's former, then
anyone adding any feature would not examine the best approach, but just
take hvm approach.

> Fundamentally, the end goal of PVH needs deciding ASAP, and
> documenting, to help guide decisions like this.

I think it's decided somewhat. Evolve to one of three approaches: PV,
HVM, or alternate, picking the easiest and fastest. IMO, at the very
least, pvh should retain "guest modified" characteristic, that would be
good for xen future imho.

Mukesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-05 22:30         ` Mukesh Rathor
@ 2014-08-06  9:34           ` Andrew Cooper
  2014-08-15 21:04             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2014-08-06  9:34 UTC (permalink / raw)
  To: Mukesh Rathor
  Cc: xen-devel, boris.ostrovsky, Aravind.Gopalakrishnan,
	suravee.suthikulpanit, Jan Beulich

On 05/08/2014 23:30, Mukesh Rathor wrote:
> On Tue, 05 Aug 2014 14:00:25 +0100
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
>> On 05/08/2014 13:11, Jan Beulich wrote:
>>>>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
>>>> On 05/08/2014 08:46, Jan Beulich wrote:
> ...
>
>> Despite the current limitations, I firmly believe that PVH should be
>> HVM
>> - device model, rather than PV + VMX/SVM.  
> I think that might be a dangerous route to take, classifying upfront
> whether it's that way or the other. Eg, if we say it's former, then
> anyone adding any feature would not examine the best approach, but just
> take hvm approach.

There are many PV-isms which already exist for HVM.  Saying "HVM -
device model" does not preclude further PVism from being introduced and
used.  It does however means that PV-aware HVM guests get equal
opportunity at these improvements.  Fundamentally, having PVH closer to
HVM than PV means fewer modifications required to turn a native kernel
into a PVH kernel, which is a *very* good thing from the point of view
of the kernel authors.

But as I said, this is only my opinion.

>
>> Fundamentally, the end goal of PVH needs deciding ASAP, and
>> documenting, to help guide decisions like this.
> I think it's decided somewhat. Evolve to one of three approaches: PV,
> HVM, or alternate, picking the easiest and fastest. IMO, at the very
> least, pvh should retain "guest modified" characteristic, that would be
> good for xen future imho.

It clearly is not decided, or even semi-certain, by virtue of having
this conversation.

There are currently many opinions (some of which certainly can't
coexist, many which can), a lot of semi-baked code with many
restrictions (and repeated breaking of PVH/PVHdom0 by making seemingly
innocent code changes elsewhere), and no concrete plan of what PVH is or
what it should be.

What needs to happen urgently is for someone to make a firm decision,
and prepare a document for /docs/specs/pvh.  A document like that is not
immutable in the future if hindsight shows otherwise, but it will
provide solid guidance as to how to proceed in matters like this.

~Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-06  9:34           ` Andrew Cooper
@ 2014-08-15 21:04             ` Konrad Rzeszutek Wilk
  2014-08-15 21:48               ` Andrew Cooper
  0 siblings, 1 reply; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-15 21:04 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: suravee.suthikulpanit, Aravind.Gopalakrishnan, Jan Beulich,
	xen-devel, boris.ostrovsky

On Wed, Aug 06, 2014 at 10:34:21AM +0100, Andrew Cooper wrote:
> On 05/08/2014 23:30, Mukesh Rathor wrote:
> > On Tue, 05 Aug 2014 14:00:25 +0100
> > Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >
> >> On 05/08/2014 13:11, Jan Beulich wrote:
> >>>>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
> >>>> On 05/08/2014 08:46, Jan Beulich wrote:
> > ...
> >
> >> Despite the current limitations, I firmly believe that PVH should be
> >> HVM
> >> - device model, rather than PV + VMX/SVM.  
> > I think that might be a dangerous route to take, classifying upfront
> > whether it's that way or the other. Eg, if we say it's former, then
> > anyone adding any feature would not examine the best approach, but just
> > take hvm approach.
> 
> There are many PV-isms which already exist for HVM.  Saying "HVM -
> device model" does not preclude further PVism from being introduced and
> used.  It does however means that PV-aware HVM guests get equal
> opportunity at these improvements.  Fundamentally, having PVH closer to
> HVM than PV means fewer modifications required to turn a native kernel
> into a PVH kernel, which is a *very* good thing from the point of view
> of the kernel authors.

Right. I would like to stress that the x86 maintainers are excited
about this as it would remove the pvops that don't have clear
semantic.
> 
> But as I said, this is only my opinion.
> 
> >
> >> Fundamentally, the end goal of PVH needs deciding ASAP, and
> >> documenting, to help guide decisions like this.
> > I think it's decided somewhat. Evolve to one of three approaches: PV,
> > HVM, or alternate, picking the easiest and fastest. IMO, at the very
> > least, pvh should retain "guest modified" characteristic, that would be
> > good for xen future imho.
> 
> It clearly is not decided, or even semi-certain, by virtue of having
> this conversation.

HA!
> 
> There are currently many opinions (some of which certainly can't
> coexist, many which can), a lot of semi-baked code with many
> restrictions (and repeated breaking of PVH/PVHdom0 by making seemingly
> innocent code changes elsewhere), and no concrete plan of what PVH is or
> what it should be.
> 
> What needs to happen urgently is for someone to make a firm decision,
> and prepare a document for /docs/specs/pvh.  A document like that is not
> immutable in the future if hindsight shows otherwise, but it will
> provide solid guidance as to how to proceed in matters like this.

That could certainly be done but I think we are all tied in fixing
code and trying to get features in Xen 4.5 before the feature
freeze gates are shut.

It should be fairly easy as most of it is 'runs like HVM' with
some HVM-ism disabled (so point to Intel SDM and AMD). And then
going through the hypercalls and seeing which are enabled.

Then there is the business of the startup which is complex, but
fortunatly there is a Wiki page to rip:
http://wiki.xenproject.org/wiki/X86_Paravirtualised_Memory_Management

Andrew, that nice template you used for the migrationv2 - where can
one find it?

> 
> ~Andrew
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFH]: AMD CR intercept for lmsw/clts
  2014-08-15 21:04             ` Konrad Rzeszutek Wilk
@ 2014-08-15 21:48               ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2014-08-15 21:48 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: suravee.suthikulpanit, Aravind.Gopalakrishnan, Jan Beulich,
	xen-devel, boris.ostrovsky

On 15/08/2014 22:04, Konrad Rzeszutek Wilk wrote:
> On Wed, Aug 06, 2014 at 10:34:21AM +0100, Andrew Cooper wrote:
>> On 05/08/2014 23:30, Mukesh Rathor wrote:
>>> On Tue, 05 Aug 2014 14:00:25 +0100
>>> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>
>>>> On 05/08/2014 13:11, Jan Beulich wrote:
>>>>>>>> On 05.08.14 at 13:16, <andrew.cooper3@citrix.com> wrote:
>>>>>> On 05/08/2014 08:46, Jan Beulich wrote:
>>> ...
>>>
>>>> Despite the current limitations, I firmly believe that PVH should be
>>>> HVM
>>>> - device model, rather than PV + VMX/SVM.  
>>> I think that might be a dangerous route to take, classifying upfront
>>> whether it's that way or the other. Eg, if we say it's former, then
>>> anyone adding any feature would not examine the best approach, but just
>>> take hvm approach.
>> There are many PV-isms which already exist for HVM.  Saying "HVM -
>> device model" does not preclude further PVism from being introduced and
>> used.  It does however means that PV-aware HVM guests get equal
>> opportunity at these improvements.  Fundamentally, having PVH closer to
>> HVM than PV means fewer modifications required to turn a native kernel
>> into a PVH kernel, which is a *very* good thing from the point of view
>> of the kernel authors.
> Right. I would like to stress that the x86 maintainers are excited
> about this as it would remove the pvops that don't have clear
> semantic.
>> But as I said, this is only my opinion.
>>
>>>> Fundamentally, the end goal of PVH needs deciding ASAP, and
>>>> documenting, to help guide decisions like this.
>>> I think it's decided somewhat. Evolve to one of three approaches: PV,
>>> HVM, or alternate, picking the easiest and fastest. IMO, at the very
>>> least, pvh should retain "guest modified" characteristic, that would be
>>> good for xen future imho.
>> It clearly is not decided, or even semi-certain, by virtue of having
>> this conversation.
> HA!
>> There are currently many opinions (some of which certainly can't
>> coexist, many which can), a lot of semi-baked code with many
>> restrictions (and repeated breaking of PVH/PVHdom0 by making seemingly
>> innocent code changes elsewhere), and no concrete plan of what PVH is or
>> what it should be.
>>
>> What needs to happen urgently is for someone to make a firm decision,
>> and prepare a document for /docs/specs/pvh.  A document like that is not
>> immutable in the future if hindsight shows otherwise, but it will
>> provide solid guidance as to how to proceed in matters like this.
> That could certainly be done but I think we are all tied in fixing
> code and trying to get features in Xen 4.5 before the feature
> freeze gates are shut.
>
> It should be fairly easy as most of it is 'runs like HVM' with
> some HVM-ism disabled (so point to Intel SDM and AMD). And then
> going through the hypercalls and seeing which are enabled.
>
> Then there is the business of the startup which is complex, but
> fortunatly there is a Wiki page to rip:
> http://wiki.xenproject.org/wiki/X86_Paravirtualised_Memory_Management
>
> Andrew, that nice template you used for the migrationv2 - where can
> one find it?

For the pdf spec? That is just completely standard pandoc.

See
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=commitdiff;h=ba4c1c9072c623ffb795310e538ea6eed81bd658
for how I expect it to be committed when migration v2 is accepted.

~Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-08-15 21:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-05  1:33 [RFH]: AMD CR intercept for lmsw/clts Mukesh Rathor
2014-08-05  7:46 ` Jan Beulich
2014-08-05 11:16   ` Andrew Cooper
2014-08-05 12:11     ` Jan Beulich
2014-08-05 13:00       ` Andrew Cooper
2014-08-05 13:15         ` Jan Beulich
2014-08-05 22:30         ` Mukesh Rathor
2014-08-06  9:34           ` Andrew Cooper
2014-08-15 21:04             ` Konrad Rzeszutek Wilk
2014-08-15 21:48               ` Andrew Cooper
2014-08-05 22:22   ` Mukesh Rathor

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.