* HVM x86 deprivileged mode: AMD SVM TR problem
@ 2015-08-19 15:04 Ben Catterall
  2015-08-19 15:43 ` Tim Deegan
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Catterall @ 2015-08-19 15:04 UTC
  To: Ian Campbell, Andrew Cooper, Jan Beulich, Tim Deegan; +Cc: xen-devel

Hi all,

I've hit a blocker on getting this working for AMD's SVM and would
appreciate any thoughts. Hopefully I've missed a much simpler way of
doing this!

So, AMD and Intel differ in how they handle the TR on a VMEXIT and
VMRUN. On a VMEXIT, Intel saves the guest's TR and then restores the
host's TR. AMD neither saves the guest's TR nor restores the host's
TR.

So, we need to context switch it ourselves. The only ways that I know
of to do this are the ltr and str instructions. Now, ltr raises #GP if
given a null selector and, when it does load, it immediately fetches
the descriptor from the current GDT.

After taking a VMEXIT and moving into deprivileged mode, I need a valid
TSS so that we can handle exceptions in ring 3. Otherwise, because an
invalid TSS selector in the TR causes a system shutdown (AMD manual),
the guest could crash the system.

At the moment, I save the guest's TR and load the host's TR, and can
then happily handle exceptions while in ring 3, so that fixes the
shutdown issue. But, when moving back to the guest, I have no easy way
to restore the TR. If the guest's TR is 0, ltr 0 raises #GP (manual)
and won't load the register, so we can't just ignore the fault and then
VMRUN. If it's not 0 and we issue an ltr, it immediately loads from the
host's GDT, not the guest's GDT, so the guest would then run with those
values.
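
The two failure cases above can be illustrated with a toy model. This is
only a sketch in Python, not hardware-accurate: `ltr`, `TaskRegister`,
`GeneralProtectionFault` and the GDT contents are hypothetical names and
values chosen for illustration, and the real instruction's descriptor
type, busy-bit and TI checks are left out.

```python
from dataclasses import dataclass


class GeneralProtectionFault(Exception):
    """Stands in for #GP in this toy model."""


@dataclass
class TaskRegister:
    selector: int = 0
    base: int = 0
    limit: int = 0


def ltr(tr, selector, active_gdt):
    """Toy ltr: fault on a null selector, otherwise fill TR from the
    descriptor found in whichever GDT is active right now."""
    if (selector & ~0x3) == 0:
        raise GeneralProtectionFault("null selector")
    index = selector >> 3
    if index not in active_gdt:
        raise GeneralProtectionFault("no descriptor at that index")
    base, limit = active_gdt[index]
    tr.selector, tr.base, tr.limit = selector, base, limit


# Hypothetical descriptors: GDT index -> (TSS base, TSS limit).
host_gdt = {5: (0x8000_1000, 0x67)}
guest_gdt = {5: (0x0020_0000, 0x67)}

tr = TaskRegister()
ltr(tr, 0x28, host_gdt)   # guest's selector, but the host GDT is active
print(hex(tr.base))       # the host's TSS base, not the guest's
```

Calling `ltr(tr, 0, host_gdt)` in this model raises the fault and leaves
`tr` untouched, which is the "guest TR is 0" case.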

Now, the ways I can see to fix this are:
We copy the guest's GDT entry for the TSS into the host's GDT at the
same selector location, issue the ltr and then fix up the host's GDT.
Given that AMD does store the guest's GDTR and CR3, I think it will be
possible to read the guest's GDT to fetch this entry (not 100% on
that), but that may be slow: the GDTR holds a linear address, so we'd
need to traverse the guest's page tables. This still doesn't solve the
problem of the guest's TR being 0. I guess we could load a fake
non-zero TR pointing at a fake GDT entry and hope that the guest then
sets up its TR correctly. I think this situation arises because the
guest causes a VMEXIT before it has set up its own TSS, so I _think_
doing this is fine. However, this would leak information to the guest,
so it could tell it's in an HVM environment - I don't know how serious
that is.
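
As a sketch of the first step of that approach, the snippet below works
out where in the guest's GDT the TSS descriptor sits, given the guest's
TR selector and GDTR. The function name and the example values are
hypothetical; the arithmetic assumes the usual x86 selector layout (bits
0-1 RPL, bit 2 TI, bits 3-15 index), 8-byte GDT slots, and a 16-byte
long-mode TSS descriptor. The result is a guest linear address, so it
would still need translating through the guest's page tables.

```python
def tss_descriptor_address(selector, gdtr_base, gdtr_limit,
                           descriptor_size=16):
    """Return the linear address of the descriptor a TR selector names.

    descriptor_size is 16 for a long-mode TSS descriptor and 8 for a
    legacy one. gdtr_limit is the offset of the last valid GDT byte.
    """
    if selector & 0x4:
        raise ValueError("TI bit set: selector refers to the LDT")
    index = selector >> 3
    if index == 0:
        raise ValueError("null selector has no descriptor")
    offset = index * 8                      # GDT slots are 8 bytes each
    if offset + descriptor_size - 1 > gdtr_limit:
        raise ValueError("descriptor lies outside the GDT limit")
    return gdtr_base + offset


# Hypothetical guest state: TR selector 0x40, GDT at 0x1000, limit 0x7f.
print(hex(tss_descriptor_address(0x40, 0x1000, 0x7F)))
```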

Or: We inject some code into the guest's domain which the VMRUN jumps
to. That code loads the guest's TR and then jumps back to where we
should have gone. The problem there is where to put that code in the
guest's address space and how to do the injection.

Hopefully there is a better fix! Do you have any ideas or suggestions?

Many thanks in advance,
Ben


* Re: HVM x86 deprivileged mode: AMD SVM TR problem
  2015-08-19 15:04 HVM x86 deprivileged mode: AMD SVM TR problem Ben Catterall
@ 2015-08-19 15:43 ` Tim Deegan
  2015-08-19 16:36   ` Ben Catterall
  0 siblings, 1 reply; 6+ messages in thread
From: Tim Deegan @ 2015-08-19 15:43 UTC
  To: Ben Catterall; +Cc: Andrew Cooper, xen-devel, Ian Campbell, Jan Beulich

At 16:04 +0100 on 19 Aug (1440000260), Ben Catterall wrote:
> At the moment, I can save the guest's TR, load the host's TR and then 
> happily handle exceptions when we are in ring 3 now so that's fixed the 
> shutdown issue. But, when moving back to the guest, I have no easy way 
> to restore the TR.

I think the CPU will load that state for you from the VMCB when
entering the guest.  (At least, if it doesn't, I don't know how VCPU
migration works at the moment.)  So only the VMEXIT path needs any
attention.

Cheers,

Tim.


* Re: HVM x86 deprivileged mode: AMD SVM TR problem
  2015-08-19 15:43 ` Tim Deegan
@ 2015-08-19 16:36   ` Ben Catterall
  2015-08-20  9:34     ` Tim Deegan
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Catterall @ 2015-08-19 16:36 UTC
  To: Tim Deegan; +Cc: Andrew Cooper, xen-devel, Ian Campbell, Jan Beulich



On 19/08/15 16:43, Tim Deegan wrote:
> I think the CPU will load that state for you from the VMCB when
> entering the guest.  (At least, if it doesn't, I don't know how VCPU
> migration works at the moment.)  So only the VMEXIT path needs any
> attention.
This pointed me in another direction, thanks!

From what I've understood, VMEXIT and VMRUN don't save/load that state
from the VMCB. Though, if that's the case, I'd also like to know how
the migration code works :).

However, AMD provides VMSAVE and VMLOAD (section 15.4.4, AMD manual
vol. 2), which DO save/load the TR (and other registers), but they're
an optional extra and Xen's entry.S for SVM doesn't use them. (I can't
find any uses via grep in the source code either.)

I guess using these removes much of the complexity: looking at what
they save, I think we would be fine to use VMSAVE and VMLOAD just when
we are doing an HVM depriv operation, rather than on every VMEXIT, and
that gets round this problem.
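
To make the split concrete, here is a small Python model of which
registers each pair of operations transfers. The register lists are
written from memory of the AMD manual and should be checked against it;
the VMRUN/#VMEXIT list is deliberately partial, and the function names
are hypothetical.

```python
# Partial list of guest state that VMRUN loads and #VMEXIT saves.
VMRUN_VMEXIT_STATE = frozenset({
    "ES", "CS", "SS", "DS", "GDTR", "IDTR", "EFER",
    "CR0", "CR2", "CR3", "CR4", "DR6", "DR7",
    "RFLAGS", "RIP", "RSP", "RAX",
})

# State that only VMSAVE/VMLOAD transfer (segment registers include
# their hidden base/limit/attribute parts).
VMSAVE_VMLOAD_STATE = frozenset({
    "FS", "GS", "TR", "LDTR", "KernelGSBase",
    "STAR", "LSTAR", "CSTAR", "SFMASK",
    "SYSENTER_CS", "SYSENTER_ESP", "SYSENTER_EIP",
})


def vmsave(cpu, vmcb):
    """Copy the VMSAVE-covered registers from the CPU into a VMCB."""
    for name in VMSAVE_VMLOAD_STATE:
        vmcb[name] = cpu[name]


def vmload(cpu, vmcb):
    """Copy the VMLOAD-covered registers from a VMCB into the CPU."""
    for name in VMSAVE_VMLOAD_STATE:
        cpu[name] = vmcb[name]


def transferred_by(register):
    """Say which mechanism moves a register in this model."""
    if register in VMRUN_VMEXIT_STATE:
        return "VMRUN/#VMEXIT"
    if register in VMSAVE_VMLOAD_STATE:
        return "VMSAVE/VMLOAD"
    return "neither (in this model)"


print(transferred_by("TR"), "|", transferred_by("GDTR"))
```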

Thanks!


* Re: HVM x86 deprivileged mode: AMD SVM TR problem
  2015-08-19 16:36   ` Ben Catterall
@ 2015-08-20  9:34     ` Tim Deegan
  2015-08-20 10:51       ` Ben Catterall
  2015-08-20 14:32       ` Andrew Cooper
  0 siblings, 2 replies; 6+ messages in thread
From: Tim Deegan @ 2015-08-20  9:34 UTC
  To: Ben Catterall; +Cc: Andrew Cooper, xen-devel, Ian Campbell, Jan Beulich

At 17:36 +0100 on 19 Aug (1440005801), Ben Catterall wrote:
> This pointed me in another direction, thanks!
> 
>  From what I've understood, the behaviour of VMEXIT and VMRUN 
> instructions don't save/load that state from the VMCB. Though, if that's 
> the case, I'd also like to know how the migration code works :).
> 
> However, AMD provides VMSAVE and VMLOAD (section 15.4.4 AMD manual 2) 
> which DO save/load the TR (and other registers)

Ah, quite right.  E.g., svm_ctxt_switch_to() uses it to load that state on
context switch.

> I guess if we use this then this alleviates much of the complexity as, 
> looking at what it saves, I think we would be fine to use VMSAVE and 
> VMLOAD just when we are doing a HVM depriv operation, and not need to 
> call them every time we took a VMEXIT and that then gets round this problem.

...this looks like a fine plan.  In fact, looking at svm.c, I think
you can just use hvm_get_segment_register()/hvm_set_segment_register(), 
which will DTRT internally.

You'll want to make sure that the depriv code can't itself set the
VCPU's TR state in the VMCB (which would be clobbered by the
hvm_set_segment_register() on return to priv mode), but AFAICS that
would be a desirable property anyway.
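
The clobbering can be shown with a very small model. This is a sketch
only: `run_depriv` is a hypothetical helper, the VMCB is a plain
dictionary, and the two commented lines merely stand in for
hvm_get_segment_register()/hvm_set_segment_register() without modelling
what those functions do internally.

```python
import copy


def run_depriv(vmcb, depriv_body):
    """Save the guest's TR, run the depriv code, then put the TR back."""
    saved_tr = copy.deepcopy(vmcb["TR"])   # stand-in for the "get" call
    depriv_body(vmcb)                      # runs with the host's TR loaded
    vmcb["TR"] = saved_tr                  # stand-in for the "set" call


def misbehaving_body(vmcb):
    # A depriv operation that (wrongly) edits the guest's TR in the VMCB.
    vmcb["TR"]["selector"] = 0


vmcb = {"TR": {"selector": 0x40, "base": 0x1000, "limit": 0x67}}
run_depriv(vmcb, misbehaving_body)
print(hex(vmcb["TR"]["selector"]))   # the edit made in depriv mode is lost
```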

Cheers,

Tim.


* Re: HVM x86 deprivileged mode: AMD SVM TR problem
  2015-08-20  9:34     ` Tim Deegan
@ 2015-08-20 10:51       ` Ben Catterall
  2015-08-20 14:32       ` Andrew Cooper
  1 sibling, 0 replies; 6+ messages in thread
From: Ben Catterall @ 2015-08-20 10:51 UTC
  To: Tim Deegan; +Cc: Andrew Cooper, xen-devel, Ian Campbell, Jan Beulich



On 20/08/15 10:34, Tim Deegan wrote:
> ...this looks like a fine plan.  In fact, looking at svm.c, I think
> you can just use hvm_get_segment_register()/hvm_set_segment_register(),
> which will DTRT internally.
>
> You'll want to make sure that the depriv code can't itself set the
> VCPU's TR state in the VMCB (which would be clobbered by the
> hvm_set_segment_register() on return to priv mode), but AFAICS that
> would be a desirable property anyway.
>
> Cheers,
>
> Tim.
>
Thanks Tim, really appreciated!

Ben


* Re: HVM x86 deprivileged mode: AMD SVM TR problem
  2015-08-20  9:34     ` Tim Deegan
  2015-08-20 10:51       ` Ben Catterall
@ 2015-08-20 14:32       ` Andrew Cooper
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Cooper @ 2015-08-20 14:32 UTC
  To: Tim Deegan, Ben Catterall; +Cc: xen-devel, Ian Campbell, Jan Beulich

On 20/08/15 10:34, Tim Deegan wrote:
> At 17:36 +0100 on 19 Aug (1440005801), Ben Catterall wrote:
>> I guess if we use this then this alleviates much of the complexity as,
>> looking at what it saves, I think we would be fine to use VMSAVE and
>> VMLOAD just when we are doing a HVM depriv operation, and not need to
>> call them every time we took a VMEXIT and that then gets round this problem.
> ...this looks like a fine plan.  In fact, looking at svm.c, I think
> you can just use hvm_get_segment_register()/hvm_set_segment_register(),
> which will DTRT internally.

(Apologies for being late to this.  I have been travelling.)

This is insufficient.  Xen must reload all host state before moving into 
depriv mode, not just TR.

Consider what would happen if the guest kernel set some crafty values in 
the sysenter MSRs, then used a vulnerability to cause depriv mode to 
execute a sysenter instruction.

Every single item covered by vmsave/vmload (off the top of my head: TR,
LDTR, ?s_base MSRs, syscall MSRs, sysenter MSRs) is unsafe to leave
guest-controlled while in depriv mode.
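
A toy comparison of the two approaches, as a sketch: the CPU state is a
dictionary, `enter_depriv` and `still_guest_controlled` are hypothetical
helpers, and the register list is the vmsave/vmload set as recalled from
the AMD manual (worth checking against it).

```python
VMLOAD_STATE = (
    "TR", "LDTR", "FS", "GS", "KernelGSBase",
    "STAR", "LSTAR", "CSTAR", "SFMASK",
    "SYSENTER_CS", "SYSENTER_ESP", "SYSENTER_EIP",
)


def enter_depriv(cpu, host_state, swap):
    """Load host values for the registers in `swap`; return the guest
    values that were displaced so they can be restored later."""
    saved = {name: cpu[name] for name in swap}
    for name in swap:
        cpu[name] = host_state[name]
    return saved


def still_guest_controlled(cpu, host_state):
    """Registers that depriv mode would run with guest-chosen values."""
    return sorted(n for n in VMLOAD_STATE if cpu[n] != host_state[n])


host = {name: ("host", name) for name in VMLOAD_STATE}

cpu = {name: ("guest", name) for name in VMLOAD_STATE}
enter_depriv(cpu, host, swap=("TR",))          # TR-only fix
print(still_guest_controlled(cpu, host))       # sysenter MSRs etc. remain

cpu = {name: ("guest", name) for name in VMLOAD_STATE}
enter_depriv(cpu, host, swap=VMLOAD_STATE)     # full vmsave/vmload set
print(still_guest_controlled(cpu, host))       # nothing left
```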

~Andrew
