* [PATCH v2] SVM: limit GIF=0 region
@ 2018-08-15 6:09 Jan Beulich
2018-08-28 23:00 ` Boris Ostrovsky
2018-08-28 23:06 ` Andrew Cooper
0 siblings, 2 replies; 6+ messages in thread
From: Jan Beulich @ 2018-08-15 6:09 UTC (permalink / raw)
To: xen-devel
Cc: Andrew Cooper, Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit
Use EFLAGS.IF for all ordinary purposes; there's in particular no need
to unduly defer NMI/#MC. Clear/set GIF solely around VMRUN itself. This
has the additional advantage that svm_stgi_label now indeed marks the
only place where GIF gets set.
A note regarding the main STI(s) placement: Quite counterintuitively the
host's EFLAGS.IF continues to have a meaning while the guest runs; see
PM Vol 2 section "Physical (INTR) Interrupt Masking in EFLAGS". Hence we
need to set the flag for the duration of time being in guest context.
However, SPEC_CTRL_ENTRY_FROM_HVM wants to be carried out with EFLAGS.IF
clear.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add CLI after VMRUN. Adjust description.
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -43,7 +43,7 @@ ENTRY(svm_asm_do_resume)
lea irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
xor %ecx,%ecx
shl $IRQSTAT_shift,%eax
- CLGI
+ cli
cmp %ecx,(%rdx,%rax,1)
jne .Lsvm_process_softirqs
@@ -57,7 +57,7 @@ UNLIKELY_START(ne, nsvm_hap)
* Someone shot down our nested p2m table; go round again
* and nsvm_vcpu_switch() will fix it for us.
*/
- STGI
+ sti
jmp .Lsvm_do_resume
__UNLIKELY_END(nsvm_hap)
@@ -87,7 +87,12 @@ __UNLIKELY_END(nsvm_hap)
pop %rsi
pop %rdi
+ CLGI
+ sti
VMRUN
+ cli
+ STGI
+GLOBAL(svm_stgi_label)
SAVE_ALL
@@ -96,13 +101,12 @@ __UNLIKELY_END(nsvm_hap)
SPEC_CTRL_ENTRY_FROM_HVM /* Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
/* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
- STGI
-GLOBAL(svm_stgi_label)
+ sti
mov %rsp,%rdi
call svm_vmexit_handler
jmp .Lsvm_do_resume
.Lsvm_process_softirqs:
- STGI
+ sti
call do_softirq
jmp .Lsvm_do_resume
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] SVM: limit GIF=0 region
2018-08-15 6:09 [PATCH v2] SVM: limit GIF=0 region Jan Beulich
@ 2018-08-28 23:00 ` Boris Ostrovsky
2018-08-28 23:06 ` Andrew Cooper
1 sibling, 0 replies; 6+ messages in thread
From: Boris Ostrovsky @ 2018-08-28 23:00 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Brian Woods, Suravee Suthikulpanit
On 08/15/2018 02:09 AM, Jan Beulich wrote:
> Use EFLAGS.IF for all ordinary purposes; there's in particular no need
> to unduly defer NMI/#MC. Clear/set GIF solely around VMRUN itself. This
> has the additional advantage that svm_stgi_label now indeed marks the
> only place where GIF gets set.
>
> A note regarding the main STI(s) placement: Quite counterintuitively the
> host's EFLAGS.IF continues to have a meaning while the guest runs; see
> PM Vol 2 section "Physical (INTR) Interrupt Masking in EFLAGS". Hence we
> need to set the flag for the duration of time being in guest context.
> However, SPEC_CTRL_ENTRY_FROM_HVM wants to be carried out with EFLAGS.IF
> clear.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] SVM: limit GIF=0 region
2018-08-15 6:09 [PATCH v2] SVM: limit GIF=0 region Jan Beulich
2018-08-28 23:00 ` Boris Ostrovsky
@ 2018-08-28 23:06 ` Andrew Cooper
2018-08-29 6:26 ` Jan Beulich
1 sibling, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2018-08-28 23:06 UTC (permalink / raw)
To: Jan Beulich, xen-devel
Cc: Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit
On 15/08/18 07:09, Jan Beulich wrote:
> Use EFLAGS.IF for all ordinary purposes; there's in particular no need
> to unduly defer NMI/#MC. Clear/set GIF solely around VMRUN itself. This
> has the additional advantage that svm_stgi_label now indeed marks the
> only place where GIF gets set.
>
> A note regarding the main STI(s) placement: Quite counterintuitively the
> host's EFLAGS.IF continues to have a meaning while the guest runs; see
> PM Vol 2 section "Physical (INTR) Interrupt Masking in EFLAGS". Hence we
> need to set the flag for the duration of time being in guest context.
> However, SPEC_CTRL_ENTRY_FROM_HVM wants to be carried out with EFLAGS.IF
> clear.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Add CLI after VMRUN. Adjust description.
>
> --- a/xen/arch/x86/hvm/svm/entry.S
> +++ b/xen/arch/x86/hvm/svm/entry.S
> @@ -43,7 +43,7 @@ ENTRY(svm_asm_do_resume)
> lea irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
> xor %ecx,%ecx
> shl $IRQSTAT_shift,%eax
> - CLGI
> + cli
> cmp %ecx,(%rdx,%rax,1)
> jne .Lsvm_process_softirqs
>
> @@ -57,7 +57,7 @@ UNLIKELY_START(ne, nsvm_hap)
> * Someone shot down our nested p2m table; go round again
> * and nsvm_vcpu_switch() will fix it for us.
> */
> - STGI
> + sti
> jmp .Lsvm_do_resume
> __UNLIKELY_END(nsvm_hap)
>
> @@ -87,7 +87,12 @@ __UNLIKELY_END(nsvm_hap)
> pop %rsi
> pop %rdi
>
> + CLGI
> + sti
> VMRUN
> + cli
> + STGI
> +GLOBAL(svm_stgi_label)
>
> SAVE_ALL
>
> @@ -96,13 +101,12 @@ __UNLIKELY_END(nsvm_hap)
> SPEC_CTRL_ENTRY_FROM_HVM /* Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
> /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>
> - STGI
> -GLOBAL(svm_stgi_label)
> + sti
Nack. As indicated in v1, moving this breaks SPEC_CTRL_ENTRY_FROM_HVM
(Even if there is an unexpected bug on the VT-x side of things which
needs fixing differently).
Furthermore, to fix LBR handling, the first thing I'd have to do is
revert this, so please leave it as it is.
~Andrew
> mov %rsp,%rdi
> call svm_vmexit_handler
> jmp .Lsvm_do_resume
>
> .Lsvm_process_softirqs:
> - STGI
> + sti
> call do_softirq
> jmp .Lsvm_do_resume
>
>
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] SVM: limit GIF=0 region
2018-08-28 23:06 ` Andrew Cooper
@ 2018-08-29 6:26 ` Jan Beulich
2018-08-29 17:55 ` Andrew Cooper
0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2018-08-29 6:26 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit
>>> On 29.08.18 at 01:06, <andrew.cooper3@citrix.com> wrote:
> On 15/08/18 07:09, Jan Beulich wrote:
>> @@ -96,13 +101,12 @@ __UNLIKELY_END(nsvm_hap)
>> SPEC_CTRL_ENTRY_FROM_HVM /* Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
>> /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>>
>> - STGI
>> -GLOBAL(svm_stgi_label)
>> + sti
>
> Nack. As indicated in v1, moving this breaks SPEC_CTRL_ENTRY_FROM_HVM
> (Even if there is an unexpected bug on the VT-x side of things which
> needs fixing differently).
Well, this increases the pressure to fix the issue. I very much
expect that fix to also take care of the code here. It was
therefore intentional that I did only the technically necessary
adjustments in v2.
I would, btw, likely have tried to find time to look into fixing
that issue, but so far I was under the impression that you are
planning to, and since you've written the code I thought you'd
likely also be in a better position to do so.
> Furthermore, to fix LBR handling, the first thing I'd have to do is
> revert this, so please leave it as it is.
Mind being a little more specific as to the whys here? From a
purely formal pov, Boris'es R-b allows me to put the change in
as is. I'm not overly happy to do the requested change, but I
certainly will, provided - once again - I understand the reason.
Furthermore, please don't forget that the more we delay
especially #MC, the more likely it becomes that a system will
crash in a far less obvious way than by logging an #MC.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] SVM: limit GIF=0 region
2018-08-29 6:26 ` Jan Beulich
@ 2018-08-29 17:55 ` Andrew Cooper
2018-08-30 6:11 ` Jan Beulich
0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2018-08-29 17:55 UTC (permalink / raw)
To: Jan Beulich
Cc: xen-devel, Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit
On 29/08/18 07:26, Jan Beulich wrote:
>>>> On 29.08.18 at 01:06, <andrew.cooper3@citrix.com> wrote:
>> On 15/08/18 07:09, Jan Beulich wrote:
>>> @@ -96,13 +101,12 @@ __UNLIKELY_END(nsvm_hap)
>>> SPEC_CTRL_ENTRY_FROM_HVM /* Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
>>> /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>>>
>>> - STGI
>>> -GLOBAL(svm_stgi_label)
>>> + sti
>> Nack. As indicated in v1, moving this breaks SPEC_CTRL_ENTRY_FROM_HVM
>> (Even if there is an unexpected bug on the VT-x side of things which
>> needs fixing differently).
> Well, this increases the pressure to fix the issue. I very much
> expect that fix to also take care of the code here. It was
> therefore intentional that I did only the technically necessary
> adjustments in v2.
>
> I would, btw, likely have tried to find time to look into fixing
> that issue, but so far I was under the impression that you are
> planning to, and since you've written the code I thought you'd
> likely also be in a better position to do so.
Its somewhere in the massive pile of still-security work that needs
doing. It is not the most urgent of the still-security work to do,
partly because this protection is currently in place. Top of the list,
and most important, is still the toolstack CPUID/MSR interactions so we
can feasibly tell a guest what it needs to know WRT L1TF.
>
>> Furthermore, to fix LBR handling, the first thing I'd have to do is
>> revert this, so please leave it as it is.
> Mind being a little more specific as to the whys here?
When vmcb->virt_ext.fields.lbr_enable is clear and DEBUGCTL.LBR is set,
the hypervisor must atomically switch the 5 LBR MSRs inside the critical
region.
Usage of DEBUGCTL.LBR is broken on pre-lbrv hardware (gen 1/2 svm?), and
when suitably (mis)configured by nested virt.
Other bugs include Xen's LBR setting leaking into guests.
> From a purely formal pov, Boris'es R-b allows me to put the change in
> as is. I'm not overly happy to do the requested change, but I
> certainly will, provided - once again - I understand the reason.
>
> Furthermore, please don't forget that the more we delay
> especially #MC, the more likely it becomes that a system will
> crash in a far less obvious way than by logging an #MC.
#MC's only get raised after the bank registers have been updated, so
even if a shutdown occurs, the data will be captured by the firmware
after reset.
As for non-deferrable #MC's, a couple of extra stack/msr/vcpu references
in the critical region doesn't alter the chances. We were never going
to survive, or won't trigger a new one.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] SVM: limit GIF=0 region
2018-08-29 17:55 ` Andrew Cooper
@ 2018-08-30 6:11 ` Jan Beulich
0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2018-08-30 6:11 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit
>>> On 29.08.18 at 19:55, <andrew.cooper3@citrix.com> wrote:
> On 29/08/18 07:26, Jan Beulich wrote:
>>>>> On 29.08.18 at 01:06, <andrew.cooper3@citrix.com> wrote:
>>> Furthermore, to fix LBR handling, the first thing I'd have to do is
>>> revert this, so please leave it as it is.
>> Mind being a little more specific as to the whys here?
>
> When vmcb->virt_ext.fields.lbr_enable is clear and DEBUGCTL.LBR is set,
> the hypervisor must atomically switch the 5 LBR MSRs inside the critical
> region.
>
> Usage of DEBUGCTL.LBR is broken on pre-lbrv hardware (gen 1/2 svm?), and
> when suitably (mis)configured by nested virt.
Okay, that's what I had concluded on the way home yesterday.
But this needs doing before the first branch, i.e. in particular
prior to SPEC_CTRL_ENTRY_FROM_HVM. Hence the STGI could
still move ahead (provided the NMI/#MC issue was addressed).
Anyway - I guess I'll make the change and record further room
for improvement in the description.
>> From a purely formal pov, Boris'es R-b allows me to put the change in
>> as is. I'm not overly happy to do the requested change, but I
>> certainly will, provided - once again - I understand the reason.
>>
>> Furthermore, please don't forget that the more we delay
>> especially #MC, the more likely it becomes that a system will
>> crash in a far less obvious way than by logging an #MC.
>
> #MC's only get raised after the bank registers have been updated, so
> even if a shutdown occurs, the data will be captured by the firmware
> after reset.
>
> As for non-deferrable #MC's, a couple of extra stack/msr/vcpu references
> in the critical region doesn't alter the chances. We were never going
> to survive, or won't trigger a new one.
The question isn't whether we survive, but whether the logged
data allows concluding what has happened (and in roughly what
context) in a halfway reasonable manner.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-08-30 6:11 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-15 6:09 [PATCH v2] SVM: limit GIF=0 region Jan Beulich
2018-08-28 23:00 ` Boris Ostrovsky
2018-08-28 23:06 ` Andrew Cooper
2018-08-29 6:26 ` Jan Beulich
2018-08-29 17:55 ` Andrew Cooper
2018-08-30 6:11 ` Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.