* [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
@ 2022-10-05 22:02 Suraj Jitindar Singh
  2022-10-05 22:29 ` Jim Mattson
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Suraj Jitindar Singh @ 2022-10-05 22:02 UTC (permalink / raw)
  To: kvm
  Cc: surajjs, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable

tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
ensure a call instruction retires before a following unbalanced RET. Replace
this with a WRMSR serialising instruction which has a lower performance
penalty.

== Background ==

eIBRS (enhanced indirect branch restricted speculation) is used to prevent
predictor addresses from one privilege domain from being used for prediction
in a higher privilege domain.

== Problem ==

On processors with eIBRS protections there can be a case where upon VM exit
a guest address may be used as an RSB prediction for an unbalanced RET if a
CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
Return Stack Buffer).

A mitigation for this was introduced in:
(2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)

This mitigation [1] has a ~1% performance impact on VM exit compared to without
it [2].

== Solution ==

The WRMSR instruction can be used as a speculation barrier and a serialising
instruction. Use this on the VM exit path instead to ensure that a CALL
instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
before the prediction of a following unbalanced RET.

This mitigation [3] has a negligible performance impact.

== Testing ==

Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
counts the cycles for an exit to kernel mode.

[1] With existing mitigation:
Average: 2026 cycles
[2] With no mitigation:
Average: 2008 cycles
[3] With proposed mitigation:
Average: 2008 cycles
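
As a rough illustration of the measurement (not the kvm-unit-tests source; the
port number, loop structure and helper names below are assumptions made for
this sketch, the real test is the outl_to_kernel case in x86/vmexit.c), the
idea is an rdtsc-timed loop around a port write that forces an exit:

#include <stdint.h>

static inline uint64_t rdtsc_cycles(void)
{
	uint32_t lo, hi;

	asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

/* Average cycles for an OUT that exits to the host kernel; port 0x80 is an
 * arbitrary illustrative choice. */
static uint64_t avg_exit_cycles(unsigned long iters)
{
	uint64_t start = rdtsc_cycles();
	unsigned long i;

	for (i = 0; i < iters; i++)
		asm volatile("outl %0, %w1" : : "a"(0u), "Nd"((uint16_t)0x80));

	return (rdtsc_cycles() - start) / iters;
}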

Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Cc: stable@vger.kernel.org
---
 arch/x86/include/asm/nospec-branch.h | 7 +++----
 arch/x86/kvm/vmx/vmenter.S           | 3 +--
 arch/x86/kvm/vmx/vmx.c               | 5 +++++
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index c936ce9f0c47..e5723e024b47 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -159,10 +159,9 @@
   * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
   * monstrosity above, manually.
   */
-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
-	ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
-		__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
-		__stringify(__FILL_ONE_RETURN), \ftr2
+.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+	ALTERNATIVE "jmp .Lskip_rsb_\@", \
+		__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
 
 .Lskip_rsb_\@:
 .endm
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 6de96b943804..eb82797bd7bf 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 	 * single call to retire, before the first unbalanced RET.
          */
 
-	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
-			   X86_FEATURE_RSB_VMEXIT_LITE
+	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
 
 
 	pop %_ASM_ARG2	/* @flags */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c9b49a09e6b5..fdcd8e10c2ab 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
 	 * For legacy IBRS, the IBRS bit always needs to be written after
 	 * transitioning from a less privileged predictor mode, regardless of
 	 * whether the guest/host values differ.
+	 *
+	 * For eIBRS affected by Post Barrier RSB Predictions a serialising
+	 * instruction (wrmsr) must be executed to ensure a call instruction has
+	 * retired before the prediction of a following unbalanced ret.
 	 */
 	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
+	    cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
 	    vmx->spec_ctrl != hostval)
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
@ 2022-10-05 22:29 ` Jim Mattson
  2022-10-06  8:25   ` David Laight
  2022-10-05 23:24 ` Jim Mattson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2022-10-05 22:29 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable

On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>
> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> ensure a call instruction retires before a following unbalanced RET. Replace
> this with a WRMSR serialising instruction which has a lower performance
> penalty.

The INT3 is only on a speculative path and should not impact performance.

> == Background ==
>
> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
> predictor addresses from one privilege domain from being used for prediction
> in a higher privilege domain.
>
> == Problem ==
>
> On processors with eIBRS protections there can be a case where upon VM exit
> a guest address may be used as an RSB prediction for an unbalanced RET if a
> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
> Return Stack Buffer).
>
> A mitigation for this was introduced in:
> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>
> This mitigation [1] has a ~1% performance impact on VM exit compared to without
> it [2].
>
> == Solution ==
>
> The WRMSR instruction can be used as a speculation barrier and a serialising
> instruction. Use this on the VM exit path instead to ensure that a CALL
> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> before the prediction of a following unbalanced RET.

I don't buy this solution. According to
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/post-barrier-return-stack-buffer-predictions.html:

"Note that a WRMSR instruction (used to set IBRS, for example), could
also serve as a speculation barrier for such a sequence in place of
LFENCE."

This says only that you can replace the LFENCE with a WRMSR. It
doesn't say that you can drop the rest of the sequence.
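
For reference, the sequence in question, as the existing
X86_FEATURE_RSB_VMEXIT_LITE path emits it, is roughly the following (an
editorial sketch, not the kernel macro verbatim, and the helper name is made
up): one CALL plants a benign RSB entry, an INT3 sits at the CALL's return
address as a speculation trap, and an LFENCE ensures the CALL retires before
any unbalanced RET can execute.

static inline void pbrsb_one_entry_fill_sketch(void)
{
	asm volatile("call 1f\n\t"       /* push one safe return address onto the RSB */
		     "int3\n"            /* reached only by bad speculation */
		     "1:\n\t"
		     "add $8, %%rsp\n\t" /* rebalance the stack after the CALL */
		     "lfence"            /* barrier: the CALL retires first */
		     : : : "memory", "cc");
}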

> This mitigation [3] has a negligible performance impact.
>
> == Testing ==
>
> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
> counts the cycles for an exit to kernel mode.
>
> [1] With existing mitigation:
> Average: 2026 cycles
> [2] With no mitigation:
> Average: 2008 cycles
> [3] With proposed mitigation:
> Average: 2008 cycles

What testing did you do to see that this is an effective mitigation?
Improved timings are irrelevant if it doesn't work.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
  2022-10-05 22:29 ` Jim Mattson
@ 2022-10-05 23:24 ` Jim Mattson
  2022-10-05 23:45   ` Pawan Gupta
  2022-10-05 23:46 ` Jim Mattson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2022-10-05 23:24 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable

On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>
> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> ensure a call instruction retires before a following unbalanced RET. Replace
> this with a WRMSR serialising instruction which has a lower performance
> penalty.
>
> == Background ==
>
> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
> predictor addresses from one privilege domain from being used for prediction
> in a higher privilege domain.
>
> == Problem ==
>
> On processors with eIBRS protections there can be a case where upon VM exit
> a guest address may be used as an RSB prediction for an unbalanced RET if a
> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
> Return Stack Buffer).
>
> A mitigation for this was introduced in:
> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>
> This mitigation [1] has a ~1% performance impact on VM exit compared to without
> it [2].
>
> == Solution ==
>
> The WRMSR instruction can be used as a speculation barrier and a serialising
> instruction. Use this on the VM exit path instead to ensure that a CALL
> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> before the prediction of a following unbalanced RET.
>
> This mitigation [3] has a negligible performance impact.
>
> == Testing ==
>
> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
> counts the cycles for an exit to kernel mode.
>
> [1] With existing mitigation:
> Average: 2026 cycles
> [2] With no mitigation:
> Average: 2008 cycles
> [3] With proposed mitigation:
> Average: 2008 cycles
>
> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/nospec-branch.h | 7 +++----
>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
>  3 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index c936ce9f0c47..e5723e024b47 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -159,10 +159,9 @@
>    * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
>    * monstrosity above, manually.
>    */
> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> -       ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> -               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> -               __stringify(__FILL_ONE_RETURN), \ftr2
> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
> +       ALTERNATIVE "jmp .Lskip_rsb_\@", \
> +               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
>
>  .Lskip_rsb_\@:
>  .endm
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index 6de96b943804..eb82797bd7bf 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
>          * single call to retire, before the first unbalanced RET.
>           */
>
> -       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> -                          X86_FEATURE_RSB_VMEXIT_LITE
> +       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>
>
>         pop %_ASM_ARG2  /* @flags */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c9b49a09e6b5..fdcd8e10c2ab 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>          * For legacy IBRS, the IBRS bit always needs to be written after
>          * transitioning from a less privileged predictor mode, regardless of
>          * whether the guest/host values differ.
> +        *
> +        * For eIBRS affected by Post Barrier RSB Predictions a serialising
> +        * instruction (wrmsr) must be executed to ensure a call instruction has
> +        * retired before the prediction of a following unbalanced ret.
>          */
>         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
> +           cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
>             vmx->spec_ctrl != hostval)
>                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);

Okay. I see how this almost meets the requirements. But this WRMSR is
conditional, which means that there's a speculative path through this
code that ends up at the unbalanced RET without executing the WRMSR.

Also, for your timings of "no mitigation" and this proposed mitigation
to be the same, I assume that the guest in your timing test has a
different IA32_SPEC_CTRL value than the host, which isn't always going
to be the case in practice. How much does this WRMSR cost if the guest
and the host have the same IA32_SPEC_CTRL value?

> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 23:24 ` Jim Mattson
@ 2022-10-05 23:45   ` Pawan Gupta
  0 siblings, 0 replies; 13+ messages in thread
From: Pawan Gupta @ 2022-10-05 23:45 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Suraj Jitindar Singh, kvm, sjitindarsingh, linux-kernel, x86,
	tglx, mingo, bp, dave.hansen, seanjc, pbonzini, peterz, jpoimboe,
	daniel.sneddon, benh, stable

On Wed, Oct 05, 2022 at 04:24:54PM -0700, Jim Mattson wrote:
>On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>>
>> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
>> ensure a call instruction retires before a following unbalanced RET. Replace
>> this with a WRMSR serialising instruction which has a lower performance
>> penalty.
>>
>> == Background ==
>>
>> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
>> predictor addresses from one privilege domain from being used for prediction
>> in a higher privilege domain.
>>
>> == Problem ==
>>
>> On processors with eIBRS protections there can be a case where upon VM exit
>> a guest address may be used as an RSB prediction for an unbalanced RET if a
>> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
>> Return Stack Buffer).
>>
>> A mitigation for this was introduced in:
>> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>>
>> This mitigation [1] has a ~1% performance impact on VM exit compared to without
>> it [2].
>>
>> == Solution ==
>>
>> The WRMSR instruction can be used as a speculation barrier and a serialising
>> instruction. Use this on the VM exit path instead to ensure that a CALL
>> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
>> before the prediction of a following unbalanced RET.
>>
>> This mitigation [3] has a negligible performance impact.
>>
>> == Testing ==
>>
>> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
>> counts the cycles for an exit to kernel mode.
>>
>> [1] With existing mitigation:
>> Average: 2026 cycles
>> [2] With no mitigation:
>> Average: 2008 cycles
>> [3] With proposed mitigation:
>> Average: 2008 cycles
>>
>> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  arch/x86/include/asm/nospec-branch.h | 7 +++----
>>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
>>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
>>  3 files changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
>> index c936ce9f0c47..e5723e024b47 100644
>> --- a/arch/x86/include/asm/nospec-branch.h
>> +++ b/arch/x86/include/asm/nospec-branch.h
>> @@ -159,10 +159,9 @@
>>    * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
>>    * monstrosity above, manually.
>>    */
>> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
>> -       ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
>> -               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
>> -               __stringify(__FILL_ONE_RETURN), \ftr2
>> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
>> +       ALTERNATIVE "jmp .Lskip_rsb_\@", \
>> +               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
>>
>>  .Lskip_rsb_\@:
>>  .endm
>> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
>> index 6de96b943804..eb82797bd7bf 100644
>> --- a/arch/x86/kvm/vmx/vmenter.S
>> +++ b/arch/x86/kvm/vmx/vmenter.S
>> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
>>          * single call to retire, before the first unbalanced RET.
>>           */
>>
>> -       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
>> -                          X86_FEATURE_RSB_VMEXIT_LITE
>> +       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>>
>>
>>         pop %_ASM_ARG2  /* @flags */
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index c9b49a09e6b5..fdcd8e10c2ab 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>>          * For legacy IBRS, the IBRS bit always needs to be written after
>>          * transitioning from a less privileged predictor mode, regardless of
>>          * whether the guest/host values differ.
>> +        *
>> +        * For eIBRS affected by Post Barrier RSB Predictions a serialising
>> +        * instruction (wrmsr) must be executed to ensure a call instruction has
>> +        * retired before the prediction of a following unbalanced ret.
>>          */
>>         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
>> +           cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
>>             vmx->spec_ctrl != hostval)
>>                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
>
>Okay. I see how this almost meets the requirements. But this WRMSR is
>conditional, which means that there's a speculative path through this
>code that ends up at the unbalanced RET without executing the WRMSR.

Agree. I was just about to post this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
  2022-10-05 22:29 ` Jim Mattson
  2022-10-05 23:24 ` Jim Mattson
@ 2022-10-05 23:46 ` Jim Mattson
  2022-10-06  0:26   ` Daniel Sneddon
  2022-10-06  8:18   ` Peter Zijlstra
  2022-10-06  2:42 ` Andrew Cooper
  2022-10-07  1:54 ` Pawan Gupta
  4 siblings, 2 replies; 13+ messages in thread
From: Jim Mattson @ 2022-10-05 23:46 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable

On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>
> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> ensure a call instruction retires before a following unbalanced RET. Replace
> this with a WRMSR serialising instruction which has a lower performance
> penalty.
>
> == Background ==
>
> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
> predictor addresses from one privilege domain from being used for prediction
> in a higher privilege domain.
>
> == Problem ==
>
> On processors with eIBRS protections there can be a case where upon VM exit
> a guest address may be used as an RSB prediction for an unbalanced RET if a
> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
> Return Stack Buffer).
>
> A mitigation for this was introduced in:
> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>
> This mitigation [1] has a ~1% performance impact on VM exit compared to without
> it [2].
>
> == Solution ==
>
> The WRMSR instruction can be used as a speculation barrier and a serialising
> instruction. Use this on the VM exit path instead to ensure that a CALL
> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> before the prediction of a following unbalanced RET.
>
> This mitigation [3] has a negligible performance impact.
>
> == Testing ==
>
> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
> counts the cycles for an exit to kernel mode.
>
> [1] With existing mitigation:
> Average: 2026 cycles
> [2] With no mitigation:
> Average: 2008 cycles
> [3] With proposed mitigation:
> Average: 2008 cycles
>
> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/nospec-branch.h | 7 +++----
>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
>  3 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index c936ce9f0c47..e5723e024b47 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -159,10 +159,9 @@
>    * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
>    * monstrosity above, manually.
>    */
> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> -       ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> -               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> -               __stringify(__FILL_ONE_RETURN), \ftr2
> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
> +       ALTERNATIVE "jmp .Lskip_rsb_\@", \
> +               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
>
>  .Lskip_rsb_\@:
>  .endm
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index 6de96b943804..eb82797bd7bf 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
>          * single call to retire, before the first unbalanced RET.
>           */
>
> -       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> -                          X86_FEATURE_RSB_VMEXIT_LITE
> +       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>
>
>         pop %_ASM_ARG2  /* @flags */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c9b49a09e6b5..fdcd8e10c2ab 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>          * For legacy IBRS, the IBRS bit always needs to be written after
>          * transitioning from a less privileged predictor mode, regardless of
>          * whether the guest/host values differ.
> +        *
> +        * For eIBRS affected by Post Barrier RSB Predictions a serialising
> +        * instruction (wrmsr) must be executed to ensure a call instruction has
> +        * retired before the prediction of a following unbalanced ret.
>          */
>         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
> +           cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
>             vmx->spec_ctrl != hostval)
>                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);

Better, I think, would be to leave the condition alone and put an
LFENCE on the 'else' path:

         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
             vmx->spec_ctrl != hostval)
                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
        else
                rmb();

When the guest and host have different IA32_SPEC_CTRL values, you get
the serialization from the WRMSR. Otherwise, you get it from the
cheaper LFENCE.

This is still more convoluted than having the mitigation in one place.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 23:46 ` Jim Mattson
@ 2022-10-06  0:26   ` Daniel Sneddon
  2022-10-06  1:28     ` Jim Mattson
  2022-10-06  8:18   ` Peter Zijlstra
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel Sneddon @ 2022-10-06  0:26 UTC (permalink / raw)
  To: Jim Mattson, Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe,
	pawan.kumar.gupta, benh, stable

On 10/5/22 16:46, Jim Mattson wrote:
> On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>>
>> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
>> ensure a call instruction retires before a following unbalanced RET. Replace
>> this with a WRMSR serialising instruction which has a lower performance
>> penalty.
>>
>> == Background ==
>>
>> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
>> predictor addresses from one privilege domain from being used for prediction
>> in a higher privilege domain.
>>
>> == Problem ==
>>
>> On processors with eIBRS protections there can be a case where upon VM exit
>> a guest address may be used as an RSB prediction for an unbalanced RET if a
>> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
>> Return Stack Buffer).
>>
>> A mitigation for this was introduced in:
>> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>>
>> This mitigation [1] has a ~1% performance impact on VM exit compared to without
>> it [2].
>>
>> == Solution ==
>>
>> The WRMSR instruction can be used as a speculation barrier and a serialising
>> instruction. Use this on the VM exit path instead to ensure that a CALL
>> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
>> before the prediction of a following unbalanced RET.
>>
>> This mitigation [3] has a negligible performance impact.
>>
>> == Testing ==
>>
>> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
>> counts the cycles for an exit to kernel mode.
>>
>> [1] With existing mitigation:
>> Average: 2026 cycles
>> [2] With no mitigation:
>> Average: 2008 cycles
>> [3] With proposed mitigation:
>> Average: 2008 cycles
>>
>> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  arch/x86/include/asm/nospec-branch.h | 7 +++----
>>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
>>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
>>  3 files changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
>> index c936ce9f0c47..e5723e024b47 100644
>> --- a/arch/x86/include/asm/nospec-branch.h
>> +++ b/arch/x86/include/asm/nospec-branch.h
>> @@ -159,10 +159,9 @@
>>    * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
>>    * monstrosity above, manually.
>>    */
>> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
>> -       ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
>> -               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
>> -               __stringify(__FILL_ONE_RETURN), \ftr2
>> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
>> +       ALTERNATIVE "jmp .Lskip_rsb_\@", \
>> +               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
>>
>>  .Lskip_rsb_\@:
>>  .endm
>> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
>> index 6de96b943804..eb82797bd7bf 100644
>> --- a/arch/x86/kvm/vmx/vmenter.S
>> +++ b/arch/x86/kvm/vmx/vmenter.S
>> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
>>          * single call to retire, before the first unbalanced RET.
>>           */
>>
>> -       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
>> -                          X86_FEATURE_RSB_VMEXIT_LITE
>> +       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>>
>>
>>         pop %_ASM_ARG2  /* @flags */
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index c9b49a09e6b5..fdcd8e10c2ab 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>>          * For legacy IBRS, the IBRS bit always needs to be written after
>>          * transitioning from a less privileged predictor mode, regardless of
>>          * whether the guest/host values differ.
>> +        *
>> +        * For eIBRS affected by Post Barrier RSB Predictions a serialising
>> +        * instruction (wrmsr) must be executed to ensure a call instruction has
>> +        * retired before the prediction of a following unbalanced ret.
>>          */
>>         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
>> +           cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
>>             vmx->spec_ctrl != hostval)
>>                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
> 
> Better, I think, would be to leave the condition alone and put an
> LFENCE on the 'else' path:
> 
>          if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
>              vmx->spec_ctrl != hostval)
>                  native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
>         else
>                 rmb();
> 
> When the guest and host have different IA32_SPEC_CTRL values, you get
> the serialization from the WRMSR. Otherwise, you get it from the
> cheaper LFENCE.
In this case systems that don't suffer from PBRSB (i.e. don't have
X86_FEATURE_RSB_VMEXIT_LITE set) would be doing a barrier for no reason.  We're
just trading performance on vulnerable systems for a performance hit on systems
that aren't vulnerable.
> 
> This is still more convoluted than having the mitigation in one place.
Agreed.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-06  0:26   ` Daniel Sneddon
@ 2022-10-06  1:28     ` Jim Mattson
  0 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2022-10-06  1:28 UTC (permalink / raw)
  To: Daniel Sneddon
  Cc: Suraj Jitindar Singh, kvm, sjitindarsingh, linux-kernel, x86,
	tglx, mingo, bp, dave.hansen, seanjc, pbonzini, peterz, jpoimboe,
	pawan.kumar.gupta, benh, stable

On Wed, Oct 5, 2022 at 5:26 PM Daniel Sneddon
<daniel.sneddon@linux.intel.com> wrote:
>
> On 10/5/22 16:46, Jim Mattson wrote:
> > On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
> >>
> >> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> >> ensure a call instruction retires before a following unbalanced RET. Replace
> >> this with a WRMSR serialising instruction which has a lower performance
> >> penalty.
> >>
> >> == Background ==
> >>
> >> eIBRS (enhanced indirect branch restricted speculation) is used to prevent
> >> predictor addresses from one privilege domain from being used for prediction
> >> in a higher privilege domain.
> >>
> >> == Problem ==
> >>
> >> On processors with eIBRS protections there can be a case where upon VM exit
> >> a guest address may be used as an RSB prediction for an unbalanced RET if a
> >> CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
> >> Return Stack Buffer).
> >>
> >> A mitigation for this was introduced in:
> >> (2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
> >>
> >> This mitigation [1] has a ~1% performance impact on VM exit compared to without
> >> it [2].
> >>
> >> == Solution ==
> >>
> >> The WRMSR instruction can be used as a speculation barrier and a serialising
> >> instruction. Use this on the VM exit path instead to ensure that a CALL
> >> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> >> before the prediction of a following unbalanced RET.
> >>
> >> This mitigation [3] has a negligible performance impact.
> >>
> >> == Testing ==
> >>
> >> Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
> >> counts the cycles for an exit to kernel mode.
> >>
> >> [1] With existing mitigation:
> >> Average: 2026 cycles
> >> [2] With no mitigation:
> >> Average: 2008 cycles
> >> [3] With proposed mitigation:
> >> Average: 2008 cycles
> >>
> >> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
> >> Cc: stable@vger.kernel.org
> >> ---
> >>  arch/x86/include/asm/nospec-branch.h | 7 +++----
> >>  arch/x86/kvm/vmx/vmenter.S           | 3 +--
> >>  arch/x86/kvm/vmx/vmx.c               | 5 +++++
> >>  3 files changed, 9 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> >> index c936ce9f0c47..e5723e024b47 100644
> >> --- a/arch/x86/include/asm/nospec-branch.h
> >> +++ b/arch/x86/include/asm/nospec-branch.h
> >> @@ -159,10 +159,9 @@
> >>    * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
> >>    * monstrosity above, manually.
> >>    */
> >> -.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> >> -       ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> >> -               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> >> -               __stringify(__FILL_ONE_RETURN), \ftr2
> >> +.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
> >> +       ALTERNATIVE "jmp .Lskip_rsb_\@", \
> >> +               __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr
> >>
> >>  .Lskip_rsb_\@:
> >>  .endm
> >> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> >> index 6de96b943804..eb82797bd7bf 100644
> >> --- a/arch/x86/kvm/vmx/vmenter.S
> >> +++ b/arch/x86/kvm/vmx/vmenter.S
> >> @@ -231,8 +231,7 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
> >>          * single call to retire, before the first unbalanced RET.
> >>           */
> >>
> >> -       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> >> -                          X86_FEATURE_RSB_VMEXIT_LITE
> >> +       FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
> >>
> >>
> >>         pop %_ASM_ARG2  /* @flags */
> >> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >> index c9b49a09e6b5..fdcd8e10c2ab 100644
> >> --- a/arch/x86/kvm/vmx/vmx.c
> >> +++ b/arch/x86/kvm/vmx/vmx.c
> >> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
> >>          * For legacy IBRS, the IBRS bit always needs to be written after
> >>          * transitioning from a less privileged predictor mode, regardless of
> >>          * whether the guest/host values differ.
> >> +        *
> >> +        * For eIBRS affected by Post Barrier RSB Predictions a serialising
> >> +        * instruction (wrmsr) must be executed to ensure a call instruction has
> >> +        * retired before the prediction of a following unbalanced ret.
> >>          */
> >>         if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
> >> +           cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT_LITE) ||
> >>             vmx->spec_ctrl != hostval)
> >>                 native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
> >
> > Better, I think, would be to leave the condition alone and put an
> > LFENCE on the 'else' path:
> >
> >          if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
> >              vmx->spec_ctrl != hostval)
> >                  native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
> >         else
> >                 rmb();
> >
> > When the guest and host have different IA32_SPEC_CTRL values, you get
> > the serialization from the WRMSR. Otherwise, you get it from the
> > cheaper LFENCE.
> In this case systems that don't suffer from PBRSB (i.e. don't have
> X86_FEATURE_RSB_VMEXIT_LITE set) would be doing a barrier for no reason.  We're
> just trading performance on vulnerable systems for a performance hit on systems
> that aren't vulnerable.

The lfence could be buried in an ALTERNATIVE keyed to
X86_FEATURE_RSB_VMEXIT_LITE.
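
A minimal sketch of that, assuming the existing alternative() helper (not
tested): the LFENCE is patched in only on CPUs with
X86_FEATURE_RSB_VMEXIT_LITE set and is a no-op everywhere else.

	/* no-op unless the PBRSB "lite" mitigation applies */
	alternative("", "lfence", X86_FEATURE_RSB_VMEXIT_LITE);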

> > This is still more convoluted than having the mitigation in one place.
> Agreed.

For a mere 18 cycles, it doesn't really seem worth the obfuscation.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
                   ` (2 preceding siblings ...)
  2022-10-05 23:46 ` Jim Mattson
@ 2022-10-06  2:42 ` Andrew Cooper
  2022-10-07  1:44   ` pawan.kumar.gupta
  2022-10-07  1:54 ` Pawan Gupta
  4 siblings, 1 reply; 13+ messages in thread
From: Andrew Cooper @ 2022-10-06  2:42 UTC (permalink / raw)
  To: Suraj Jitindar Singh, kvm
  Cc: sjitindarsingh, linux-kernel, x86, tglx, mingo, bp, dave.hansen,
	seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable, Andrew Cooper

On 05/10/2022 23:02, Suraj Jitindar Singh wrote:
> tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> ensure a call instruction retires before a following unbalanced RET.

No it doesn't.  The INT3 is transient.  The existing mitigation uses an
LFENCE.

> Replace this with a WRMSR serialising instruction which has a lower performance
> penalty.

What is "this"?  The INT3^W LFENCE?

WRMSR is not lower overhead than an LFENCE.

> == Solution ==
>
> The WRMSR instruction can be used as a speculation barrier and a serialising
> instruction. Use this on the VM exit path instead to ensure that a CALL
> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
> before the prediction of a following unbalanced RET.

While both of these sentences are true statements, you've missed the
necessary safety property.

One CALL has to retire before *any* RET can execute.

There are several ways the frontend can end up eventually consuming the
bad RSB entry; they all stem from an execute (not prediction) of the
next RET instruction.

As to the change, ...

> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c9b49a09e6b5..fdcd8e10c2ab 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,

... out of context above this hunk is:

    if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
        return;

meaning that there is a return instruction which is programmatically
reachable ahead of the WRMSR.

Whether it is speculatively reachable depends on whether the frontend
can see through the _static_cpu_has(), as well as
X86_FEATURE_MSR_SPEC_CTRL never becoming compile time evaluable.

There is also a second latent bug, to do with the code generation for
this_cpu_read(x86_spec_ctrl_current).


It's worth saying that, in principle, this optimisation is safe, but
pretty much all the discussion about it is wrong.  Here is one I
prepared earlier:
https://lore.kernel.org/xen-devel/20220809170016.25148-3-andrew.cooper3@citrix.com/


OTOH, below the hunk in question, there's a barrier_nospec() which is
giving you the actual protection you need (subject to latent and/or code
layout bugs), irrespective of the extra WRMSR.

~Andrew

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 23:46 ` Jim Mattson
  2022-10-06  0:26   ` Daniel Sneddon
@ 2022-10-06  8:18   ` Peter Zijlstra
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2022-10-06  8:18 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Suraj Jitindar Singh, kvm, sjitindarsingh, linux-kernel, x86,
	tglx, mingo, bp, dave.hansen, seanjc, pbonzini, jpoimboe,
	daniel.sneddon, pawan.kumar.gupta, benh, stable

On Wed, Oct 05, 2022 at 04:46:12PM -0700, Jim Mattson wrote:

>                 rmb();

Since we're not doing memory ordering but speculation hardening, this is
completely the wrong way to spell LFENCE. Please use barrier_nospec() in
those cases.
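
With that spelling, the proposed hunk would read roughly as below (a sketch
only; the surrounding code is assumed unchanged):

	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
	    vmx->spec_ctrl != hostval)
		native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
	else
		barrier_nospec();	/* LFENCE, as a speculation barrier */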

And as has already been pointed out, there's one of those just two lines
down already.

Also, what Andrew said.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:29 ` Jim Mattson
@ 2022-10-06  8:25   ` David Laight
  2022-10-06 20:27     ` pawan.kumar.gupta
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2022-10-06  8:25 UTC (permalink / raw)
  To: 'Jim Mattson', Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	pawan.kumar.gupta, benh, stable

From: Jim Mattson
> Sent: 05 October 2022 23:29
> 
> On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
> >
> > tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
> > ensure a call instruction retires before a following unbalanced RET. Replace
> > this with a WRMSR serialising instruction which has a lower performance
> > penalty.
> 
> The INT3 is only on a speculative path and should not impact performance.

Doesn't that depend on how quickly the cpu can abort the
decode and execution of the INT3 instruction?
INT3 is bound to generate a lot of uops and/or be microcoded.

Old CPUs couldn't abort FPU instructions.
IIRC the Intel performance guide even suggested not interleaving
code and data because the data might get speculatively executed
and take a long time to abort.

I actually wonder whether 'JMPS .' (eb fe) shouldn't be used
instead of INT3 (cc) because it is fast to decode and execute.
But I'm no expert here.

	David


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-06  8:25   ` David Laight
@ 2022-10-06 20:27     ` pawan.kumar.gupta
  0 siblings, 0 replies; 13+ messages in thread
From: pawan.kumar.gupta @ 2022-10-06 20:27 UTC (permalink / raw)
  To: David Laight
  Cc: 'Jim Mattson',
	Suraj Jitindar Singh, kvm, sjitindarsingh, linux-kernel, x86,
	tglx, mingo, bp, dave.hansen, seanjc, pbonzini, peterz, jpoimboe,
	daniel.sneddon, benh, stable

On Thu, Oct 06, 2022 at 08:25:15AM +0000, David Laight wrote:
>From: Jim Mattson
>> Sent: 05 October 2022 23:29
>>
>> On Wed, Oct 5, 2022 at 3:03 PM Suraj Jitindar Singh <surajjs@amazon.com> wrote:
>> >
>> > tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
>> > ensure a call instruction retires before a following unbalanced RET. Replace
>> > this with a WRMSR serialising instruction which has a lower performance
>> > penalty.
>>
>> The INT3 is only on a speculative path and should not impact performance.
>
>Doesn't that depend on how quickly the cpu can abort the
>decode and execution of the INT3 instruction?
>INT3 is bound to generate a lot of uops and/or be microcoded.
>
>Old CPUs couldn't abort FPU instructions.
>IIRC the Intel performance guide even suggested not interleaving
>code and data because the data might get speculatively executed
>and take a long time to abort.
>
>I actually wonder whether 'JMPS .' (eb fe) shouldn't be used
>instead of INT3 (cc) because it is fast to decode and execute.
>But I'm no expert here.

I have been told that INT3 is better in this case because 'JMP .' would
waste CPU resources.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-06  2:42 ` Andrew Cooper
@ 2022-10-07  1:44   ` pawan.kumar.gupta
  0 siblings, 0 replies; 13+ messages in thread
From: pawan.kumar.gupta @ 2022-10-07  1:44 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Suraj Jitindar Singh, kvm, sjitindarsingh, linux-kernel, x86,
	tglx, mingo, bp, dave.hansen, seanjc, pbonzini, peterz, jpoimboe,
	daniel.sneddon, benh, stable

On Thu, Oct 06, 2022 at 02:42:10AM +0000, Andrew Cooper wrote:
>On 05/10/2022 23:02, Suraj Jitindar Singh wrote:
>> == Solution ==
>>
>> The WRMSR instruction can be used as a speculation barrier and a serialising
>> instruction. Use this on the VM exit path instead to ensure that a CALL
>> instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
>> before the prediction of a following unbalanced RET.
>
>While both of these sentences are true statements, you've missed the
>necessary safety property.
>
>One CALL has to retire before *any* RET can execute.
>
>There are several ways the frontend can end up eventually consuming the
>bad RSB entry; they all stem from an execute (not prediction) of the
>next RET instruction.
>
>As to the change, ...
>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index c9b49a09e6b5..fdcd8e10c2ab 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7049,8 +7049,13 @@ void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>
>... out of context above this hunk is:
>
>    if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
>        return;
>
>meaning that there is a return instruction which is programmatically
>reachable ahead of the WRMSR.
>
>Whether it is speculatively reachable depends on whether the frontend
>can see through the _static_cpu_has(), as well as
>X86_FEATURE_MSR_SPEC_CTRL never becoming compile time evaluable.

In this case wouldn't _static_cpu_has() be runtime patched to a JMP
(<+8> below) or a NOP? RET (at <+13>) should not be reachable even
speculatively. What am I missing?

Dump of assembler code for function vmx_spec_ctrl_restore_host:

   arch/x86/kvm/vmx/vmx.c:
                 u64 hostval = this_cpu_read(x86_spec_ctrl_current);
   		<+0>:     mov    %gs:0x7e022e60(%rip),%r8        # 0x1ad48 <x86_spec_ctrl_current>

   ./arch/x86/include/asm/cpufeature.h:
                 asm_volatile_goto(
   		<+8>:     jmp    <vmx_spec_ctrl_restore_host+14>
   		<+10>:    nopl   (%rax)
   		<+13>:    ret

   arch/x86/kvm/vmx/vmx.c:
                 if (flags & VMX_RUN_SAVE_SPEC_CTRL)
   		<+14>:    and    $0x2,%esi
   		<+17>:    je     <vmx_spec_ctrl_restore_host+40>
   [...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR
  2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
                   ` (3 preceding siblings ...)
  2022-10-06  2:42 ` Andrew Cooper
@ 2022-10-07  1:54 ` Pawan Gupta
  4 siblings, 0 replies; 13+ messages in thread
From: Pawan Gupta @ 2022-10-07  1:54 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: kvm, sjitindarsingh, linux-kernel, x86, tglx, mingo, bp,
	dave.hansen, seanjc, pbonzini, peterz, jpoimboe, daniel.sneddon,
	benh, stable

Hi Suraj,

On Wed, Oct 05, 2022 at 03:02:27PM -0700, Suraj Jitindar Singh wrote:
>tl;dr: The existing mitigation for eIBRS PBRSB predictions uses an INT3 to
>ensure a call instruction retires before a following unbalanced RET. Replace
>this with a WRMSR serialising instruction which has a lower performance
>penalty.
>
>== Background ==
>
>eIBRS (enhanced indirect branch restricted speculation) is used to prevent
>predictor addresses from one privilege domain from being used for prediction
>in a higher privilege domain.
>
>== Problem ==
>
>On processors with eIBRS protections there can be a case where upon VM exit
>a guest address may be used as an RSB prediction for an unbalanced RET if a
>CALL instruction hasn't yet been retired. This is termed PBRSB (Post-Barrier
>Return Stack Buffer).
>
>A mitigation for this was introduced in:
>(2b1299322016731d56807aa49254a5ea3080b6b3 x86/speculation: Add RSB VM Exit protections)
>
>This mitigation [1] has a ~1% performance impact on VM exit compared to without
>it [2].
>
>== Solution ==
>
>The WRMSR instruction can be used as a speculation barrier and a serialising
>instruction. Use this on the VM exit path instead to ensure that a CALL
>instruction (in this case the call to vmx_spec_ctrl_restore_host) has retired
>before the prediction of a following unbalanced RET.
>
>This mitigation [3] has a negligible performance impact.
>
>== Testing ==
>
>Run the outl_to_kernel kvm-unit-tests test 200 times per configuration which
>counts the cycles for an exit to kernel mode.
>
>[1] With existing mitigation:
>Average: 2026 cycles
>[2] With no mitigation:
>Average: 2008 cycles

During these tests, was the value of MSR SPEC_CTRL for host and guest different?

>[3] With proposed mitigation:
>Average: 2008 cycles

^ permalink raw reply	[flat|nested] 13+ messages in thread


Thread overview: 13+ messages
2022-10-05 22:02 [PATCH] x86/speculation: Mitigate eIBRS PBRSB predictions with WRMSR Suraj Jitindar Singh
2022-10-05 22:29 ` Jim Mattson
2022-10-06  8:25   ` David Laight
2022-10-06 20:27     ` pawan.kumar.gupta
2022-10-05 23:24 ` Jim Mattson
2022-10-05 23:45   ` Pawan Gupta
2022-10-05 23:46 ` Jim Mattson
2022-10-06  0:26   ` Daniel Sneddon
2022-10-06  1:28     ` Jim Mattson
2022-10-06  8:18   ` Peter Zijlstra
2022-10-06  2:42 ` Andrew Cooper
2022-10-07  1:44   ` pawan.kumar.gupta
2022-10-07  1:54 ` Pawan Gupta
