* [PATCH 0/2] xen/arm: Mitigate straight-line speculation
@ 2020-06-16 17:59 Julien Grall
2020-06-16 17:59 ` [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction Julien Grall
2020-06-16 17:59 ` [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call Julien Grall
0 siblings, 2 replies; 20+ messages in thread
From: Julien Grall @ 2020-06-16 17:59 UTC (permalink / raw)
To: security
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, xen-devel, Volodymyr_Babchuk
From: Julien Grall <jgrall@amazon.com>
Hi all,
Arm recently released a whitepaper about a new category of speculation.
(see [1] and [2]). In short, a processor may be able to speculate past
some of the unconditional control flow instructions (e.g eret, smc, br).
In some of the cases, the registers will contain values controlled by
the guest. While there is no known gadget afterwards, we still want to
prevent any leakage in the future.
The mitigation is planned in two parts:
1) Arm provided patches for both GCC and LLVM to add speculation barrier
and remove problematic code sequence.
2) Inspection of assembly code and call to higher level (e.g smc in our case).
I am still waiting on more input for 1), so this series only address 2)
for the moment.
Note that the ERET instruction was already addressed as part of XSA-312.
The patch series is directly sent on the mailing list as the
security team has been aware of the issues after the whitepaper was
publicly released.
Cheers,
[1] https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability
[2] https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/downloads/straight-line-speculation
Julien Grall (2):
xen/arm: entry: Place a speculation barrier following an ret
instruction
xen/arm: Mitigate straight-line speculation for SMC call
xen/arch/arm/arm32/entry.S | 1 +
xen/arch/arm/arm64/entry.S | 2 ++
xen/arch/arm/arm64/smc.S | 1 +
xen/include/asm-arm/smccc.h | 13 +++++++++++++
xen/include/asm-arm/system.h | 8 ++++++++
5 files changed, 25 insertions(+)
--
2.17.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-06-16 17:59 [PATCH 0/2] xen/arm: Mitigate straight-line speculation Julien Grall
@ 2020-06-16 17:59 ` Julien Grall
2020-06-16 21:24 ` Stefano Stabellini
2020-06-16 17:59 ` [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call Julien Grall
1 sibling, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-06-16 17:59 UTC (permalink / raw)
To: security
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, xen-devel, Volodymyr_Babchuk
From: Julien Grall <jgrall@amazon.com>
Some CPUs can speculate past a RET instruction and potentially perform
speculative accesses to memory before processing the return.
There is no known gadget available after the RET instruction today.
However some of the registers (such as in check_pending_guest_serror())
may contain a value provided the guest.
In order to harden the code, it would be better to add a speculation
barrier after each RET instruction. The performance is meant to be
negligeable as the speculation barrier is not meant to be archicturally
executed.
Note that on arm32, the ldmia instruction will act as a return from the
function __context_switch(). While the whitepaper doesn't suggest it is
possible to speculate after the instruction, add preventively a
speculation barrier after it as well.
This is part of the work to mitigate straight-line speculation.
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
I am still unsure whether we preventively should add a speculation barrier
preventively after all the RET instructions in arm*/lib/. The smc call be
taken care in a follow-up patch.
---
xen/arch/arm/arm32/entry.S | 1 +
xen/arch/arm/arm64/entry.S | 2 ++
2 files changed, 3 insertions(+)
diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S
index b228d44b190c..bd54fc43558b 100644
--- a/xen/arch/arm/arm32/entry.S
+++ b/xen/arch/arm/arm32/entry.S
@@ -442,6 +442,7 @@ ENTRY(__context_switch)
add r4, r1, #VCPU_arch_saved_context
ldmia r4, {r4 - sl, fp, sp, pc} /* Load registers and return */
+ sb
/*
* Local variables:
diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 175ea2981e72..b37d8fe09dc2 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -522,6 +522,7 @@ abort_guest_exit_end:
cset x19, ne
ret
+ sb
ENDPROC(check_pending_guest_serror)
/*
@@ -583,6 +584,7 @@ ENTRY(__context_switch)
ldr lr, [x8]
mov sp, x9
ret
+ sb
/*
* Local variables:
--
2.17.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 17:59 [PATCH 0/2] xen/arm: Mitigate straight-line speculation Julien Grall
2020-06-16 17:59 ` [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction Julien Grall
@ 2020-06-16 17:59 ` Julien Grall
2020-06-16 21:34 ` Stefano Stabellini
2020-06-18 17:46 ` Julien Grall
1 sibling, 2 replies; 20+ messages in thread
From: Julien Grall @ 2020-06-16 17:59 UTC (permalink / raw)
To: security
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, xen-devel, Volodymyr_Babchuk
From: Julien Grall <jgrall@amazon.com>
SMC call will update some of registers (typically only x0) depending on
the arguments provided.
Some CPUs can speculate past a SMC instruction and potentially perform
speculative access to emrmoy using the pre-call values before executing
the SMC.
There is no known gadget available after the SMC call today. However
some of the registers may contain values from the guest and are expected
to be updated by the SMC call.
In order to harden the code, it would be better to prevent straight-line
speculation from an SMC. Architecturally executing the speculation
barrier after every SMC can be rather expensive (particularly on core
not SB). Therefore we want to mitigate it diferrently:
* For arm_smccc_1_0_smc, we can avoid a speculation barrier right
after the SMC instruction and instead rely on the one after eret.
* For arm_smccc_1_1_smc, we can place a B instruction after the SMC
instruction to skip the barrier.
Note that arm_smccc_1_0_smc version on arm32 is just an alias to
arm_smccc_1_1_smc.
Note that no speculation barrier has been added after the SMC
instruction in arm64/entry.S. This is fine because the call is not
expected to modify any registers. So straight-line speculation doesn't
matter.
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
Note this hasn't been vetted by Arm but they are using the same
sort of mitigation for blr. So I am quite confident this could do the
trick.
---
xen/arch/arm/arm64/smc.S | 1 +
xen/include/asm-arm/smccc.h | 13 +++++++++++++
xen/include/asm-arm/system.h | 8 ++++++++
3 files changed, 22 insertions(+)
diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
index b0752be57e8f..e0a3106dd7ec 100644
--- a/xen/arch/arm/arm64/smc.S
+++ b/xen/arch/arm/arm64/smc.S
@@ -30,3 +30,4 @@ ENTRY(__arm_smccc_1_0_smc)
stp x2, x3, [x4, #SMCCC_RES_a2]
1:
ret
+ sb
diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h
index 9d94beb3df2d..8650224923b1 100644
--- a/xen/include/asm-arm/smccc.h
+++ b/xen/include/asm-arm/smccc.h
@@ -200,11 +200,24 @@ struct arm_smccc_res {
* We have an output list that is not necessarily used, and GCC feels
* entitled to optimise the whole sequence away. "volatile" is what
* makes it stick.
+ *
+ * Some of the SMC callers are passing directly values that are
+ * controlled by the guest. To mitigate against straight-line
+ * speculation, a speculation barrier is required. As it may be
+ * expensive to architecturally execute the speculation barrier, we are
+ * using a B instruction to architecturally skip it.
+ *
+ * Note that the speculation barrier is technically not necessary as the
+ * B instruction should already block straight-line speculation. But
+ * better be safe than sorry ;).
*/
#define arm_smccc_1_1_smc(...) \
do { \
__declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
asm volatile("smc #0\n" \
+ "b 1f\n" \
+ ASM_SB \
+ "1:\n" \
__constraints(__count_args(__VA_ARGS__))); \
if ( ___res ) \
*___res = (typeof(*___res)){r0, r1, r2, r3}; \
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 65d5c8e423d7..e33ff4e0fc39 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -33,6 +33,14 @@
#define smp_mb__before_atomic() smp_mb()
#define smp_mb__after_atomic() smp_mb()
+/*
+ * Speculative barrier
+ * XXX: Add support for the 'sb' instruction
+ */
+#define ASM_SB "dsb nsh \n isb \n"
+
+#define sb() asm volatile(ASM_SB)
+
/*
* This is used to ensure the compiler did actually allocate the register we
* asked it for some inline assembly sequences. Apparently we can't trust
--
2.17.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-06-16 17:59 ` [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction Julien Grall
@ 2020-06-16 21:24 ` Stefano Stabellini
2020-06-17 16:23 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Stabellini @ 2020-06-16 21:24 UTC (permalink / raw)
To: Julien Grall
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, security, xen-devel, Volodymyr_Babchuk
On Tue, 16 Jun 2020, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> Some CPUs can speculate past a RET instruction and potentially perform
> speculative accesses to memory before processing the return.
>
> There is no known gadget available after the RET instruction today.
> However some of the registers (such as in check_pending_guest_serror())
> may contain a value provided the guest.
^ by
> In order to harden the code, it would be better to add a speculation
> barrier after each RET instruction. The performance is meant to be
> negligeable as the speculation barrier is not meant to be archicturally
> executed.
>
> Note that on arm32, the ldmia instruction will act as a return from the
> function __context_switch(). While the whitepaper doesn't suggest it is
> possible to speculate after the instruction, add preventively a
> speculation barrier after it as well.
>
> This is part of the work to mitigate straight-line speculation.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
I did a compile-test on the patch too.
> ---
>
> I am still unsure whether we preventively should add a speculation barrier
> preventively after all the RET instructions in arm*/lib/. The smc call be
> taken care in a follow-up patch.
SMC is great to have but it seems to be overkill to do the ones under
lib/.
> ---
> xen/arch/arm/arm32/entry.S | 1 +
> xen/arch/arm/arm64/entry.S | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S
> index b228d44b190c..bd54fc43558b 100644
> --- a/xen/arch/arm/arm32/entry.S
> +++ b/xen/arch/arm/arm32/entry.S
> @@ -442,6 +442,7 @@ ENTRY(__context_switch)
>
> add r4, r1, #VCPU_arch_saved_context
> ldmia r4, {r4 - sl, fp, sp, pc} /* Load registers and return */
> + sb
>
> /*
> * Local variables:
> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
> index 175ea2981e72..b37d8fe09dc2 100644
> --- a/xen/arch/arm/arm64/entry.S
> +++ b/xen/arch/arm/arm64/entry.S
> @@ -522,6 +522,7 @@ abort_guest_exit_end:
> cset x19, ne
>
> ret
> + sb
> ENDPROC(check_pending_guest_serror)
>
> /*
> @@ -583,6 +584,7 @@ ENTRY(__context_switch)
> ldr lr, [x8]
> mov sp, x9
> ret
> + sb
>
> /*
> * Local variables:
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 17:59 ` [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call Julien Grall
@ 2020-06-16 21:34 ` Stefano Stabellini
2020-06-16 21:57 ` Julien Grall
2020-06-18 17:46 ` Julien Grall
1 sibling, 1 reply; 20+ messages in thread
From: Stefano Stabellini @ 2020-06-16 21:34 UTC (permalink / raw)
To: Julien Grall
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, security, xen-devel, Volodymyr_Babchuk
On Tue, 16 Jun 2020, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> SMC call will update some of registers (typically only x0) depending on
^a SMC call
> the arguments provided.
>
> Some CPUs can speculate past a SMC instruction and potentially perform
> speculative access to emrmoy using the pre-call values before executing
^ memory
> the SMC.
>
> There is no known gadget available after the SMC call today. However
> some of the registers may contain values from the guest and are expected
> to be updated by the SMC call.
>
> In order to harden the code, it would be better to prevent straight-line
> speculation from an SMC. Architecturally executing the speculation
^ a? any?
> barrier after every SMC can be rather expensive (particularly on core
> not SB). Therefore we want to mitigate it diferrently:
^ not SB capable? ^ differently
>
> * For arm_smccc_1_0_smc, we can avoid a speculation barrier right
> after the SMC instruction and instead rely on the one after eret.
^ ret
> * For arm_smccc_1_1_smc, we can place a B instruction after the SMC
> instruction to skip the barrier.
>
> Note that arm_smccc_1_0_smc version on arm32 is just an alias to
> arm_smccc_1_1_smc.
>
> Note that no speculation barrier has been added after the SMC
> instruction in arm64/entry.S. This is fine because the call is not
> expected to modify any registers. So straight-line speculation doesn't
> matter.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
I couldn't spot any issues with the patch and I compile-tested it.
> ---
>
> Note this hasn't been vetted by Arm but they are using the same
> sort of mitigation for blr. So I am quite confident this could do the
> trick.
Noted
> ---
> xen/arch/arm/arm64/smc.S | 1 +
> xen/include/asm-arm/smccc.h | 13 +++++++++++++
> xen/include/asm-arm/system.h | 8 ++++++++
> 3 files changed, 22 insertions(+)
>
> diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
> index b0752be57e8f..e0a3106dd7ec 100644
> --- a/xen/arch/arm/arm64/smc.S
> +++ b/xen/arch/arm/arm64/smc.S
> @@ -30,3 +30,4 @@ ENTRY(__arm_smccc_1_0_smc)
> stp x2, x3, [x4, #SMCCC_RES_a2]
> 1:
> ret
> + sb
> diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h
> index 9d94beb3df2d..8650224923b1 100644
> --- a/xen/include/asm-arm/smccc.h
> +++ b/xen/include/asm-arm/smccc.h
> @@ -200,11 +200,24 @@ struct arm_smccc_res {
> * We have an output list that is not necessarily used, and GCC feels
> * entitled to optimise the whole sequence away. "volatile" is what
> * makes it stick.
> + *
> + * Some of the SMC callers are passing directly values that are
> + * controlled by the guest. To mitigate against straight-line
> + * speculation, a speculation barrier is required. As it may be
> + * expensive to architecturally execute the speculation barrier, we are
> + * using a B instruction to architecturally skip it.
> + *
> + * Note that the speculation barrier is technically not necessary as the
> + * B instruction should already block straight-line speculation. But
> + * better be safe than sorry ;).
Eh eh indeed :-)
I think this would be one thing to consider removing depending on ARM's
feedback.
> */
> #define arm_smccc_1_1_smc(...) \
> do { \
> __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
> asm volatile("smc #0\n" \
> + "b 1f\n" \
> + ASM_SB \
> + "1:\n" \
> __constraints(__count_args(__VA_ARGS__))); \
> if ( ___res ) \
> *___res = (typeof(*___res)){r0, r1, r2, r3}; \
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 65d5c8e423d7..e33ff4e0fc39 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -33,6 +33,14 @@
> #define smp_mb__before_atomic() smp_mb()
> #define smp_mb__after_atomic() smp_mb()
>
> +/*
> + * Speculative barrier
> + * XXX: Add support for the 'sb' instruction
> + */
> +#define ASM_SB "dsb nsh \n isb \n"
Why not ';' ? Anyway it doesn't matter.
> +#define sb() asm volatile(ASM_SB)
> +
> /*
> * This is used to ensure the compiler did actually allocate the register we
> * asked it for some inline assembly sequences. Apparently we can't trust
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 21:34 ` Stefano Stabellini
@ 2020-06-16 21:57 ` Julien Grall
2020-06-16 23:16 ` Andrew Cooper
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-06-16 21:57 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Paul Durrant, Andre Przywara, Julien Grall, Bertrand Marquis,
Xen.org security team, xen-devel, Volodymyr Babchuk
On Tue, 16 Jun 2020 at 22:34, Stefano Stabellini <sstabellini@kernel.org> wrote:
>
> On Tue, 16 Jun 2020, Julien Grall wrote:
> > From: Julien Grall <jgrall@amazon.com>
> >
> > SMC call will update some of registers (typically only x0) depending on
> ^a SMC call
>
> > the arguments provided.
> >
> > Some CPUs can speculate past a SMC instruction and potentially perform
> > speculative access to emrmoy using the pre-call values before executing
> ^ memory
>
> > the SMC.
> >
> > There is no known gadget available after the SMC call today. However
> > some of the registers may contain values from the guest and are expected
> > to be updated by the SMC call.
> >
> > In order to harden the code, it would be better to prevent straight-line
> > speculation from an SMC. Architecturally executing the speculation
> ^ a? any?
"any" might be better.
>
>
> > barrier after every SMC can be rather expensive (particularly on core
> > not SB). Therefore we want to mitigate it diferrently:
> ^ not SB capable? ^ differently
It might be better to say "which doesn't support ARMv8.0-SB"
> > */
> > #define arm_smccc_1_1_smc(...) \
> > do { \
> > __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
> > asm volatile("smc #0\n" \
> > + "b 1f\n" \
> > + ASM_SB \
> > + "1:\n" \
> > __constraints(__count_args(__VA_ARGS__))); \
> > if ( ___res ) \
> > *___res = (typeof(*___res)){r0, r1, r2, r3}; \
> > diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> > index 65d5c8e423d7..e33ff4e0fc39 100644
> > --- a/xen/include/asm-arm/system.h
> > +++ b/xen/include/asm-arm/system.h
> > @@ -33,6 +33,14 @@
> > #define smp_mb__before_atomic() smp_mb()
> > #define smp_mb__after_atomic() smp_mb()
> >
> > +/*
> > + * Speculative barrier
> > + * XXX: Add support for the 'sb' instruction
> > + */
> > +#define ASM_SB "dsb nsh \n isb \n"
>
> Why not ';' ? Anyway it doesn't matter.
Per [1] and [2], a semicolon is not portable as some assemblers may
treat anything after it as a comment.
This reminds me that I have been using semicolons in entry.S. I
should probably have a look to avoid them.
Cheers,
[1] https://developer.arm.com/docs/dui0801/d/structure-of-assembly-language-modules/syntax-of-source-lines-in-assembly-language
[2] https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 21:57 ` Julien Grall
@ 2020-06-16 23:16 ` Andrew Cooper
2020-06-16 23:27 ` Stefano Stabellini
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2020-06-16 23:16 UTC (permalink / raw)
To: Julien Grall, Stefano Stabellini
Cc: Paul Durrant, Andre Przywara, Julien Grall, Bertrand Marquis,
Xen.org security team, xen-devel, Volodymyr Babchuk
On 16/06/2020 22:57, Julien Grall wrote:
> On Tue, 16 Jun 2020 at 22:34, Stefano Stabellini <sstabellini@kernel.org> wrote:
>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> SMC call will update some of registers (typically only x0) depending on
>> ^a SMC call
An SMC call.
>>
>>> the arguments provided.
>>>
>>> Some CPUs can speculate past a SMC instruction and potentially perform
>>> speculative access to emrmoy using the pre-call values before executing
>> ^ memory
>>
>>> the SMC.
>>>
>>> There is no known gadget available after the SMC call today. However
>>> some of the registers may contain values from the guest and are expected
>>> to be updated by the SMC call.
>>>
>>> In order to harden the code, it would be better to prevent straight-line
>>> speculation from an SMC. Architecturally executing the speculation
>> ^ a? any?
> "any" might be better.
"an SMC" is correct, but "any" is also fine.
'a' vs 'an' is based on the sound of the following. S in "S-M-C" as an
abbreviation starts with an 'e' vowel sound, unlike 's' in secure, so
the correct grammar is "an SMC" and "a secure monitor call".
~Andrew
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 23:16 ` Andrew Cooper
@ 2020-06-16 23:27 ` Stefano Stabellini
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2020-06-16 23:27 UTC (permalink / raw)
To: Andrew Cooper
Cc: Stefano Stabellini, Paul Durrant, Andre Przywara, Julien Grall,
Bertrand Marquis, Xen.org security team, xen-devel,
Volodymyr Babchuk, Julien Grall
[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]
On Wed, 17 Jun 2020, Andrew Cooper wrote:
> On 16/06/2020 22:57, Julien Grall wrote:
> > On Tue, 16 Jun 2020 at 22:34, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >> On Tue, 16 Jun 2020, Julien Grall wrote:
> >>> From: Julien Grall <jgrall@amazon.com>
> >>>
> >>> SMC call will update some of registers (typically only x0) depending on
> >> ^a SMC call
>
> An SMC call.
>
> >>
> >>> the arguments provided.
> >>>
> >>> Some CPUs can speculate past a SMC instruction and potentially perform
> >>> speculative access to emrmoy using the pre-call values before executing
> >> ^ memory
> >>
> >>> the SMC.
> >>>
> >>> There is no known gadget available after the SMC call today. However
> >>> some of the registers may contain values from the guest and are expected
> >>> to be updated by the SMC call.
> >>>
> >>> In order to harden the code, it would be better to prevent straight-line
> >>> speculation from an SMC. Architecturally executing the speculation
> >> ^ a? any?
> > "any" might be better.
>
> "an SMC" is correct, but "any" is also fine.
>
> 'a' vs 'an' is based on the sound of the following. S in "S-M-C" as an
> abbreviation starts with an 'e' vowel sound, unlike 's' in secure, so
> the correct grammar is "an SMC" and "a secure monitor call".
LOL! English sometimes... damn.
Anyway, many thanks for the correction :-)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-06-16 21:24 ` Stefano Stabellini
@ 2020-06-17 16:23 ` Julien Grall
2020-07-04 16:07 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-06-17 16:23 UTC (permalink / raw)
To: Stefano Stabellini
Cc: paul, Andre.Przywara, Julien Grall, Bertrand.Marquis, security,
xen-devel, Volodymyr_Babchuk
Hi,
On 16/06/2020 22:24, Stefano Stabellini wrote:
> On Tue, 16 Jun 2020, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Some CPUs can speculate past a RET instruction and potentially perform
>> speculative accesses to memory before processing the return.
>>
>> There is no known gadget available after the RET instruction today.
>> However some of the registers (such as in check_pending_guest_serror())
>> may contain a value provided the guest.
> ^ by
>
>
>> In order to harden the code, it would be better to add a speculation
>> barrier after each RET instruction. The performance is meant to be
>> negligeable as the speculation barrier is not meant to be archicturally
>> executed.
>>
>> Note that on arm32, the ldmia instruction will act as a return from the
>> function __context_switch(). While the whitepaper doesn't suggest it is
>> possible to speculate after the instruction, add preventively a
>> speculation barrier after it as well.
>>
>> This is part of the work to mitigate straight-line speculation.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>
> I did a compile-test on the patch too.
>
>
>> ---
>>
>> I am still unsure whether we preventively should add a speculation barrier
>> preventively after all the RET instructions in arm*/lib/. The smc call be
>> taken care in a follow-up patch.
>
> SMC is great to have but it seems to be overkill to do the ones under
> lib/.
From my understanding, the compiler will add a speculation barrier
preventively after each 'ret' when the mitigation are turned on.So it
feels to me we want to follow the same approach.
Obviously, we can avoid them but I would like to have a justification
for not adding them (nothing is overkilled against speculation ;)).
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call
2020-06-16 17:59 ` [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call Julien Grall
2020-06-16 21:34 ` Stefano Stabellini
@ 2020-06-18 17:46 ` Julien Grall
1 sibling, 0 replies; 20+ messages in thread
From: Julien Grall @ 2020-06-18 17:46 UTC (permalink / raw)
To: security
Cc: sstabellini, paul, Andre.Przywara, Julien Grall,
Bertrand.Marquis, xen-devel, Volodymyr_Babchuk
On 16/06/2020 18:59, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> SMC call will update some of registers (typically only x0) depending on
> the arguments provided.
>
> Some CPUs can speculate past a SMC instruction and potentially perform
> speculative access to emrmoy using the pre-call values before executing
> the SMC.
>
> There is no known gadget available after the SMC call today. However
> some of the registers may contain values from the guest and are expected
> to be updated by the SMC call.
>
> In order to harden the code, it would be better to prevent straight-line
> speculation from an SMC. Architecturally executing the speculation
> barrier after every SMC can be rather expensive (particularly on core
> not SB). Therefore we want to mitigate it diferrently:
>
> * For arm_smccc_1_0_smc, we can avoid a speculation barrier right
> after the SMC instruction and instead rely on the one after eret.
> * For arm_smccc_1_1_smc, we can place a B instruction after the SMC
> instruction to skip the barrier.
>
> Note that arm_smccc_1_0_smc version on arm32 is just an alias to
> arm_smccc_1_1_smc.
>
> Note that no speculation barrier has been added after the SMC
> instruction in arm64/entry.S. This is fine because the call is not
> expected to modify any registers. So straight-line speculation doesn't
> matter.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> ---
>
> Note this hasn't been vetted by Arm but they are using the same
> sort of mitigation for blr. So I am quite confident this could do the
> trick.
Actually there is some unknown on whether this may introduce issue on
other sort of speculation. As there is no known reveal gadge after the
SMC call and this is only about prevention, I will withdraw this patch
for the time being.
Patch #1 is still valid though.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-06-17 16:23 ` Julien Grall
@ 2020-07-04 16:07 ` Julien Grall
2020-08-18 16:35 ` Bertrand Marquis
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-07-04 16:07 UTC (permalink / raw)
To: Stefano Stabellini
Cc: paul, Andre.Przywara, Julien Grall, Bertrand.Marquis, security,
xen-devel, Volodymyr_Babchuk
On 17/06/2020 17:23, Julien Grall wrote:
> Hi,
>
> On 16/06/2020 22:24, Stefano Stabellini wrote:
>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Some CPUs can speculate past a RET instruction and potentially perform
>>> speculative accesses to memory before processing the return.
>>>
>>> There is no known gadget available after the RET instruction today.
>>> However some of the registers (such as in check_pending_guest_serror())
>>> may contain a value provided the guest.
>> ^ by
>>
>>
>>> In order to harden the code, it would be better to add a speculation
>>> barrier after each RET instruction. The performance is meant to be
>>> negligeable as the speculation barrier is not meant to be archicturally
>>> executed.
>>>
>>> Note that on arm32, the ldmia instruction will act as a return from the
>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>> possible to speculate after the instruction, add preventively a
>>> speculation barrier after it as well.
>>>
>>> This is part of the work to mitigate straight-line speculation.
>>>
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>
>> I did a compile-test on the patch too.
>>
>>
>>> ---
>>>
>>> I am still unsure whether we preventively should add a speculation
>>> barrier
>>> preventively after all the RET instructions in arm*/lib/. The smc
>>> call be
>>> taken care in a follow-up patch.
>>
>> SMC is great to have but it seems to be overkill to do the ones under
>> lib/.
> From my understanding, the compiler will add a speculation barrier
> preventively after each 'ret' when the mitigation are turned on.So it
> feels to me we want to follow the same approach.
>
> Obviously, we can avoid them but I would like to have a justification
> for not adding them (nothing is overkilled against speculation ;)).
I finally found some time to look at arm*/lib in more details. Some of
the helpers can definitely be called with guest inputs.
For instance, memchr() is called from hypfs_get_path_user() with the 3rd
argument controlled by the guest. In both 32-bit and 64-bit
implementation, you will reach the end of the function memchr() with
r2/w2 and r3/w3 (it contains a character from the buffer) controlled by
the guest.
As this is the only function in the unit, we don't know what will be the
instructions right after RET. So it would be safer to add a speculation
barrier there too.
Note that hypfs is currently only accessibly by Dom0. Yet, I still think
we should try to harden any code if we can :).
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-07-04 16:07 ` Julien Grall
@ 2020-08-18 16:35 ` Bertrand Marquis
2020-08-18 16:43 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Bertrand Marquis @ 2020-08-18 16:35 UTC (permalink / raw)
To: Julien Grall
Cc: Stefano Stabellini, security, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall, nd
Hi Julien,
Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>
> On 17/06/2020 17:23, Julien Grall wrote:
>> Hi,
>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>> speculative accesses to memory before processing the return.
>>>>
>>>> There is no known gadget available after the RET instruction today.
>>>> However some of the registers (such as in check_pending_guest_serror())
>>>> may contain a value provided the guest.
>>> ^ by
>>>
>>>
>>>> In order to harden the code, it would be better to add a speculation
>>>> barrier after each RET instruction. The performance is meant to be
>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>> executed.
>>>>
>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>> possible to speculate after the instruction, add preventively a
>>>> speculation barrier after it as well.
>>>>
>>>> This is part of the work to mitigate straight-line speculation.
>>>>
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>
>>> I did a compile-test on the patch too.
>>>
>>>
>>>> ---
>>>>
>>>> I am still unsure whether we preventively should add a speculation barrier
>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>> taken care in a follow-up patch.
>>>
>>> SMC is great to have but it seems to be overkill to do the ones under
>>> lib/.
>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>
> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>
> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>
> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
How about adding a speculation barrier directly in the ENDPROC macro ?
>
> Note that hypfs is currently only accessibly by Dom0. Yet, I still think we should try to harden any code if we can :).
Agree with that.
Cheers (and sorry for the delay)
Bertrand
>
> Cheers,
>
> --
> Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-18 16:35 ` Bertrand Marquis
@ 2020-08-18 16:43 ` Julien Grall
2020-08-18 17:06 ` Bertrand Marquis
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-08-18 16:43 UTC (permalink / raw)
To: Bertrand Marquis
Cc: Stefano Stabellini, security, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall, nd
On 18/08/2020 17:35, Bertrand Marquis wrote:
> Hi Julien,
Hi Bertrand,
>
> Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
Sorry this fell-through the cracks. I have a new version for patch #1,
but not yet patch #2.
I am still debating with myself where the speculation barrier should be
added after the SMC :).
>
>
>> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>>
>> On 17/06/2020 17:23, Julien Grall wrote:
>>> Hi,
>>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>
>>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>>> speculative accesses to memory before processing the return.
>>>>>
>>>>> There is no known gadget available after the RET instruction today.
>>>>> However some of the registers (such as in check_pending_guest_serror())
>>>>> may contain a value provided the guest.
>>>> ^ by
>>>>
>>>>
>>>>> In order to harden the code, it would be better to add a speculation
>>>>> barrier after each RET instruction. The performance is meant to be
>>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>>> executed.
>>>>>
>>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>>> possible to speculate after the instruction, add preventively a
>>>>> speculation barrier after it as well.
>>>>>
>>>>> This is part of the work to mitigate straight-line speculation.
>>>>>
>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>
>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>
>>>> I did a compile-test on the patch too.
>>>>
>>>>
>>>>> ---
>>>>>
>>>>> I am still unsure whether we preventively should add a speculation barrier
>>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>>> taken care in a follow-up patch.
>>>>
>>>> SMC is great to have but it seems to be overkill to do the ones under
>>>> lib/.
>>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>>
>> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>>
>> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>>
>> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
>
> How about adding a speculation barrier directly in the ENDPROC macro ?
This would unfortunately not cover all the cases because you can return
in the middle of the function. I will have a look to see if we can
leverage it.
Below the first draft for v2 of this patch.
commit 7f7cf1558283d0d35fc22ead909e8a812da82f27
Author: Julien Grall <jgrall@amazon.com>
Date: Tue Jun 16 16:33:12 2020 +0100
xen/arm: Place a speculation barrier following an ret instruction
Some CPUs can speculate past a RET instruction and potentially perform
speculative accesses to memory before processing the return.
There is no known gadget available after the RET instruction today.
However some of the registers (such as in check_pending_guest_serror())
may contain a value provided the guest.
In order to harden the code, it would be better to add a speculation
barrier after each RET instruction. The performance is meant to be
negligeable as the speculation barrier is not meant to be
architecturally
executed.
The RET in boot code (i.e arm*/entry.S) are not followed by an SB
because this is boot code. Therefore there is no concern as all the
registers are controlled by the hypervisor.
Note that on arm32, the ldmia instruction will act as a return from the
function __context_switch(). While the whitepaper doesn't suggest it is
possible to speculate after the instruction, add preventively a
speculation barrier after it as well.
This is part of the work to mitigate straight-line speculation.
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
Changes in v2:
- Add an a speculation barrier for the __arm_smccc_1_0_smc.
This
was originally part of the second patch.
- I have now included all the arm*/lib/ assembly helpers
because some of them are called with guest input value (e.g
memchr).
diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S
index b228d44b190c..bd54fc43558b 100644
--- a/xen/arch/arm/arm32/entry.S
+++ b/xen/arch/arm/arm32/entry.S
@@ -442,6 +442,7 @@ ENTRY(__context_switch)
add r4, r1, #VCPU_arch_saved_context
ldmia r4, {r4 - sl, fp, sp, pc} /* Load registers and
return */
+ sb
/*
* Local variables:
diff --git a/xen/arch/arm/arm32/lib/assembler.h
b/xen/arch/arm/arm32/lib/assembler.h
index 6de2638a36d5..cffdb60baa29 100644
--- a/xen/arch/arm/arm32/lib/assembler.h
+++ b/xen/arch/arm/arm32/lib/assembler.h
@@ -17,6 +17,8 @@
#ifndef __ASM_ASSEMBLER_H__
#define __ASM_ASSEMBLER_H__
+#include <asm/macros.h>
+
#ifndef __ASSEMBLY__
#error "Only include this from assembly code"
#endif
diff --git a/xen/arch/arm/arm32/lib/findbit.S
b/xen/arch/arm/arm32/lib/findbit.S
index 00674a31efad..df86922a25bc 100644
--- a/xen/arch/arm/arm32/lib/findbit.S
+++ b/xen/arch/arm/arm32/lib/findbit.S
@@ -35,6 +35,7 @@ ENTRY(_find_first_zero_bit_le)
blo 1b
3: mov r0, r1 @ no free bits
mov pc, lr
+ sb
ENDPROC(_find_first_zero_bit_le)
/*
@@ -76,6 +77,7 @@ ENTRY(_find_first_bit_le)
blo 1b
3: mov r0, r1 @ no free bits
mov pc, lr
+ sb
ENDPROC(_find_first_bit_le)
/*
@@ -114,6 +116,7 @@ ENTRY(_find_first_zero_bit_be)
blo 1b
3: mov r0, r1 @ no free bits
mov pc, lr
+ sb
ENDPROC(_find_first_zero_bit_be)
ENTRY(_find_next_zero_bit_be)
@@ -148,6 +151,7 @@ ENTRY(_find_first_bit_be)
blo 1b
3: mov r0, r1 @ no free bits
mov pc, lr
+ sb
ENDPROC(_find_first_bit_be)
ENTRY(_find_next_bit_be)
@@ -192,4 +196,5 @@ ENDPROC(_find_next_bit_be)
cmp r1, r0 @ Clamp to maxbit
movlo r0, r1
mov pc, lr
+ sb
diff --git a/xen/arch/arm/arm32/lib/lib1funcs.S
b/xen/arch/arm/arm32/lib/lib1funcs.S
index f1278bd6c139..ebb9b4f1babf 100644
--- a/xen/arch/arm/arm32/lib/lib1funcs.S
+++ b/xen/arch/arm/arm32/lib/lib1funcs.S
@@ -217,15 +217,18 @@ UNWIND(.fnstart)
mov r0, r2
mov pc, lr
+ sb
11: moveq r0, #1
movne r0, #0
mov pc, lr
+ sb
12: ARM_DIV2_ORDER r1, r2
mov r0, r0, lsr r2
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__udivsi3)
@@ -245,6 +248,7 @@ UNWIND(.fnstart)
ARM_MOD_BODY r0, r1, r2, r3
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__umodsi3)
@@ -271,15 +275,18 @@ UNWIND(.fnstart)
cmp ip, #0
rsbmi r0, r0, #0
mov pc, lr
+ sb
10: teq ip, r0 @ same sign ?
rsbmi r0, r0, #0
mov pc, lr
+ sb
11: movlo r0, #0
moveq r0, ip, asr #31
orreq r0, r0, #1
mov pc, lr
+ sb
12: ARM_DIV2_ORDER r1, r2
@@ -287,6 +294,7 @@ UNWIND(.fnstart)
mov r0, r3, lsr r2
rsbmi r0, r0, #0
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__divsi3)
@@ -312,6 +320,7 @@ UNWIND(.fnstart)
10: cmp ip, #0
rsbmi r0, r0, #0
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__modsi3)
@@ -328,6 +337,7 @@ UNWIND(.save {r0, r1, ip, lr} )
mul r3, r0, r2
sub r1, r1, r3
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__aeabi_uidivmod)
@@ -341,6 +351,7 @@ UNWIND(.save {r0, r1, ip, lr} )
mul r3, r0, r2
sub r1, r1, r3
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__aeabi_idivmod)
@@ -355,6 +366,7 @@ UNWIND(.save {lr} )
add sp, sp, #8
ldmfd sp!, {r2, r3}
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__aeabi_uldivmod)
@@ -369,6 +381,7 @@ UNWIND(.save {lr} )
add sp, sp, #16
ldmfd sp!, {r2, r3}
mov pc, lr
+ sb
UNWIND(.fnend)
ENDPROC(__aeabi_ldivmod)
@@ -382,5 +395,6 @@ UNWIND(.save {lr})
bl __div0
mov r0, #0 @ About as wrong as it could be.
ldr pc, [sp], #8
+ sb
UNWIND(.fnend)
ENDPROC(Ldiv0)
diff --git a/xen/arch/arm/arm32/lib/lshrdi3.S
b/xen/arch/arm/arm32/lib/lshrdi3.S
index 88b9655282ab..e4d275842c00 100644
--- a/xen/arch/arm/arm32/lib/lshrdi3.S
+++ b/xen/arch/arm/arm32/lib/lshrdi3.S
@@ -46,6 +46,7 @@ ENTRY(__aeabi_llsr)
THUMB( orrmi al, al, r3 )
mov ah, ah, lsr r2
mov pc, lr
+ sb
ENDPROC(__lshrdi3)
ENDPROC(__aeabi_llsr)
diff --git a/xen/arch/arm/arm32/lib/memchr.S
b/xen/arch/arm/arm32/lib/memchr.S
index 7dae4ab6a6f8..e2910d381b08 100644
--- a/xen/arch/arm/arm32/lib/memchr.S
+++ b/xen/arch/arm/arm32/lib/memchr.S
@@ -22,4 +22,5 @@ ENTRY(memchr)
sub r0, r0, #1
2: movne r0, #0
mov pc, lr
+ sb
ENDPROC(memchr)
diff --git a/xen/arch/arm/arm32/lib/memset.S
b/xen/arch/arm/arm32/lib/memset.S
index 5a1dadf767d7..9ce78d9e4695 100644
--- a/xen/arch/arm/arm32/lib/memset.S
+++ b/xen/arch/arm/arm32/lib/memset.S
@@ -110,6 +110,7 @@ ENTRY(memset)
tst r2, #1
strneb r1, [ip], #1
mov pc, lr
+ sb
6: subs r2, r2, #4 @ 1 do we have enough
blt 5b @ 1 bytes to align with?
diff --git a/xen/arch/arm/arm32/lib/strchr.S
b/xen/arch/arm/arm32/lib/strchr.S
index fa53ad8bc659..697b5b30e38b 100644
--- a/xen/arch/arm/arm32/lib/strchr.S
+++ b/xen/arch/arm/arm32/lib/strchr.S
@@ -25,4 +25,5 @@ ENTRY(strchr)
movne r0, #0
subeq r0, r0, #1
mov pc, lr
+ sb
ENDPROC(strchr)
diff --git a/xen/arch/arm/arm32/lib/strrchr.S
b/xen/arch/arm/arm32/lib/strrchr.S
index ec4d40de5e5a..9f64f23d5cb0 100644
--- a/xen/arch/arm/arm32/lib/strrchr.S
+++ b/xen/arch/arm/arm32/lib/strrchr.S
@@ -22,4 +22,5 @@ ENTRY(strrchr)
bne 1b
mov r0, r3
mov pc, lr
+ sb
ENDPROC(strrchr)
diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 175ea2981e72..b37d8fe09dc2 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -522,6 +522,7 @@ abort_guest_exit_end:
cset x19, ne
ret
+ sb
ENDPROC(check_pending_guest_serror)
/*
@@ -583,6 +584,7 @@ ENTRY(__context_switch)
ldr lr, [x8]
mov sp, x9
ret
+ sb
/*
* Local variables:
diff --git a/xen/arch/arm/arm64/lib/assembler.h
b/xen/arch/arm/arm64/lib/assembler.h
index 3f9c0dcf5d32..8356fd5d11fd 100644
--- a/xen/arch/arm/arm64/lib/assembler.h
+++ b/xen/arch/arm/arm64/lib/assembler.h
@@ -5,6 +5,8 @@
#error "Only include this from assembly code"
#endif
+#include <asm/macros.h>
+
/* Only LE support so far */
#define CPU_BE(x...)
#define CPU_LE(x...) x
diff --git a/xen/arch/arm/arm64/lib/clear_page.S
b/xen/arch/arm/arm64/lib/clear_page.S
index 9f8a680e1880..cf07bd00cb7e 100644
--- a/xen/arch/arm/arm64/lib/clear_page.S
+++ b/xen/arch/arm/arm64/lib/clear_page.S
@@ -14,6 +14,8 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#include <asm/macros.h>
+
/*
* Clear page @dest
*
@@ -31,4 +33,5 @@ ENTRY(clear_page)
tst x0, #(PAGE_SIZE - 1)
b.ne 1b
ret
+ sb
ENDPROC(clear_page)
diff --git a/xen/arch/arm/arm64/lib/memchr.S
b/xen/arch/arm/arm64/lib/memchr.S
index 81f113bb1cab..ff75f03d86bc 100644
--- a/xen/arch/arm/arm64/lib/memchr.S
+++ b/xen/arch/arm/arm64/lib/memchr.S
@@ -38,6 +38,8 @@ ENTRY(memchr)
b.ne 1b
sub x0, x0, #1
ret
+ sb
2: mov x0, #0
ret
+ sb
ENDPROC(memchr)
diff --git a/xen/arch/arm/arm64/lib/memcmp.S
b/xen/arch/arm/arm64/lib/memcmp.S
index 87c2537ffeb7..2583bd404249 100644
--- a/xen/arch/arm/arm64/lib/memcmp.S
+++ b/xen/arch/arm/arm64/lib/memcmp.S
@@ -160,6 +160,7 @@ CPU_LE( lsr tmp2, tmp2, tmp1 )
1:
sub result, data1, data2
ret
+ sb
.Lstart_align:
lsr limit_wd, limit, #3
@@ -236,6 +237,7 @@ CPU_LE( rev data2, data2 )
lsr data1, data1, #56
sub result, data1, data2, lsr #56
ret
+ sb
.Lremain8:
/* Limit % 8 == 0 =>. all data are equal.*/
@@ -251,7 +253,9 @@ CPU_LE( rev data2, data2 )
b.eq .Ltiny8proc
sub result, data1, data2
ret
+ sb
.Lret0:
mov result, #0
ret
+ sb
ENDPROC(memcmp)
diff --git a/xen/arch/arm/arm64/lib/memcpy.S
b/xen/arch/arm/arm64/lib/memcpy.S
index d90d20ef3ea8..f79ba8cb9515 100644
--- a/xen/arch/arm/arm64/lib/memcpy.S
+++ b/xen/arch/arm/arm64/lib/memcpy.S
@@ -142,6 +142,7 @@ ENTRY(memcpy)
.Lexitfunc:
ret
+ sb
.Lcpy_over64:
subs count, count, #128
@@ -162,6 +163,7 @@ ENTRY(memcpy)
tst count, #0x3f
b.ne .Ltail63
ret
+ sb
/*
* Critical loop. Start at a new cache line boundary. Assuming
@@ -197,4 +199,5 @@ ENTRY(memcpy)
tst count, #0x3f
b.ne .Ltail63
ret
+ sb
ENDPROC(memcpy)
diff --git a/xen/arch/arm/arm64/lib/memmove.S
b/xen/arch/arm/arm64/lib/memmove.S
index a49de845d066..dc0429583db7 100644
--- a/xen/arch/arm/arm64/lib/memmove.S
+++ b/xen/arch/arm/arm64/lib/memmove.S
@@ -138,6 +138,7 @@ ENTRY(memmove)
.Lexitfunc:
ret
+ sb
.Lcpy_over64:
subs count, count, #128
@@ -158,6 +159,7 @@ ENTRY(memmove)
tst count, #0x3f
b.ne .Ltail63
ret
+ sb
/*
* Critical loop. Start at a new cache line boundary. Assuming
@@ -193,4 +195,5 @@ ENTRY(memmove)
tst count, #0x3f
b.ne .Ltail63
ret
+ sb
ENDPROC(memmove)
diff --git a/xen/arch/arm/arm64/lib/memset.S
b/xen/arch/arm/arm64/lib/memset.S
index 5bf751521b60..e052d7c937d8 100644
--- a/xen/arch/arm/arm64/lib/memset.S
+++ b/xen/arch/arm/arm64/lib/memset.S
@@ -76,6 +76,7 @@ ENTRY(memset)
strb A_lw, [dst]
4:
ret
+ sb
.Lover16_proc:
/*Whether the start address is aligned with 16.*/
@@ -120,6 +121,7 @@ ENTRY(memset)
stp A_l, A_l, [dst, #-16] /* Repeat some/all of last store. */
4:
ret
+ sb
/*
* Critical loop. Start at a new cache line boundary. Assuming
@@ -141,6 +143,7 @@ ENTRY(memset)
b.ne .Ltail63
.Lexitfunc:
ret
+ sb
/*
* For zeroing memory, check to see if we can use the ZVA feature to
@@ -212,4 +215,5 @@ ENTRY(memset)
ands count, count, zva_bits_x
b.ne .Ltail_maybe_long
ret
+ sb
ENDPROC(memset)
diff --git a/xen/arch/arm/arm64/lib/strchr.S
b/xen/arch/arm/arm64/lib/strchr.S
index 0506b0ff7f3a..be1d2fa2ec82 100644
--- a/xen/arch/arm/arm64/lib/strchr.S
+++ b/xen/arch/arm/arm64/lib/strchr.S
@@ -17,6 +17,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#include <asm/macros.h>
/*
* Find the first occurrence of a character in a string.
@@ -37,4 +38,5 @@ ENTRY(strchr)
cmp w2, w1
csel x0, x0, xzr, eq
ret
+ sb
ENDPROC(strchr)
diff --git a/xen/arch/arm/arm64/lib/strcmp.S
b/xen/arch/arm/arm64/lib/strcmp.S
index c6f42dd2550b..08eda29302f0 100644
--- a/xen/arch/arm/arm64/lib/strcmp.S
+++ b/xen/arch/arm/arm64/lib/strcmp.S
@@ -133,6 +133,7 @@ CPU_LE( lsr tmp2, tmp2, tmp1 ) /* Shift (tmp1 & 63). */
1:
sub result, data1, data2
ret
+ sb
.Lstart_align:
ands xzr, src1, #7
@@ -205,6 +206,7 @@ CPU_BE( cmp data1, data2 )
CPU_BE( cset result, ne )
CPU_BE( cneg result, result, lo )
CPU_BE( ret )
+CPU_BE( sb )
CPU_BE( 1: )
/*Re-compute the NUL-byte detection, using a byte-reversed value. */
CPU_BE( rev tmp3, data1 )
@@ -230,4 +232,5 @@ CPU_BE( orr syndrome, diff, has_nul )
lsr data1, data1, #56
sub result, data1, data2, lsr #56
ret
+ sb
ENDPROC(strcmp)
diff --git a/xen/arch/arm/arm64/lib/strlen.S
b/xen/arch/arm/arm64/lib/strlen.S
index fb6aaf1a6afb..92df3c5e4418 100644
--- a/xen/arch/arm/arm64/lib/strlen.S
+++ b/xen/arch/arm/arm64/lib/strlen.S
@@ -106,6 +106,7 @@ CPU_BE( bic has_nul2, tmp1, tmp2 )
clz pos, has_nul2
add len, len, pos, lsr #3 /* Bits to bytes. */
ret
+ sb
.Lmisaligned:
cmp tmp1, #8
diff --git a/xen/arch/arm/arm64/lib/strncmp.S
b/xen/arch/arm/arm64/lib/strncmp.S
index a4a0f779f56e..e13ff99fe75e 100644
--- a/xen/arch/arm/arm64/lib/strncmp.S
+++ b/xen/arch/arm/arm64/lib/strncmp.S
@@ -182,6 +182,7 @@ CPU_LE( lsr tmp2, tmp2, tmp3 ) /* Shift (tmp1 & 63). */
1:
sub result, data1, data2
ret
+ sb
.Lstart_align:
lsr limit_wd, limit, #3
@@ -264,6 +265,7 @@ CPU_BE( cmp data1, data2 )
CPU_BE( cset result, ne )
CPU_BE( cneg result, result, lo )
CPU_BE( ret )
+CPU_BE( ret )
CPU_BE( 1: )
/* Re-compute the NUL-byte detection, using a byte-reversed value.*/
CPU_BE( rev tmp3, data1 )
@@ -288,6 +290,7 @@ CPU_BE( orr syndrome, diff, has_nul )
lsr data1, data1, #56
sub result, data1, data2, lsr #56
ret
+ sb
.Lremain8:
/* Limit % 8 == 0 => all bytes significant. */
@@ -303,8 +306,10 @@ CPU_BE( orr syndrome, diff, has_nul )
b.eq .Ltiny8proc
sub result, data1, data2
ret
+ sb
.Lret0:
mov result, #0
ret
+ sb
ENDPROC(strncmp)
diff --git a/xen/arch/arm/arm64/lib/strnlen.S
b/xen/arch/arm/arm64/lib/strnlen.S
index 81c8e8b54ea9..c59467bf17e1 100644
--- a/xen/arch/arm/arm64/lib/strnlen.S
+++ b/xen/arch/arm/arm64/lib/strnlen.S
@@ -126,6 +126,7 @@ CPU_BE( bic has_nul2, tmp1, tmp2 )
cmp len, limit
csel len, len, limit, ls /* Return the lower value. */
ret
+ sb
.Lmisaligned:
/*
@@ -168,4 +169,5 @@ CPU_LE( lsr tmp2, tmp2, tmp4 ) /* Shift (tmp1 & 63). */
.Lhit_limit:
mov len, limit
ret
+ sb
ENDPROC(strnlen)
diff --git a/xen/arch/arm/arm64/lib/strrchr.S
b/xen/arch/arm/arm64/lib/strrchr.S
index 07059983f828..fc8b6e446071 100644
--- a/xen/arch/arm/arm64/lib/strrchr.S
+++ b/xen/arch/arm/arm64/lib/strrchr.S
@@ -17,6 +17,8 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#include <asm/macros.h>
+
/*
* Find the last occurrence of a character in a string.
*
@@ -37,4 +39,5 @@ ENTRY(strrchr)
b 1b
2: mov x0, x3
ret
+ sb
ENDPROC(strrchr)
diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
index b0752be57e8f..e0a3106dd7ec 100644
--- a/xen/arch/arm/arm64/smc.S
+++ b/xen/arch/arm/arm64/smc.S
@@ -30,3 +30,4 @@ ENTRY(__arm_smccc_1_0_smc)
stp x2, x3, [x4, #SMCCC_RES_a2]
1:
ret
+ sb
--
Julien Grall
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-18 16:43 ` Julien Grall
@ 2020-08-18 17:06 ` Bertrand Marquis
2020-08-18 17:34 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Bertrand Marquis @ 2020-08-18 17:06 UTC (permalink / raw)
To: Julien Grall
Cc: Stefano Stabellini, security, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall, nd
> On 18 Aug 2020, at 17:43, Julien Grall <julien@xen.org> wrote:
>
>
>
> On 18/08/2020 17:35, Bertrand Marquis wrote:
>> Hi Julien,
>
> Hi Bertrand,
>
>> Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
>
> Sorry this fell-through the cracks. I have a new version for patch #1, but not yet patch #2.
No problem this came back while trying to reduce my todolist stack :-)
>
> I am still debating with myself where the speculation barrier should be added after the SMC :).
I think that we should unless the SMC is in the context switch path (as all other calls should not have a performance impact).
>
>>> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>>>
>>> On 17/06/2020 17:23, Julien Grall wrote:
>>>> Hi,
>>>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>
>>>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>>>> speculative accesses to memory before processing the return.
>>>>>>
>>>>>> There is no known gadget available after the RET instruction today.
>>>>>> However some of the registers (such as in check_pending_guest_serror())
>>>>>> may contain a value provided the guest.
>>>>> ^ by
>>>>>
>>>>>
>>>>>> In order to harden the code, it would be better to add a speculation
>>>>>> barrier after each RET instruction. The performance is meant to be
>>>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>>>> executed.
>>>>>>
>>>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>>>> possible to speculate after the instruction, add preventively a
>>>>>> speculation barrier after it as well.
>>>>>>
>>>>>> This is part of the work to mitigate straight-line speculation.
>>>>>>
>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>
>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>
>>>>> I did a compile-test on the patch too.
>>>>>
>>>>>
>>>>>> ---
>>>>>>
>>>>>> I am still unsure whether we preventively should add a speculation barrier
>>>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>>>> taken care in a follow-up patch.
>>>>>
>>>>> SMC is great to have but it seems to be overkill to do the ones under
>>>>> lib/.
>>>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>>>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>>>
>>> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>>>
>>> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>>>
>>> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
>> How about adding a speculation barrier directly in the ENDPROC macro ?
>
> This would unfortunately not cover all the cases because you can return in the middle of the function. I will have a look to see if we can leverage it.
I agree that it would not solve all of them but a big part would be solved by it.
An other solution might be to have a RETURN macro encoded as "mov pc,lr; sb" and "ret; sb”.
The patch sounds good, i just need to find a way to analyse if you missed a ret or not which is not easy with such a patch :-)
Bertrand
>
> Below the first draft for v2 of this patch.
>
> commit 7f7cf1558283d0d35fc22ead909e8a812da82f27
> Author: Julien Grall <jgrall@amazon.com>
> Date: Tue Jun 16 16:33:12 2020 +0100
>
> xen/arm: Place a speculation barrier following an ret instruction
>
> Some CPUs can speculate past a RET instruction and potentially perform
> speculative accesses to memory before processing the return.
>
> There is no known gadget available after the RET instruction today.
> However some of the registers (such as in check_pending_guest_serror())
> may contain a value provided the guest.
>
> In order to harden the code, it would be better to add a speculation
> barrier after each RET instruction. The performance is meant to be
> negligeable as the speculation barrier is not meant to be architecturally
> executed.
>
> The RET in boot code (i.e arm*/entry.S) are not followed by an SB
> because this is boot code. Therefore there is no concern as all the
> registers are controlled by the hypervisor.
>
> Note that on arm32, the ldmia instruction will act as a return from the
> function __context_switch(). While the whitepaper doesn't suggest it is
> possible to speculate after the instruction, add preventively a
> speculation barrier after it as well.
>
> This is part of the work to mitigate straight-line speculation.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> ---
>
> Changes in v2:
> - Add an a speculation barrier for the __arm_smccc_1_0_smc. This
> was originally part of the second patch.
> - I have now included all the arm*/lib/ assembly helpers
> because some of them are called with guest input value (e.g
> memchr).
>
> diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S
> index b228d44b190c..bd54fc43558b 100644
> --- a/xen/arch/arm/arm32/entry.S
> +++ b/xen/arch/arm/arm32/entry.S
> @@ -442,6 +442,7 @@ ENTRY(__context_switch)
>
> add r4, r1, #VCPU_arch_saved_context
> ldmia r4, {r4 - sl, fp, sp, pc} /* Load registers and return */
> + sb
>
> /*
> * Local variables:
> diff --git a/xen/arch/arm/arm32/lib/assembler.h b/xen/arch/arm/arm32/lib/assembler.h
> index 6de2638a36d5..cffdb60baa29 100644
> --- a/xen/arch/arm/arm32/lib/assembler.h
> +++ b/xen/arch/arm/arm32/lib/assembler.h
> @@ -17,6 +17,8 @@
> #ifndef __ASM_ASSEMBLER_H__
> #define __ASM_ASSEMBLER_H__
>
> +#include <asm/macros.h>
> +
> #ifndef __ASSEMBLY__
> #error "Only include this from assembly code"
> #endif
> diff --git a/xen/arch/arm/arm32/lib/findbit.S b/xen/arch/arm/arm32/lib/findbit.S
> index 00674a31efad..df86922a25bc 100644
> --- a/xen/arch/arm/arm32/lib/findbit.S
> +++ b/xen/arch/arm/arm32/lib/findbit.S
> @@ -35,6 +35,7 @@ ENTRY(_find_first_zero_bit_le)
> blo 1b
> 3: mov r0, r1 @ no free bits
> mov pc, lr
> + sb
> ENDPROC(_find_first_zero_bit_le)
>
> /*
> @@ -76,6 +77,7 @@ ENTRY(_find_first_bit_le)
> blo 1b
> 3: mov r0, r1 @ no free bits
> mov pc, lr
> + sb
> ENDPROC(_find_first_bit_le)
>
> /*
> @@ -114,6 +116,7 @@ ENTRY(_find_first_zero_bit_be)
> blo 1b
> 3: mov r0, r1 @ no free bits
> mov pc, lr
> + sb
> ENDPROC(_find_first_zero_bit_be)
>
> ENTRY(_find_next_zero_bit_be)
> @@ -148,6 +151,7 @@ ENTRY(_find_first_bit_be)
> blo 1b
> 3: mov r0, r1 @ no free bits
> mov pc, lr
> + sb
> ENDPROC(_find_first_bit_be)
>
> ENTRY(_find_next_bit_be)
> @@ -192,4 +196,5 @@ ENDPROC(_find_next_bit_be)
> cmp r1, r0 @ Clamp to maxbit
> movlo r0, r1
> mov pc, lr
> + sb
>
> diff --git a/xen/arch/arm/arm32/lib/lib1funcs.S b/xen/arch/arm/arm32/lib/lib1funcs.S
> index f1278bd6c139..ebb9b4f1babf 100644
> --- a/xen/arch/arm/arm32/lib/lib1funcs.S
> +++ b/xen/arch/arm/arm32/lib/lib1funcs.S
> @@ -217,15 +217,18 @@ UNWIND(.fnstart)
>
> mov r0, r2
> mov pc, lr
> + sb
>
> 11: moveq r0, #1
> movne r0, #0
> mov pc, lr
> + sb
>
> 12: ARM_DIV2_ORDER r1, r2
>
> mov r0, r0, lsr r2
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__udivsi3)
> @@ -245,6 +248,7 @@ UNWIND(.fnstart)
> ARM_MOD_BODY r0, r1, r2, r3
>
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__umodsi3)
> @@ -271,15 +275,18 @@ UNWIND(.fnstart)
> cmp ip, #0
> rsbmi r0, r0, #0
> mov pc, lr
> + sb
>
> 10: teq ip, r0 @ same sign ?
> rsbmi r0, r0, #0
> mov pc, lr
> + sb
>
> 11: movlo r0, #0
> moveq r0, ip, asr #31
> orreq r0, r0, #1
> mov pc, lr
> + sb
>
> 12: ARM_DIV2_ORDER r1, r2
>
> @@ -287,6 +294,7 @@ UNWIND(.fnstart)
> mov r0, r3, lsr r2
> rsbmi r0, r0, #0
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__divsi3)
> @@ -312,6 +320,7 @@ UNWIND(.fnstart)
> 10: cmp ip, #0
> rsbmi r0, r0, #0
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__modsi3)
> @@ -328,6 +337,7 @@ UNWIND(.save {r0, r1, ip, lr} )
> mul r3, r0, r2
> sub r1, r1, r3
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__aeabi_uidivmod)
> @@ -341,6 +351,7 @@ UNWIND(.save {r0, r1, ip, lr} )
> mul r3, r0, r2
> sub r1, r1, r3
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__aeabi_idivmod)
> @@ -355,6 +366,7 @@ UNWIND(.save {lr} )
> add sp, sp, #8
> ldmfd sp!, {r2, r3}
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__aeabi_uldivmod)
> @@ -369,6 +381,7 @@ UNWIND(.save {lr} )
> add sp, sp, #16
> ldmfd sp!, {r2, r3}
> mov pc, lr
> + sb
>
> UNWIND(.fnend)
> ENDPROC(__aeabi_ldivmod)
> @@ -382,5 +395,6 @@ UNWIND(.save {lr})
> bl __div0
> mov r0, #0 @ About as wrong as it could be.
> ldr pc, [sp], #8
> + sb
> UNWIND(.fnend)
> ENDPROC(Ldiv0)
> diff --git a/xen/arch/arm/arm32/lib/lshrdi3.S b/xen/arch/arm/arm32/lib/lshrdi3.S
> index 88b9655282ab..e4d275842c00 100644
> --- a/xen/arch/arm/arm32/lib/lshrdi3.S
> +++ b/xen/arch/arm/arm32/lib/lshrdi3.S
> @@ -46,6 +46,7 @@ ENTRY(__aeabi_llsr)
> THUMB( orrmi al, al, r3 )
> mov ah, ah, lsr r2
> mov pc, lr
> + sb
>
> ENDPROC(__lshrdi3)
> ENDPROC(__aeabi_llsr)
> diff --git a/xen/arch/arm/arm32/lib/memchr.S b/xen/arch/arm/arm32/lib/memchr.S
> index 7dae4ab6a6f8..e2910d381b08 100644
> --- a/xen/arch/arm/arm32/lib/memchr.S
> +++ b/xen/arch/arm/arm32/lib/memchr.S
> @@ -22,4 +22,5 @@ ENTRY(memchr)
> sub r0, r0, #1
> 2: movne r0, #0
> mov pc, lr
> + sb
> ENDPROC(memchr)
> diff --git a/xen/arch/arm/arm32/lib/memset.S b/xen/arch/arm/arm32/lib/memset.S
> index 5a1dadf767d7..9ce78d9e4695 100644
> --- a/xen/arch/arm/arm32/lib/memset.S
> +++ b/xen/arch/arm/arm32/lib/memset.S
> @@ -110,6 +110,7 @@ ENTRY(memset)
> tst r2, #1
> strneb r1, [ip], #1
> mov pc, lr
> + sb
>
> 6: subs r2, r2, #4 @ 1 do we have enough
> blt 5b @ 1 bytes to align with?
> diff --git a/xen/arch/arm/arm32/lib/strchr.S b/xen/arch/arm/arm32/lib/strchr.S
> index fa53ad8bc659..697b5b30e38b 100644
> --- a/xen/arch/arm/arm32/lib/strchr.S
> +++ b/xen/arch/arm/arm32/lib/strchr.S
> @@ -25,4 +25,5 @@ ENTRY(strchr)
> movne r0, #0
> subeq r0, r0, #1
> mov pc, lr
> + sb
> ENDPROC(strchr)
> diff --git a/xen/arch/arm/arm32/lib/strrchr.S b/xen/arch/arm/arm32/lib/strrchr.S
> index ec4d40de5e5a..9f64f23d5cb0 100644
> --- a/xen/arch/arm/arm32/lib/strrchr.S
> +++ b/xen/arch/arm/arm32/lib/strrchr.S
> @@ -22,4 +22,5 @@ ENTRY(strrchr)
> bne 1b
> mov r0, r3
> mov pc, lr
> + sb
> ENDPROC(strrchr)
> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
> index 175ea2981e72..b37d8fe09dc2 100644
> --- a/xen/arch/arm/arm64/entry.S
> +++ b/xen/arch/arm/arm64/entry.S
> @@ -522,6 +522,7 @@ abort_guest_exit_end:
> cset x19, ne
>
> ret
> + sb
> ENDPROC(check_pending_guest_serror)
>
> /*
> @@ -583,6 +584,7 @@ ENTRY(__context_switch)
> ldr lr, [x8]
> mov sp, x9
> ret
> + sb
>
> /*
> * Local variables:
> diff --git a/xen/arch/arm/arm64/lib/assembler.h b/xen/arch/arm/arm64/lib/assembler.h
> index 3f9c0dcf5d32..8356fd5d11fd 100644
> --- a/xen/arch/arm/arm64/lib/assembler.h
> +++ b/xen/arch/arm/arm64/lib/assembler.h
> @@ -5,6 +5,8 @@
> #error "Only include this from assembly code"
> #endif
>
> +#include <asm/macros.h>
> +
> /* Only LE support so far */
> #define CPU_BE(x...)
> #define CPU_LE(x...) x
> diff --git a/xen/arch/arm/arm64/lib/clear_page.S b/xen/arch/arm/arm64/lib/clear_page.S
> index 9f8a680e1880..cf07bd00cb7e 100644
> --- a/xen/arch/arm/arm64/lib/clear_page.S
> +++ b/xen/arch/arm/arm64/lib/clear_page.S
> @@ -14,6 +14,8 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <asm/macros.h>
> +
> /*
> * Clear page @dest
> *
> @@ -31,4 +33,5 @@ ENTRY(clear_page)
> tst x0, #(PAGE_SIZE - 1)
> b.ne 1b
> ret
> + sb
> ENDPROC(clear_page)
> diff --git a/xen/arch/arm/arm64/lib/memchr.S b/xen/arch/arm/arm64/lib/memchr.S
> index 81f113bb1cab..ff75f03d86bc 100644
> --- a/xen/arch/arm/arm64/lib/memchr.S
> +++ b/xen/arch/arm/arm64/lib/memchr.S
> @@ -38,6 +38,8 @@ ENTRY(memchr)
> b.ne 1b
> sub x0, x0, #1
> ret
> + sb
> 2: mov x0, #0
> ret
> + sb
> ENDPROC(memchr)
> diff --git a/xen/arch/arm/arm64/lib/memcmp.S b/xen/arch/arm/arm64/lib/memcmp.S
> index 87c2537ffeb7..2583bd404249 100644
> --- a/xen/arch/arm/arm64/lib/memcmp.S
> +++ b/xen/arch/arm/arm64/lib/memcmp.S
> @@ -160,6 +160,7 @@ CPU_LE( lsr tmp2, tmp2, tmp1 )
> 1:
> sub result, data1, data2
> ret
> + sb
>
> .Lstart_align:
> lsr limit_wd, limit, #3
> @@ -236,6 +237,7 @@ CPU_LE( rev data2, data2 )
> lsr data1, data1, #56
> sub result, data1, data2, lsr #56
> ret
> + sb
>
> .Lremain8:
> /* Limit % 8 == 0 =>. all data are equal.*/
> @@ -251,7 +253,9 @@ CPU_LE( rev data2, data2 )
> b.eq .Ltiny8proc
> sub result, data1, data2
> ret
> + sb
> .Lret0:
> mov result, #0
> ret
> + sb
> ENDPROC(memcmp)
> diff --git a/xen/arch/arm/arm64/lib/memcpy.S b/xen/arch/arm/arm64/lib/memcpy.S
> index d90d20ef3ea8..f79ba8cb9515 100644
> --- a/xen/arch/arm/arm64/lib/memcpy.S
> +++ b/xen/arch/arm/arm64/lib/memcpy.S
> @@ -142,6 +142,7 @@ ENTRY(memcpy)
>
> .Lexitfunc:
> ret
> + sb
>
> .Lcpy_over64:
> subs count, count, #128
> @@ -162,6 +163,7 @@ ENTRY(memcpy)
> tst count, #0x3f
> b.ne .Ltail63
> ret
> + sb
>
> /*
> * Critical loop. Start at a new cache line boundary. Assuming
> @@ -197,4 +199,5 @@ ENTRY(memcpy)
> tst count, #0x3f
> b.ne .Ltail63
> ret
> + sb
> ENDPROC(memcpy)
> diff --git a/xen/arch/arm/arm64/lib/memmove.S b/xen/arch/arm/arm64/lib/memmove.S
> index a49de845d066..dc0429583db7 100644
> --- a/xen/arch/arm/arm64/lib/memmove.S
> +++ b/xen/arch/arm/arm64/lib/memmove.S
> @@ -138,6 +138,7 @@ ENTRY(memmove)
>
> .Lexitfunc:
> ret
> + sb
>
> .Lcpy_over64:
> subs count, count, #128
> @@ -158,6 +159,7 @@ ENTRY(memmove)
> tst count, #0x3f
> b.ne .Ltail63
> ret
> + sb
>
> /*
> * Critical loop. Start at a new cache line boundary. Assuming
> @@ -193,4 +195,5 @@ ENTRY(memmove)
> tst count, #0x3f
> b.ne .Ltail63
> ret
> + sb
> ENDPROC(memmove)
> diff --git a/xen/arch/arm/arm64/lib/memset.S b/xen/arch/arm/arm64/lib/memset.S
> index 5bf751521b60..e052d7c937d8 100644
> --- a/xen/arch/arm/arm64/lib/memset.S
> +++ b/xen/arch/arm/arm64/lib/memset.S
> @@ -76,6 +76,7 @@ ENTRY(memset)
> strb A_lw, [dst]
> 4:
> ret
> + sb
>
> .Lover16_proc:
> /*Whether the start address is aligned with 16.*/
> @@ -120,6 +121,7 @@ ENTRY(memset)
> stp A_l, A_l, [dst, #-16] /* Repeat some/all of last store. */
> 4:
> ret
> + sb
>
> /*
> * Critical loop. Start at a new cache line boundary. Assuming
> @@ -141,6 +143,7 @@ ENTRY(memset)
> b.ne .Ltail63
> .Lexitfunc:
> ret
> + sb
>
> /*
> * For zeroing memory, check to see if we can use the ZVA feature to
> @@ -212,4 +215,5 @@ ENTRY(memset)
> ands count, count, zva_bits_x
> b.ne .Ltail_maybe_long
> ret
> + sb
> ENDPROC(memset)
> diff --git a/xen/arch/arm/arm64/lib/strchr.S b/xen/arch/arm/arm64/lib/strchr.S
> index 0506b0ff7f3a..be1d2fa2ec82 100644
> --- a/xen/arch/arm/arm64/lib/strchr.S
> +++ b/xen/arch/arm/arm64/lib/strchr.S
> @@ -17,6 +17,7 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <asm/macros.h>
>
> /*
> * Find the first occurrence of a character in a string.
> @@ -37,4 +38,5 @@ ENTRY(strchr)
> cmp w2, w1
> csel x0, x0, xzr, eq
> ret
> + sb
> ENDPROC(strchr)
> diff --git a/xen/arch/arm/arm64/lib/strcmp.S b/xen/arch/arm/arm64/lib/strcmp.S
> index c6f42dd2550b..08eda29302f0 100644
> --- a/xen/arch/arm/arm64/lib/strcmp.S
> +++ b/xen/arch/arm/arm64/lib/strcmp.S
> @@ -133,6 +133,7 @@ CPU_LE( lsr tmp2, tmp2, tmp1 ) /* Shift (tmp1 & 63). */
> 1:
> sub result, data1, data2
> ret
> + sb
>
> .Lstart_align:
> ands xzr, src1, #7
> @@ -205,6 +206,7 @@ CPU_BE( cmp data1, data2 )
> CPU_BE( cset result, ne )
> CPU_BE( cneg result, result, lo )
> CPU_BE( ret )
> +CPU_BE( sb )
> CPU_BE( 1: )
> /*Re-compute the NUL-byte detection, using a byte-reversed value. */
> CPU_BE( rev tmp3, data1 )
> @@ -230,4 +232,5 @@ CPU_BE( orr syndrome, diff, has_nul )
> lsr data1, data1, #56
> sub result, data1, data2, lsr #56
> ret
> + sb
> ENDPROC(strcmp)
> diff --git a/xen/arch/arm/arm64/lib/strlen.S b/xen/arch/arm/arm64/lib/strlen.S
> index fb6aaf1a6afb..92df3c5e4418 100644
> --- a/xen/arch/arm/arm64/lib/strlen.S
> +++ b/xen/arch/arm/arm64/lib/strlen.S
> @@ -106,6 +106,7 @@ CPU_BE( bic has_nul2, tmp1, tmp2 )
> clz pos, has_nul2
> add len, len, pos, lsr #3 /* Bits to bytes. */
> ret
> + sb
>
> .Lmisaligned:
> cmp tmp1, #8
> diff --git a/xen/arch/arm/arm64/lib/strncmp.S b/xen/arch/arm/arm64/lib/strncmp.S
> index a4a0f779f56e..e13ff99fe75e 100644
> --- a/xen/arch/arm/arm64/lib/strncmp.S
> +++ b/xen/arch/arm/arm64/lib/strncmp.S
> @@ -182,6 +182,7 @@ CPU_LE( lsr tmp2, tmp2, tmp3 ) /* Shift (tmp1 & 63). */
> 1:
> sub result, data1, data2
> ret
> + sb
>
> .Lstart_align:
> lsr limit_wd, limit, #3
> @@ -264,6 +265,7 @@ CPU_BE( cmp data1, data2 )
> CPU_BE( cset result, ne )
> CPU_BE( cneg result, result, lo )
> CPU_BE( ret )
> +CPU_BE( ret )
> CPU_BE( 1: )
> /* Re-compute the NUL-byte detection, using a byte-reversed value.*/
> CPU_BE( rev tmp3, data1 )
> @@ -288,6 +290,7 @@ CPU_BE( orr syndrome, diff, has_nul )
> lsr data1, data1, #56
> sub result, data1, data2, lsr #56
> ret
> + sb
>
> .Lremain8:
> /* Limit % 8 == 0 => all bytes significant. */
> @@ -303,8 +306,10 @@ CPU_BE( orr syndrome, diff, has_nul )
> b.eq .Ltiny8proc
> sub result, data1, data2
> ret
> + sb
>
> .Lret0:
> mov result, #0
> ret
> + sb
> ENDPROC(strncmp)
> diff --git a/xen/arch/arm/arm64/lib/strnlen.S b/xen/arch/arm/arm64/lib/strnlen.S
> index 81c8e8b54ea9..c59467bf17e1 100644
> --- a/xen/arch/arm/arm64/lib/strnlen.S
> +++ b/xen/arch/arm/arm64/lib/strnlen.S
> @@ -126,6 +126,7 @@ CPU_BE( bic has_nul2, tmp1, tmp2 )
> cmp len, limit
> csel len, len, limit, ls /* Return the lower value. */
> ret
> + sb
>
> .Lmisaligned:
> /*
> @@ -168,4 +169,5 @@ CPU_LE( lsr tmp2, tmp2, tmp4 ) /* Shift (tmp1 & 63). */
> .Lhit_limit:
> mov len, limit
> ret
> + sb
> ENDPROC(strnlen)
> diff --git a/xen/arch/arm/arm64/lib/strrchr.S b/xen/arch/arm/arm64/lib/strrchr.S
> index 07059983f828..fc8b6e446071 100644
> --- a/xen/arch/arm/arm64/lib/strrchr.S
> +++ b/xen/arch/arm/arm64/lib/strrchr.S
> @@ -17,6 +17,8 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <asm/macros.h>
> +
> /*
> * Find the last occurrence of a character in a string.
> *
> @@ -37,4 +39,5 @@ ENTRY(strrchr)
> b 1b
> 2: mov x0, x3
> ret
> + sb
> ENDPROC(strrchr)
> diff --git a/xen/arch/arm/arm64/smc.S b/xen/arch/arm/arm64/smc.S
> index b0752be57e8f..e0a3106dd7ec 100644
> --- a/xen/arch/arm/arm64/smc.S
> +++ b/xen/arch/arm/arm64/smc.S
> @@ -30,3 +30,4 @@ ENTRY(__arm_smccc_1_0_smc)
> stp x2, x3, [x4, #SMCCC_RES_a2]
> 1:
> ret
> + sb
>
> --
> Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-18 17:06 ` Bertrand Marquis
@ 2020-08-18 17:34 ` Julien Grall
2020-08-19 7:59 ` Bertrand Marquis
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-08-18 17:34 UTC (permalink / raw)
To: Bertrand Marquis
Cc: Stefano Stabellini, security, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall, nd
Hi Bertrand,
On 18/08/2020 18:06, Bertrand Marquis wrote:
>
>
>> On 18 Aug 2020, at 17:43, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 18/08/2020 17:35, Bertrand Marquis wrote:
>>> Hi Julien,
>>
>> Hi Bertrand,
>>
>>> Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
>>
>> Sorry this fell-through the cracks. I have a new version for patch #1, but not yet patch #2.
>
> No problem this came back while trying to reduce my todolist stack :-)
>
>>
>> I am still debating with myself where the speculation barrier should be added after the SMC :).
>
> I think that we should unless the SMC is in the context switch path (as all other calls should not have a performance impact).
I will introduce *_unsafe() helpers that will not contain the
speculation barrier. It could then be used in place where we think the
barrier is unnecessary.
>
>>
>>>> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>>>>
>>>> On 17/06/2020 17:23, Julien Grall wrote:
>>>>> Hi,
>>>>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>>>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>
>>>>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>>>>> speculative accesses to memory before processing the return.
>>>>>>>
>>>>>>> There is no known gadget available after the RET instruction today.
>>>>>>> However some of the registers (such as in check_pending_guest_serror())
>>>>>>> may contain a value provided the guest.
>>>>>> ^ by
>>>>>>
>>>>>>
>>>>>>> In order to harden the code, it would be better to add a speculation
>>>>>>> barrier after each RET instruction. The performance is meant to be
>>>>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>>>>> executed.
>>>>>>>
>>>>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>>>>> possible to speculate after the instruction, add preventively a
>>>>>>> speculation barrier after it as well.
>>>>>>>
>>>>>>> This is part of the work to mitigate straight-line speculation.
>>>>>>>
>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>
>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>
>>>>>> I did a compile-test on the patch too.
>>>>>>
>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> I am still unsure whether we preventively should add a speculation barrier
>>>>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>>>>> taken care in a follow-up patch.
>>>>>>
>>>>>> SMC is great to have but it seems to be overkill to do the ones under
>>>>>> lib/.
>>>>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>>>>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>>>>
>>>> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>>>>
>>>> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>>>>
>>>> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
>>> How about adding a speculation barrier directly in the ENDPROC macro ?
>>
>> This would unfortunately not cover all the cases because you can return in the middle of the function. I will have a look to see if we can leverage it.
>
> I agree that it would not solve all of them but a big part would be solved by it.
> An other solution might be to have a RETURN macro encoded as "mov pc,lr; sb" and "ret; sb”.
This is a bit messy on Arm32 because not all the return are using "mov
pc,lr". Anyway, I will explore the two approaches.
>
> The patch sounds good, i just need to find a way to analyse if you missed a ret or not which is not easy with such a patch :-)
I did struggle to find all the instances. The directory lib/ is actually
quite difficult to go through on 32-bit because they are multiple way to
implement a return.
Finding a way to reduce manual speculation barrier would definitely be
helpful. I will try to revise the patch during this week.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-18 17:34 ` Julien Grall
@ 2020-08-19 7:59 ` Bertrand Marquis
2020-08-19 8:02 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: Bertrand Marquis @ 2020-08-19 7:59 UTC (permalink / raw)
To: Julien Grall
Cc: Stefano Stabellini, security, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall, nd
> On 18 Aug 2020, at 18:34, Julien Grall <julien@xen.org> wrote:
>
> Hi Bertrand,
>
> On 18/08/2020 18:06, Bertrand Marquis wrote:
>>> On 18 Aug 2020, at 17:43, Julien Grall <julien@xen.org> wrote:
>>>
>>>
>>>
>>> On 18/08/2020 17:35, Bertrand Marquis wrote:
>>>> Hi Julien,
>>>
>>> Hi Bertrand,
>>>
>>>> Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
>>>
>>> Sorry this fell-through the cracks. I have a new version for patch #1, but not yet patch #2.
>> No problem this came back while trying to reduce my todolist stack :-)
>>>
>>> I am still debating with myself where the speculation barrier should be added after the SMC :).
>> I think that we should unless the SMC is in the context switch path (as all other calls should not have a performance impact).
> I will introduce *_unsafe() helpers that will not contain the speculation barrier. It could then be used in place where we think the barrier is unnecessary.
great idea, this will make it clear by reading the code reducing the need of documentation.
>
>>>
>>>>> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>>>>>
>>>>> On 17/06/2020 17:23, Julien Grall wrote:
>>>>>> Hi,
>>>>>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>>>>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>>
>>>>>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>>>>>> speculative accesses to memory before processing the return.
>>>>>>>>
>>>>>>>> There is no known gadget available after the RET instruction today.
>>>>>>>> However some of the registers (such as in check_pending_guest_serror())
>>>>>>>> may contain a value provided the guest.
>>>>>>> ^ by
>>>>>>>
>>>>>>>
>>>>>>>> In order to harden the code, it would be better to add a speculation
>>>>>>>> barrier after each RET instruction. The performance is meant to be
>>>>>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>>>>>> executed.
>>>>>>>>
>>>>>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>>>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>>>>>> possible to speculate after the instruction, add preventively a
>>>>>>>> speculation barrier after it as well.
>>>>>>>>
>>>>>>>> This is part of the work to mitigate straight-line speculation.
>>>>>>>>
>>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>
>>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>
>>>>>>> I did a compile-test on the patch too.
>>>>>>>
>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> I am still unsure whether we preventively should add a speculation barrier
>>>>>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>>>>>> taken care in a follow-up patch.
>>>>>>>
>>>>>>> SMC is great to have but it seems to be overkill to do the ones under
>>>>>>> lib/.
>>>>>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>>>>>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>>>>>
>>>>> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>>>>>
>>>>> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>>>>>
>>>>> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
>>>> How about adding a speculation barrier directly in the ENDPROC macro ?
>>>
>>> This would unfortunately not cover all the cases because you can return in the middle of the function. I will have a look to see if we can leverage it.
>> I agree that it would not solve all of them but a big part would be solved by it.
>> An other solution might be to have a RETURN macro encoded as "mov pc,lr; sb" and "ret; sb”.
>
> This is a bit messy on Arm32 because not all the return are using "mov pc,lr". Anyway, I will explore the two approaches.
ack.
>
>> The patch sounds good, i just need to find a way to analyse if you missed a ret or not which is not easy with such a patch :-)
>
> I did struggle to find all the instances. The directory lib/ is actually quite difficult to go through on 32-bit because they are multiple way to
> implement a return.
some part of the assembler code might benefit from a few branch instead of middle return to make the code clearer also but this might impact
a bit the performances.
>
> Finding a way to reduce manual speculation barrier would definitely be helpful. I will try to revise the patch during this week.
ok i will on my side list the returns to be able to review the final patch more easily.
Cheers
Bertrand
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-19 7:59 ` Bertrand Marquis
@ 2020-08-19 8:02 ` Jan Beulich
2020-08-19 8:50 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2020-08-19 8:02 UTC (permalink / raw)
To: Bertrand Marquis, Julien Grall
Cc: Stefano Stabellini, Paul Durrant, Xen-devel, Volodymyr Babchuk,
Andre Przywara, Julien Grall, nd
On 19.08.2020 09:59, Bertrand Marquis wrote:
>> On 18 Aug 2020, at 18:34, Julien Grall <julien@xen.org> wrote:
Btw - is there any need for this thread to be cross posted to both
xen-devel@ and security@? (I've dropped the latter here.)
Jan
>> On 18/08/2020 18:06, Bertrand Marquis wrote:
>>>> On 18 Aug 2020, at 17:43, Julien Grall <julien@xen.org> wrote:
>>>>
>>>>
>>>>
>>>> On 18/08/2020 17:35, Bertrand Marquis wrote:
>>>>> Hi Julien,
>>>>
>>>> Hi Bertrand,
>>>>
>>>>> Somehow we stopped on this thread and you did already most of the work so I think we should try to finish what you started
>>>>
>>>> Sorry this fell-through the cracks. I have a new version for patch #1, but not yet patch #2.
>>> No problem this came back while trying to reduce my todolist stack :-)
>>>>
>>>> I am still debating with myself where the speculation barrier should be added after the SMC :).
>>> I think that we should unless the SMC is in the context switch path (as all other calls should not have a performance impact).
>> I will introduce *_unsafe() helpers that will not contain the speculation barrier. It could then be used in place where we think the barrier is unnecessary.
>
> great idea, this will make it clear by reading the code reducing the need of documentation.
>
>>
>>>>
>>>>>> On 4 Jul 2020, at 17:07, Julien Grall <julien@xen.org> wrote:
>>>>>>
>>>>>> On 17/06/2020 17:23, Julien Grall wrote:
>>>>>>> Hi,
>>>>>>> On 16/06/2020 22:24, Stefano Stabellini wrote:
>>>>>>>> On Tue, 16 Jun 2020, Julien Grall wrote:
>>>>>>>>> From: Julien Grall <jgrall@amazon.com>
>>>>>>>>>
>>>>>>>>> Some CPUs can speculate past a RET instruction and potentially perform
>>>>>>>>> speculative accesses to memory before processing the return.
>>>>>>>>>
>>>>>>>>> There is no known gadget available after the RET instruction today.
>>>>>>>>> However some of the registers (such as in check_pending_guest_serror())
>>>>>>>>> may contain a value provided the guest.
>>>>>>>> ^ by
>>>>>>>>
>>>>>>>>
>>>>>>>>> In order to harden the code, it would be better to add a speculation
>>>>>>>>> barrier after each RET instruction. The performance is meant to be
>>>>>>>>> negligeable as the speculation barrier is not meant to be archicturally
>>>>>>>>> executed.
>>>>>>>>>
>>>>>>>>> Note that on arm32, the ldmia instruction will act as a return from the
>>>>>>>>> function __context_switch(). While the whitepaper doesn't suggest it is
>>>>>>>>> possible to speculate after the instruction, add preventively a
>>>>>>>>> speculation barrier after it as well.
>>>>>>>>>
>>>>>>>>> This is part of the work to mitigate straight-line speculation.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>>>>
>>>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>>
>>>>>>>> I did a compile-test on the patch too.
>>>>>>>>
>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> I am still unsure whether we preventively should add a speculation barrier
>>>>>>>>> preventively after all the RET instructions in arm*/lib/. The smc call be
>>>>>>>>> taken care in a follow-up patch.
>>>>>>>>
>>>>>>>> SMC is great to have but it seems to be overkill to do the ones under
>>>>>>>> lib/.
>>>>>>> From my understanding, the compiler will add a speculation barrier preventively after each 'ret' when the mitigation are turned on.So it feels to me we want to follow the same approach.
>>>>>>> Obviously, we can avoid them but I would like to have a justification for not adding them (nothing is overkilled against speculation ;)).
>>>>>>
>>>>>> I finally found some time to look at arm*/lib in more details. Some of the helpers can definitely be called with guest inputs.
>>>>>>
>>>>>> For instance, memchr() is called from hypfs_get_path_user() with the 3rd argument controlled by the guest. In both 32-bit and 64-bit implementation, you will reach the end of the function memchr() with r2/w2 and r3/w3 (it contains a character from the buffer) controlled by the guest.
>>>>>>
>>>>>> As this is the only function in the unit, we don't know what will be the instructions right after RET. So it would be safer to add a speculation barrier there too.
>>>>> How about adding a speculation barrier directly in the ENDPROC macro ?
>>>>
>>>> This would unfortunately not cover all the cases because you can return in the middle of the function. I will have a look to see if we can leverage it.
>>> I agree that it would not solve all of them but a big part would be solved by it.
>>> An other solution might be to have a RETURN macro encoded as "mov pc,lr; sb" and "ret; sb”.
>>
>> This is a bit messy on Arm32 because not all the return are using "mov pc,lr". Anyway, I will explore the two approaches.
>
> ack.
>
>>
>>> The patch sounds good, i just need to find a way to analyse if you missed a ret or not which is not easy with such a patch :-)
>>
>> I did struggle to find all the instances. The directory lib/ is actually quite difficult to go through on 32-bit because they are multiple way to
>> implement a return.
>
> some part of the assembler code might benefit from a few branch instead of middle return to make the code clearer also but this might impact
> a bit the performances.
>
>>
>> Finding a way to reduce manual speculation barrier would definitely be helpful. I will try to revise the patch during this week.
>
> ok i will on my side list the returns to be able to review the final patch more easily.
>
> Cheers
> Bertrand
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-19 8:02 ` Jan Beulich
@ 2020-08-19 8:50 ` Julien Grall
2020-08-19 8:58 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2020-08-19 8:50 UTC (permalink / raw)
To: Jan Beulich, Bertrand Marquis
Cc: Stefano Stabellini, Paul Durrant, Xen-devel, Volodymyr Babchuk,
Andre Przywara, Julien Grall, security@xenproject.org
(Adding back security)
Hi Jan,
On 19/08/2020 09:02, Jan Beulich wrote:
> On 19.08.2020 09:59, Bertrand Marquis wrote:
>>> On 18 Aug 2020, at 18:34, Julien Grall <julien@xen.org> wrote:
>
> Btw - is there any need for this thread to be cross posted to both
> xen-devel@ and security@? (I've dropped the latter here.)
From the cover letter:
"The patch series is directly sent on the mailing list as the
security team has been aware of the issues after the whitepaper was
publicly released."
This is technically still a security issue except this is discussed in
the open as it is a zero day for us. An XSA will have to be issued in
due course. Hence why security@ is added to keep track of the conversation.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-19 8:50 ` Julien Grall
@ 2020-08-19 8:58 ` Jan Beulich
2020-08-19 9:56 ` Julien Grall
0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2020-08-19 8:58 UTC (permalink / raw)
To: Julien Grall
Cc: Bertrand Marquis, Stefano Stabellini, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall
On 19.08.2020 10:50, Julien Grall wrote:
> On 19/08/2020 09:02, Jan Beulich wrote:
>> On 19.08.2020 09:59, Bertrand Marquis wrote:
>>>> On 18 Aug 2020, at 18:34, Julien Grall <julien@xen.org> wrote:
>>
>> Btw - is there any need for this thread to be cross posted to both
>> xen-devel@ and security@? (I've dropped the latter here.)
>
> From the cover letter:
>
> "The patch series is directly sent on the mailing list as the
> security team has been aware of the issues after the whitepaper was
> publicly released."
>
> This is technically still a security issue except this is discussed in
> the open as it is a zero day for us. An XSA will have to be issued in
> due course. Hence why security@ is added to keep track of the conversation.
I thought cross-posting is generally considered bad practice. I can't
see what extra "keeping track of the conversation" gets added by CCing
security@: Everything will already be recorded in the list archives of
xen-devel.
For some background of my original question: The cross posting confuses
the rules I have set up in my mail client - the mail gets moved back
and forth between the two distinct folders for each of the lists. I
haven't been able to figure a non-clumsy way yet to avoid this
happening. The mail client we used to use until about a year ago did
not have any issue with the same scenario.
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction
2020-08-19 8:58 ` Jan Beulich
@ 2020-08-19 9:56 ` Julien Grall
0 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2020-08-19 9:56 UTC (permalink / raw)
To: Jan Beulich
Cc: Bertrand Marquis, Stefano Stabellini, Paul Durrant, Xen-devel,
Volodymyr Babchuk, Andre Przywara, Julien Grall
On 19/08/2020 09:58, Jan Beulich wrote:
> On 19.08.2020 10:50, Julien Grall wrote:
>> On 19/08/2020 09:02, Jan Beulich wrote:
>>> On 19.08.2020 09:59, Bertrand Marquis wrote:
>>>>> On 18 Aug 2020, at 18:34, Julien Grall <julien@xen.org> wrote:
>>>
>>> Btw - is there any need for this thread to be cross posted to both
>>> xen-devel@ and security@? (I've dropped the latter here.)
>>
>> From the cover letter:
>>
>> "The patch series is directly sent on the mailing list as the
>> security team has been aware of the issues after the whitepaper was
>> publicly released."
>>
>> This is technically still a security issue except this is discussed in
>> the open as it is a zero day for us. An XSA will have to be issued in
>> due course. Hence why security@ is added to keep track of the conversation.
>
> I thought cross-posting is generally considered bad practice. I can't
> see what extra "keeping track of the conversation" gets added by CCing
> security@: Everything will already be recorded in the list archives of
> xen-devel.
The "keep track of the conversation" wasn't in the sense of recording
but that we are aware that there is a pending 0-day discussion and take
action one the discussion as settled.
>
> For some background of my original question: The cross posting confuses
> the rules I have set up in my mail client - the mail gets moved back
> and forth between the two distinct folders for each of the lists. I
> haven't been able to figure a non-clumsy way yet to avoid this
> happening. The mail client we used to use until about a year ago did
> not have any issue with the same scenario.
I am sorry to hear your e-mail client is not capable to do basic
filtering. However, this is not the first place where we do that (see
Linux or QEMU patches).
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-08-19 9:56 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16 17:59 [PATCH 0/2] xen/arm: Mitigate straight-line speculation Julien Grall
2020-06-16 17:59 ` [PATCH 1/2] xen/arm: entry: Place a speculation barrier following an ret instruction Julien Grall
2020-06-16 21:24 ` Stefano Stabellini
2020-06-17 16:23 ` Julien Grall
2020-07-04 16:07 ` Julien Grall
2020-08-18 16:35 ` Bertrand Marquis
2020-08-18 16:43 ` Julien Grall
2020-08-18 17:06 ` Bertrand Marquis
2020-08-18 17:34 ` Julien Grall
2020-08-19 7:59 ` Bertrand Marquis
2020-08-19 8:02 ` Jan Beulich
2020-08-19 8:50 ` Julien Grall
2020-08-19 8:58 ` Jan Beulich
2020-08-19 9:56 ` Julien Grall
2020-06-16 17:59 ` [PATCH 2/2] xen/arm: Mitigate straight-line speculation for SMC call Julien Grall
2020-06-16 21:34 ` Stefano Stabellini
2020-06-16 21:57 ` Julien Grall
2020-06-16 23:16 ` Andrew Cooper
2020-06-16 23:27 ` Stefano Stabellini
2020-06-18 17:46 ` Julien Grall
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).