All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86: guard against straight-line speculation past RET
@ 2020-08-24 12:50 Jan Beulich
  2020-09-04 18:18 ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2020-08-24 12:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Under certain conditions CPUs can speculate into the instruction stream
past a RET instruction. Guard against this just like 3b7dab93f240
("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
did - by inserting an "INT $3" insn. It's merely the mechanics of how to
achieve this that differ: A pair of macros gets introduced to post-
process RET insns issued by the compiler (or living in assembly files).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Should this depend on CONFIG_SPECULATIVE_HARDEN_BRANCH?
---
This depends on the "x86: some assembler macro rework" series posted
over a month ago.

--- a/xen/include/asm-x86/asm-defns.h
+++ b/xen/include/asm-x86/asm-defns.h
@@ -50,3 +50,19 @@
 .macro INDIRECT_JMP arg:req
     INDIRECT_BRANCH jmp \arg
 .endm
+
+/*
+ * To guard against speculation past RET, insert a breakpoint insn
+ * immediately after them.
+ */
+.macro ret operand:vararg
+    ret$ \operand
+.endm
+.macro ret$ operand:vararg
+    .purgem ret
+    ret \operand
+    int $3
+    .macro ret operand:vararg
+        ret$ \\(operand)
+    .endm
+.endm


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: guard against straight-line speculation past RET
  2020-08-24 12:50 [PATCH] x86: guard against straight-line speculation past RET Jan Beulich
@ 2020-09-04 18:18 ` Andrew Cooper
  2020-09-07  9:25   ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2020-09-04 18:18 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 24/08/2020 13:50, Jan Beulich wrote:
> --- a/xen/include/asm-x86/asm-defns.h
> +++ b/xen/include/asm-x86/asm-defns.h
> @@ -50,3 +50,19 @@
>  .macro INDIRECT_JMP arg:req
>      INDIRECT_BRANCH jmp \arg
>  .endm
> +
> +/*
> + * To guard against speculation past RET, insert a breakpoint insn
> + * immediately after them.
> + */
> +.macro ret operand:vararg
> +    ret$ \operand
> +.endm
> +.macro ret$ operand:vararg
> +    .purgem ret
> +    ret \operand
> +    int $3
> +    .macro ret operand:vararg
> +        ret$ \\(operand)
> +    .endm
> +.endm

Several observations.  First, clang chokes on this:

<instantiation>:2:9: error: unknown token in expression
    ret \\(operand)
        ^

Second, you mean int3 rather than int $3.  By convention, they are
synonymous, but the second one really ought to be the two byte encoding,
rather than the single byte encoding, and we really want the single byte
version for code volume reasons.

Third, there is a huge quantity of complexity for a form of the
instruction we don't use.  Instead:

.macro ret operand:vararg
    .ifnb \operand
        .error "TODO - speculation safety for 'ret $imm'"
    .endif

    .byte 0xc3
    int3
.endm

is much simpler, and compatible with both GCC and Clang.

Almost...

Clang doesn't actually expand the macro for ret instructions, so a Clang
build of Xen only ends up getting protected in the assembly files.

The following experiment demonstrates the issue:

$ cat ret.c
asm (".macro ret\n\t"
     ".error \"foo\"\n\t"
     ".endm\n\t");
void foo(void) {}

$ gcc -O3 -c ret.c -o ret.o && objdump -d ret.o
/tmp/ccf8hkyN.s: Assembler messages:
/tmp/ccf8hkyN.s:16: Error: foo

$ clang-10 -O3 -c ret.c -o ret.o && objdump -d ret.o

ret.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <foo>:
   0:    c3                       retq


Worse, -no-integrated-as doesn't immediately help, even though it
invokes $(AS).

I tracked that down to the difference between ret and retq, which
highlights an assumption about GCC which may not remain true in the future.

Adding a second macro covering retq fixes the scenario in combination
with -no-integrated-as.

So overall I think we can make a safe binary with a clang build. 
However, it is at the expense of the integrated assembler, which I
believe is now mandatory for LTO, and control-flow integrity, neither of
which we want to lose in the long term.

~Andrew


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: guard against straight-line speculation past RET
  2020-09-04 18:18 ` Andrew Cooper
@ 2020-09-07  9:25   ` Jan Beulich
  2020-09-07 10:14     ` Jan Beulich
  2020-09-07 13:50     ` Andrew Cooper
  0 siblings, 2 replies; 6+ messages in thread
From: Jan Beulich @ 2020-09-07  9:25 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 04.09.2020 20:18, Andrew Cooper wrote:
> On 24/08/2020 13:50, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/asm-defns.h
>> +++ b/xen/include/asm-x86/asm-defns.h
>> @@ -50,3 +50,19 @@
>>  .macro INDIRECT_JMP arg:req
>>      INDIRECT_BRANCH jmp \arg
>>  .endm
>> +
>> +/*
>> + * To guard against speculation past RET, insert a breakpoint insn
>> + * immediately after them.
>> + */
>> +.macro ret operand:vararg
>> +    ret$ \operand
>> +.endm
>> +.macro ret$ operand:vararg
>> +    .purgem ret
>> +    ret \operand
>> +    int $3
>> +    .macro ret operand:vararg
>> +        ret$ \\(operand)
>> +    .endm
>> +.endm
> 
> Several observations.  First, clang chokes on this:
> 
> <instantiation>:2:9: error: unknown token in expression
>     ret \\(operand)
>         ^

Must be clang more recent than the 5.x one I've tested with; likely
because there we end up using -mno-integrated-as.

> Second, you mean int3 rather than int $3.  By convention, they are
> synonymous, but the second one really ought to be the two byte encoding,
> rather than the single byte encoding, and we really want the single byte
> version for code volume reasons.

Well, no, I didn't mean "int3", but I've switched nevertheless, just
for consistency with the earlier change of yours referenced in the
description. To me "int3" has only ever been a kludge. Assemblers
I've grown up with don't know such a mnemonic. Nor did Intel
originally document it, and AMD still doesn't.

> Third, there is a huge quantity of complexity for a form of the
> instruction we don't use.

The complexity isn't with handling the possible immediate operand,
but with the need to override the "ret" insn, and then to transiently
cancel this override.

>  Instead:
> 
> .macro ret operand:vararg
>     .ifnb \operand
>         .error "TODO - speculation safety for 'ret $imm'"
>     .endif
> 
>     .byte 0xc3
>     int3
> .endm
> 
> is much simpler, and compatible with both GCC and Clang.

I really wish to avoid .byte for code emission whenever possible.
It subverts the assembler applying sanity checks. This may not be
overly relevant here, but then we would still better avoid setting
precedents. However, if clang can't be made work without going
this route, so be it.

> Almost...
> 
> Clang doesn't actually expand the macro for ret instructions, so a Clang
> build of Xen only ends up getting protected in the assembly files.
> 
> The following experiment demonstrates the issue:
> 
> $ cat ret.c
> asm (".macro ret\n\t"
>      ".error \"foo\"\n\t"
>      ".endm\n\t");
> void foo(void) {}
> 
> $ gcc -O3 -c ret.c -o ret.o && objdump -d ret.o
> /tmp/ccf8hkyN.s: Assembler messages:
> /tmp/ccf8hkyN.s:16: Error: foo
> 
> $ clang-10 -O3 -c ret.c -o ret.o && objdump -d ret.o
> 
> ret.o:     file format elf64-x86-64
> 
> 
> Disassembly of section .text:
> 
> 0000000000000000 <foo>:
>    0:    c3                       retq
> 
> 
> Worse, -no-integrated-as doesn't immediately help, even though it
> invokes $(AS).
> 
> I tracked that down to the difference between ret and retq, which
> highlights an assumption about GCC which may not remain true in the future.
> 
> Adding a second macro covering retq fixes the scenario in combination
> with -no-integrated-as.

Ah, yes, I should of course have thought of retq. Albeit as per
above - generated code looks fine here when using clang 5.

> So overall I think we can make a safe binary with a clang build. 
> However, it is at the expense of the integrated assembler, which I
> believe is now mandatory for LTO, and control-flow integrity, neither of
> which we want to lose in the long term.

Why at this expense? Are you saying that even when going the .byte
route and even with very new clang one has to force
-mno-integrated-as?

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: guard against straight-line speculation past RET
  2020-09-07  9:25   ` Jan Beulich
@ 2020-09-07 10:14     ` Jan Beulich
  2020-09-07 13:50     ` Andrew Cooper
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2020-09-07 10:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 07.09.2020 11:25, Jan Beulich wrote:
> On 04.09.2020 20:18, Andrew Cooper wrote:
>> Clang doesn't actually expand the macro for ret instructions, so a Clang
>> build of Xen only ends up getting protected in the assembly files.
>>
>> The following experiment demonstrates the issue:
>>
>> $ cat ret.c
>> asm (".macro ret\n\t"
>>      ".error \"foo\"\n\t"
>>      ".endm\n\t");
>> void foo(void) {}
>>
>> $ gcc -O3 -c ret.c -o ret.o && objdump -d ret.o
>> /tmp/ccf8hkyN.s: Assembler messages:
>> /tmp/ccf8hkyN.s:16: Error: foo
>>
>> $ clang-10 -O3 -c ret.c -o ret.o && objdump -d ret.o
>>
>> ret.o:     file format elf64-x86-64
>>
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <foo>:
>>    0:    c3                       retq
>>
>>
>> Worse, -no-integrated-as doesn't immediately help, even though it
>> invokes $(AS).
>>
>> I tracked that down to the difference between ret and retq, which
>> highlights an assumption about GCC which may not remain true in the future.
>>
>> Adding a second macro covering retq fixes the scenario in combination
>> with -no-integrated-as.
> 
> Ah, yes, I should of course have thought of retq. Albeit as per
> above - generated code looks fine here when using clang 5.

I'm sorry, I can indeed see this part of the issue. I did look at
the wrong build tree when putting together the earlier reply.

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: guard against straight-line speculation past RET
  2020-09-07  9:25   ` Jan Beulich
  2020-09-07 10:14     ` Jan Beulich
@ 2020-09-07 13:50     ` Andrew Cooper
  2020-09-07 14:40       ` Jan Beulich
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2020-09-07 13:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 07/09/2020 10:25, Jan Beulich wrote:
> On 04.09.2020 20:18, Andrew Cooper wrote:
>> On 24/08/2020 13:50, Jan Beulich wrote:
>>> --- a/xen/include/asm-x86/asm-defns.h
>>> +++ b/xen/include/asm-x86/asm-defns.h
>>> @@ -50,3 +50,19 @@
>>>  .macro INDIRECT_JMP arg:req
>>>      INDIRECT_BRANCH jmp \arg
>>>  .endm
>>> +
>>> +/*
>>> + * To guard against speculation past RET, insert a breakpoint insn
>>> + * immediately after them.
>>> + */
>>> +.macro ret operand:vararg
>>> +    ret$ \operand
>>> +.endm
>>> +.macro ret$ operand:vararg
>>> +    .purgem ret
>>> +    ret \operand
>>> +    int $3
>>> +    .macro ret operand:vararg
>>> +        ret$ \\(operand)
>>> +    .endm
>>> +.endm
>> Several observations.  First, clang chokes on this:
>>
>> <instantiation>:2:9: error: unknown token in expression
>>     ret \\(operand)
>>         ^
> Must be clang more recent than the 5.x one I've tested with; likely
> because there we end up using -mno-integrated-as.
>
>> Second, you mean int3 rather than int $3.  By convention, they are
>> synonymous, but the second one really ought to be the two byte encoding,
>> rather than the single byte encoding, and we really want the single byte
>> version for code volume reasons.
> Well, no, I didn't mean "int3", but I've switched nevertheless, just
> for consistency with the earlier change of yours referenced in the
> description. To me "int3" has only ever been a kludge.

Far less of a kludge than having `int $3` magically have a different
encoding to every other `int $n` instructions, which is how assemblers
behave in practice.

> Assemblers
> I've grown up with don't know such a mnemonic. Nor did Intel
> originally document it, and AMD still doesn't.

APM 3, page 413.

>> Third, there is a huge quantity of complexity for a form of the
>> instruction we don't use.
> The complexity isn't with handling the possible immediate operand,
> but with the need to override the "ret" insn, and then to transiently
> cancel this override.

What is the purpose of transiently cancelling the override?

It's not possible to pull this trick twice, so its not as if you're
falling back to a different macro rather than the plain instruction.

>>   Instead:
>>
>> .macro ret operand:vararg
>>     .ifnb \operand
>>         .error "TODO - speculation safety for 'ret $imm'"
>>     .endif
>>
>>     .byte 0xc3
>>     int3
>> .endm
>>
>> is much simpler, and compatible with both GCC and Clang.
> I really wish to avoid .byte for code emission whenever possible.
> It subverts the assembler applying sanity checks. This may not be
> overly relevant here, but then we would still better avoid setting
> precedents. However, if clang can't be made work without going
> this route, so be it.
>
>> Almost...
>>
>> Clang doesn't actually expand the macro for ret instructions, so a Clang
>> build of Xen only ends up getting protected in the assembly files.
>>
>> The following experiment demonstrates the issue:
>>
>> $ cat ret.c
>> asm (".macro ret\n\t"
>>      ".error \"foo\"\n\t"
>>      ".endm\n\t");
>> void foo(void) {}
>>
>> $ gcc -O3 -c ret.c -o ret.o && objdump -d ret.o
>> /tmp/ccf8hkyN.s: Assembler messages:
>> /tmp/ccf8hkyN.s:16: Error: foo
>>
>> $ clang-10 -O3 -c ret.c -o ret.o && objdump -d ret.o
>>
>> ret.o:     file format elf64-x86-64
>>
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <foo>:
>>    0:    c3                       retq
>>
>>
>> Worse, -no-integrated-as doesn't immediately help, even though it
>> invokes $(AS).
>>
>> I tracked that down to the difference between ret and retq, which
>> highlights an assumption about GCC which may not remain true in the future.
>>
>> Adding a second macro covering retq fixes the scenario in combination
>> with -no-integrated-as.
> Ah, yes, I should of course have thought of retq. Albeit as per
> above - generated code looks fine here when using clang 5.
>
>> So overall I think we can make a safe binary with a clang build. 
>> However, it is at the expense of the integrated assembler, which I
>> believe is now mandatory for LTO, and control-flow integrity, neither of
>> which we want to lose in the long term.
> Why at this expense? Are you saying that even when going the .byte
> route and even with very new clang one has to force
> -mno-integrated-as?

Yes, which was the whole point of providing the full transcript above.

You can't wrap an instruction with a macro with the integrated assembler.

~Andrew


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: guard against straight-line speculation past RET
  2020-09-07 13:50     ` Andrew Cooper
@ 2020-09-07 14:40       ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2020-09-07 14:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 07.09.2020 15:50, Andrew Cooper wrote:
> On 07/09/2020 10:25, Jan Beulich wrote:
>> On 04.09.2020 20:18, Andrew Cooper wrote:
>>> Third, there is a huge quantity of complexity for a form of the
>>> instruction we don't use.
>> The complexity isn't with handling the possible immediate operand,
>> but with the need to override the "ret" insn, and then to transiently
>> cancel this override.
> 
> What is the purpose of transiently cancelling the override?

To have access to the underlying insn again.

> It's not possible to pull this trick twice, so its not as if you're
> falling back to a different macro rather than the plain instruction.

Not sure I understand what you're saying here. The helper macro
purging and then re-instating the insn surrogate macro can be
done any number of times.

>>> Clang doesn't actually expand the macro for ret instructions, so a Clang
>>> build of Xen only ends up getting protected in the assembly files.
>>>
>>> The following experiment demonstrates the issue:
>>>
>>> $ cat ret.c
>>> asm (".macro ret\n\t"
>>>      ".error \"foo\"\n\t"
>>>      ".endm\n\t");
>>> void foo(void) {}
>>>
>>> $ gcc -O3 -c ret.c -o ret.o && objdump -d ret.o
>>> /tmp/ccf8hkyN.s: Assembler messages:
>>> /tmp/ccf8hkyN.s:16: Error: foo
>>>
>>> $ clang-10 -O3 -c ret.c -o ret.o && objdump -d ret.o
>>>
>>> ret.o:     file format elf64-x86-64
>>>
>>>
>>> Disassembly of section .text:
>>>
>>> 0000000000000000 <foo>:
>>>    0:    c3                       retq
>>>
>>>
>>> Worse, -no-integrated-as doesn't immediately help, even though it
>>> invokes $(AS).
>>>
>>> I tracked that down to the difference between ret and retq, which
>>> highlights an assumption about GCC which may not remain true in the future.
>>>
>>> Adding a second macro covering retq fixes the scenario in combination
>>> with -no-integrated-as.
>> Ah, yes, I should of course have thought of retq. Albeit as per
>> above - generated code looks fine here when using clang 5.
>>
>>> So overall I think we can make a safe binary with a clang build. 
>>> However, it is at the expense of the integrated assembler, which I
>>> believe is now mandatory for LTO, and control-flow integrity, neither of
>>> which we want to lose in the long term.
>> Why at this expense? Are you saying that even when going the .byte
>> route and even with very new clang one has to force
>> -mno-integrated-as?
> 
> Yes, which was the whole point of providing the full transcript above.
> 
> You can't wrap an instruction with a macro with the integrated assembler.

Ugly. I can't currently think of a way to address this then, for clang,
other than by them offering a suitable command line option. Question is:
Do we consider it worthwhile then to still address for gcc? Or shall I
drop the patch?

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-09-07 14:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-24 12:50 [PATCH] x86: guard against straight-line speculation past RET Jan Beulich
2020-09-04 18:18 ` Andrew Cooper
2020-09-07  9:25   ` Jan Beulich
2020-09-07 10:14     ` Jan Beulich
2020-09-07 13:50     ` Andrew Cooper
2020-09-07 14:40       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.