x86: pad assembly functions with INT3
diff mbox series

Message ID 20180507213755.GA32406@avx2
State New, archived
Headers show
Series
  • x86: pad assembly functions with INT3
Related show

Commit Message

Alexey Dobriyan May 7, 2018, 9:37 p.m. UTC
Use INT3 instead of NOP. All that padding between functions is
an illegal area, no legitimate code should jump into it.

I've checked x86_64 allyesconfig disassembly, all changes looks sane:
INT3 is only used after RET or unconditional JMP.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 arch/x86/include/asm/linkage.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

H. Peter Anvin May 7, 2018, 9:41 p.m. UTC | #1
On May 7, 2018 2:37:55 PM PDT, Alexey Dobriyan <adobriyan@gmail.com> wrote:
>Use INT3 instead of NOP. All that padding between functions is
>an illegal area, no legitimate code should jump into it.
>
>I've checked x86_64 allyesconfig disassembly, all changes looks sane:
>INT3 is only used after RET or unconditional JMP.
>
>Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
>---
>
> arch/x86/include/asm/linkage.h |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>--- a/arch/x86/include/asm/linkage.h
>+++ b/arch/x86/include/asm/linkage.h
>@@ -18,7 +18,7 @@
> 	name:
> 
> #if defined(CONFIG_X86_64) || defined(CONFIG_X86_ALIGNMENT_16)
>-#define __ALIGN		.p2align 4, 0x90
>+#define __ALIGN		.p2align 4, 0xCC
> #define __ALIGN_STR	__stringify(__ALIGN)
> #endif
> 

Acked-by: H. Peter Anvin <h.peter.anvin@in tel.com>
Alexey Dobriyan May 9, 2018, 4:55 p.m. UTC | #2
On Mon, May 07, 2018 at 02:41:14PM -0700, hpa@zytor.com wrote:
> >-#define __ALIGN		.p2align 4, 0x90
> >+#define __ALIGN		.p2align 4, 0xCC
> 
> Acked-by: H. Peter Anvin <h.peter.anvin@in tel.com>

Thanks!

If someone knows how to control _compiler_ inter-function padding,
please tell.
H. Peter Anvin May 9, 2018, 7:28 p.m. UTC | #3
On 05/09/18 09:55, Alexey Dobriyan wrote:
> On Mon, May 07, 2018 at 02:41:14PM -0700, hpa@zytor.com wrote:
>>> -#define __ALIGN		.p2align 4, 0x90
>>> +#define __ALIGN		.p2align 4, 0xCC
>>
>> Acked-by: H. Peter Anvin <h.peter.anvin@in tel.com>
> 
> Thanks!
> 
> If someone knows how to control _compiler_ inter-function padding,
> please tell.
> 

I think I have a gcc feature enhancement request for that. It can't be
done without gcc's knowledge of when a fallthrough is happening (I
presume you took that into account in your patch set.)

	-hpa
David Laight May 10, 2018, 4:39 p.m. UTC | #4
From: Alexey Dobriyan
> Sent: 07 May 2018 22:38
> 
> Use INT3 instead of NOP. All that padding between functions is
> an illegal area, no legitimate code should jump into it.
> 
> I've checked x86_64 allyesconfig disassembly, all changes looks sane:
> INT3 is only used after RET or unconditional JMP.

I thought there was a performance penalty (on at least some cpu)
depending on the number of and the actual instructions used for padding.

I believe that is why gcc generates a small number of very long 'nop'
instructions when padding code.

	David
H. Peter Anvin May 11, 2018, 6:53 p.m. UTC | #5
On 05/10/18 09:39, David Laight wrote:
> From: Alexey Dobriyan
>> Sent: 07 May 2018 22:38
>>
>> Use INT3 instead of NOP. All that padding between functions is
>> an illegal area, no legitimate code should jump into it.
>>
>> I've checked x86_64 allyesconfig disassembly, all changes looks sane:
>> INT3 is only used after RET or unconditional JMP.
> 
> I thought there was a performance penalty (on at least some cpu)
> depending on the number of and the actual instructions used for padding.
> 
> I believe that is why gcc generates a small number of very long 'nop'
> instructions when padding code.
> 

There is a performance penalty for using NOP instructions *in the
fallthrough case.*  In the case where the padding is never supposed to
be executed, which is what we're talking about here, it is irrelevant.

I thought I had filed a gcc enhancement request, but I can't find it
now, so I just filed this:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85751

	-hpa
David Laight May 14, 2018, 9:04 a.m. UTC | #6
From: H. Peter Anvin
> Sent: 11 May 2018 19:54
> 
> On 05/10/18 09:39, David Laight wrote:
> > From: Alexey Dobriyan
> >> Sent: 07 May 2018 22:38
> >>
> >> Use INT3 instead of NOP. All that padding between functions is
> >> an illegal area, no legitimate code should jump into it.
> >>
> >> I've checked x86_64 allyesconfig disassembly, all changes looks sane:
> >> INT3 is only used after RET or unconditional JMP.
> >
> > I thought there was a performance penalty (on at least some cpu)
> > depending on the number of and the actual instructions used for padding.
> >
> > I believe that is why gcc generates a small number of very long 'nop'
> > instructions when padding code.
> >
> 
> There is a performance penalty for using NOP instructions *in the
> fallthrough case.*  In the case where the padding is never supposed to
> be executed, which is what we're talking about here, it is irrelevant.

Not completely irrelevant, the instructions stream gets fetched and decoded
beyond the jmp/ret at the end of the function.
At some point the cpu will decode the jmp/ret and fetch/decode from the
target address.
I guess it won't try to speculatively execute the 'pad' instructions
- but you can never really tell!

	David
H. Peter Anvin May 14, 2018, 11:05 a.m. UTC | #7
On May 14, 2018 2:04:38 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: H. Peter Anvin
>> Sent: 11 May 2018 19:54
>> 
>> On 05/10/18 09:39, David Laight wrote:
>> > From: Alexey Dobriyan
>> >> Sent: 07 May 2018 22:38
>> >>
>> >> Use INT3 instead of NOP. All that padding between functions is
>> >> an illegal area, no legitimate code should jump into it.
>> >>
>> >> I've checked x86_64 allyesconfig disassembly, all changes looks
>sane:
>> >> INT3 is only used after RET or unconditional JMP.
>> >
>> > I thought there was a performance penalty (on at least some cpu)
>> > depending on the number of and the actual instructions used for
>padding.
>> >
>> > I believe that is why gcc generates a small number of very long
>'nop'
>> > instructions when padding code.
>> >
>> 
>> There is a performance penalty for using NOP instructions *in the
>> fallthrough case.*  In the case where the padding is never supposed
>to
>> be executed, which is what we're talking about here, it is
>irrelevant.
>
>Not completely irrelevant, the instructions stream gets fetched and
>decoded
>beyond the jmp/ret at the end of the function.
>At some point the cpu will decode the jmp/ret and fetch/decode from the
>target address.
>I guess it won't try to speculatively execute the 'pad' instructions
>- but you can never really tell!
>
>	David

The CPU doesn't speculate down past an unconditional control transfer. Doing so would be idiotic.
Ingo Molnar May 15, 2018, 6:54 a.m. UTC | #8
* hpa@zytor.com <hpa@zytor.com> wrote:

> > I guess it won't try to speculatively execute the 'pad' instructions - but you 
> > can never really tell!
> >
> >	David
> 
> The CPU doesn't speculate down past an unconditional control transfer. Doing so 
> would be idiotic.

I think, when it comes to speculative execution, our general expectation that CPUs 
don't do idiotic things got somewhat weakened in the past year or so ...

Thanks,

	Ingo
H. Peter Anvin May 15, 2018, 6:59 a.m. UTC | #9
On May 14, 2018 11:54:05 PM PDT, Ingo Molnar <mingo@kernel.org> wrote:
>
>* hpa@zytor.com <hpa@zytor.com> wrote:
>
>> > I guess it won't try to speculatively execute the 'pad'
>instructions - but you 
>> > can never really tell!
>> >
>> >	David
>> 
>> The CPU doesn't speculate down past an unconditional control
>transfer. Doing so 
>> would be idiotic.
>
>I think, when it comes to speculative execution, our general
>expectation that CPUs 
>don't do idiotic things got somewhat weakened in the past year or so
>...
>
>Thanks,
>
>	Ingo

Sort-of-kind-of... the things that have cropped up were reasonable consequences of designing under a set of assumptions which turned out to be incorrect. Speculating through an unconditional control transfer did not make sense under those assumptions either.

Patch
diff mbox series

--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -18,7 +18,7 @@ 
 	name:
 
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_ALIGNMENT_16)
-#define __ALIGN		.p2align 4, 0x90
+#define __ALIGN		.p2align 4, 0xCC
 #define __ALIGN_STR	__stringify(__ALIGN)
 #endif