All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] nolibc fixes marked for -stable
@ 2021-10-24 17:28 Willy Tarreau
  2021-10-24 17:28 ` [PATCH 1/3] tools/nolibc: x86-64: Fix startup code bug Willy Tarreau
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Willy Tarreau @ 2021-10-24 17:28 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes, linux-kernel

Hi Paul,

here are a few fixes for nolibc. Ammar Faizi figured that the stack was
not properly aligned on x86_64, it was aligned for after the call instead
of before due to my misunderstanding of the spec. This made me check i386
and I got it wrong there as well. Others are OK as they do not push but
switch pointers on a call. The problem is essentially detected when using
SIMD instructions (either voluntarily or when the compiler does it on its
own).

A second (less important) issue is that I thought that it was up to the
userland code to truncate the code passed to exit() to 8 bits while it's
the kernel that does it. The difference is subtle but is visible in strace,
and this was reported by Ammar as well. This time it affected all supported
archs.

This series is based on 5.15-rc6. I marked them for backport to stable,
just in case anyone uses nolibc for bisecting bugs.

Thanks!
Willy


Ammar Faizi (1):
  tools/nolibc: x86-64: Fix startup code bug

Willy Tarreau (2):
  tools/nolibc: i386: fix initial stack alignment
  tools/nolibc: fix incorrect truncation of exit code

 tools/include/nolibc/nolibc.h | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

-- 
2.17.5


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] tools/nolibc: x86-64: Fix startup code bug
  2021-10-24 17:28 [PATCH 0/3] nolibc fixes marked for -stable Willy Tarreau
@ 2021-10-24 17:28 ` Willy Tarreau
  2021-10-24 17:28 ` [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment Willy Tarreau
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Willy Tarreau @ 2021-10-24 17:28 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes,
	linux-kernel, stable

From: Ammar Faizi <ammar.faizi@students.amikom.ac.id>

Before this patch, the `_start` function looks like this:
```
0000000000001170 <_start>:
    1170:	pop    %rdi
    1171:	mov    %rsp,%rsi
    1174:	lea    0x8(%rsi,%rdi,8),%rdx
    1179:	and    $0xfffffffffffffff0,%rsp
    117d:	sub    $0x8,%rsp
    1181:	call   1000 <main>
    1186:	movzbq %al,%rdi
    118a:	mov    $0x3c,%rax
    1191:	syscall
    1193:	hlt
    1194:	data16 cs nopw 0x0(%rax,%rax,1)
    119f:	nop
```
Note the "and" to %rsp with $-16, it makes the %rsp be 16-byte aligned,
but then there is a "sub" with $0x8 which makes the %rsp no longer
16-byte aligned, then it calls main. That's the bug!

What actually the x86-64 System V ABI mandates is that right before the
"call", the %rsp must be 16-byte aligned, not after the "call". So the
"sub" with $0x8 here breaks the alignment. Remove it.

An example where this rule matters is when the callee needs to align
its stack at 16-byte for aligned move instruction, like `movdqa` and
`movaps`. If the callee can't align its stack properly, it will result
in segmentation fault.

x86-64 System V ABI also mandates the deepest stack frame should be
zero. Just to be safe, let's zero the %rbp on startup as the content
of %rbp may be unspecified when the program starts. Now it looks like
this:
```
0000000000001170 <_start>:
    1170:	pop    %rdi
    1171:	mov    %rsp,%rsi
    1174:	lea    0x8(%rsi,%rdi,8),%rdx
    1179:	xor    %ebp,%ebp                # zero the %rbp
    117b:	and    $0xfffffffffffffff0,%rsp # align the %rsp
    117f:	call   1000 <main>
    1184:	movzbq %al,%rdi
    1188:	mov    $0x3c,%rax
    118f:	syscall
    1191:	hlt
    1192:	data16 cs nopw 0x0(%rax,%rax,1)
    119d:	nopl   (%rax)
```

Cc: Bedirhan KURT <windowz414@gnuweeb.org>
Cc: Louvian Lyndal <louvianlyndal@gmail.com>
Reported-by: Peter Cordes <peter@cordes.ca>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
[wt: I did this on purpose due to a misunderstanding of the spec, other
     archs will thus have to be rechecked, particularly i386]
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 tools/include/nolibc/nolibc.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 3430667b0d24..96b6d56acb57 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -399,14 +399,20 @@ struct stat {
 })
 
 /* startup code */
+/*
+ * x86-64 System V ABI mandates:
+ * 1) %rsp must be 16-byte aligned right before the function call.
+ * 2) The deepest stack frame should be zero (the %rbp).
+ *
+ */
 asm(".section .text\n"
     ".global _start\n"
     "_start:\n"
     "pop %rdi\n"                // argc   (first arg, %rdi)
     "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
     "lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
-    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned when
-    "sub $8, %rsp\n"            // entering the callee
+    "xor %ebp, %ebp\n"          // zero the stack frame
+    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned before call
     "call main\n"               // main() returns the status code, we'll exit with it.
     "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
     "mov $60, %rax\n"           // NR_exit == 60
-- 
2.17.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment
  2021-10-24 17:28 [PATCH 0/3] nolibc fixes marked for -stable Willy Tarreau
  2021-10-24 17:28 ` [PATCH 1/3] tools/nolibc: x86-64: Fix startup code bug Willy Tarreau
@ 2021-10-24 17:28 ` Willy Tarreau
  2021-10-25  7:46   ` David Laight
  2021-10-24 17:28 ` [PATCH 3/3] tools/nolibc: fix incorrect truncation of exit code Willy Tarreau
  2021-10-24 22:52 ` [PATCH 0/3] nolibc fixes marked for -stable Paul E. McKenney
  3 siblings, 1 reply; 8+ messages in thread
From: Willy Tarreau @ 2021-10-24 17:28 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes,
	linux-kernel, stable

After re-checking in the spec and comparing stack offsets with glibc,
The last pushed argument must be 16-byte aligned (i.e. aligned before the
call) so that in the callee esp+4 is multiple of 16, so the principle is
the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that
32-bit code using SSE2 or MMX could have been affected. In addition the
frame pointer ought to be zero at the deepest level.

Link: https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
Cc: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 tools/include/nolibc/nolibc.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 96b6d56acb57..7f300dc379e7 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -583,13 +583,21 @@ struct sys_stat_struct {
 })
 
 /* startup code */
+/*
+ * i386 System V ABI mandates:
+ * 1) last pushed argument must be 16-byte aligned.
+ * 2) The deepest stack frame should be set to zero
+ *
+ */
 asm(".section .text\n"
     ".global _start\n"
     "_start:\n"
     "pop %eax\n"                // argc   (first arg, %eax)
     "mov %esp, %ebx\n"          // argv[] (second arg, %ebx)
     "lea 4(%ebx,%eax,4),%ecx\n" // then a NULL then envp (third arg, %ecx)
-    "and $-16, %esp\n"          // x86 ABI : esp must be 16-byte aligned when
+    "xor %ebp, %ebp\n"          // zero the stack frame
+    "and $-16, %esp\n"          // x86 ABI : esp must be 16-byte aligned before
+    "sub $4, %esp\n"            // the call instruction (args are aligned)
     "push %ecx\n"               // push all registers on the stack so that we
     "push %ebx\n"               // support both regparm and plain stack modes
     "push %eax\n"
-- 
2.17.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] tools/nolibc: fix incorrect truncation of exit code
  2021-10-24 17:28 [PATCH 0/3] nolibc fixes marked for -stable Willy Tarreau
  2021-10-24 17:28 ` [PATCH 1/3] tools/nolibc: x86-64: Fix startup code bug Willy Tarreau
  2021-10-24 17:28 ` [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment Willy Tarreau
@ 2021-10-24 17:28 ` Willy Tarreau
  2021-10-24 22:52 ` [PATCH 0/3] nolibc fixes marked for -stable Paul E. McKenney
  3 siblings, 0 replies; 8+ messages in thread
From: Willy Tarreau @ 2021-10-24 17:28 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes,
	linux-kernel, stable

Ammar Faizi reported that our exit code handling is wrong. We truncate
it to the lowest 8 bits but the syscall itself is expected to take a
regular 32-bit signed integer, not an unsigned char. It's the kernel
that later truncates it to the lowest 8 bits. The difference is visible
in strace, where the program below used to show exit(255) instead of
exit(-1):

  int main(void)
  {
        return -1;
  }

This patch applies the fix to all archs. x86_64, i386, arm64, armv7 and
mips were all tested and confirmed to work fine now. Risc-v was not
tested but the change is trivial and exactly the same as for other archs.

Reported-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 tools/include/nolibc/nolibc.h | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 7f300dc379e7..3e2c6f2ed587 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -414,7 +414,7 @@ asm(".section .text\n"
     "xor %ebp, %ebp\n"          // zero the stack frame
     "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned before call
     "call main\n"               // main() returns the status code, we'll exit with it.
-    "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
+    "mov %eax, %edi\n"          // retrieve exit code (32 bit)
     "mov $60, %rax\n"           // NR_exit == 60
     "syscall\n"                 // really exit
     "hlt\n"                     // ensure it does not return
@@ -602,9 +602,9 @@ asm(".section .text\n"
     "push %ebx\n"               // support both regparm and plain stack modes
     "push %eax\n"
     "call main\n"               // main() returns the status code in %eax
-    "movzbl %al, %ebx\n"        // retrieve exit code from lower 8 bits
-    "movl   $1, %eax\n"         // NR_exit == 1
-    "int    $0x80\n"            // exit now
+    "mov %eax, %ebx\n"          // retrieve exit code (32-bit int)
+    "movl $1, %eax\n"           // NR_exit == 1
+    "int $0x80\n"               // exit now
     "hlt\n"                     // ensure it does not
     "");
 
@@ -788,7 +788,6 @@ asm(".section .text\n"
     "and %r3, %r1, $-8\n"         // AAPCS : sp must be 8-byte aligned in the
     "mov %sp, %r3\n"              //         callee, an bl doesn't push (lr=pc)
     "bl main\n"                   // main() returns the status code, we'll exit with it.
-    "and %r0, %r0, $0xff\n"       // limit exit code to 8 bits
     "movs r7, $1\n"               // NR_exit == 1
     "svc $0x00\n"
     "");
@@ -985,7 +984,6 @@ asm(".section .text\n"
     "add x2, x2, x1\n"            //           + argv
     "and sp, x1, -16\n"           // sp must be 16-byte aligned in the callee
     "bl main\n"                   // main() returns the status code, we'll exit with it.
-    "and x0, x0, 0xff\n"          // limit exit code to 8 bits
     "mov x8, 93\n"                // NR_exit == 93
     "svc #0\n"
     "");
@@ -1190,7 +1188,7 @@ asm(".section .text\n"
     "addiu $sp,$sp,-16\n"         // the callee expects to save a0..a3 there!
     "jal main\n"                  // main() returns the status code, we'll exit with it.
     "nop\n"                       // delayed slot
-    "and $a0, $v0, 0xff\n"        // limit exit code to 8 bits
+    "move $a0, $v0\n"             // retrieve 32-bit exit code from v0
     "li $v0, 4001\n"              // NR_exit == 4001
     "syscall\n"
     ".end __start\n"
@@ -1388,7 +1386,6 @@ asm(".section .text\n"
     "add   a2,a2,a1\n"           //             + argv
     "andi  sp,a1,-16\n"          // sp must be 16-byte aligned
     "call  main\n"               // main() returns the status code, we'll exit with it.
-    "andi  a0, a0, 0xff\n"       // limit exit code to 8 bits
     "li a7, 93\n"                // NR_exit == 93
     "ecall\n"
     "");
-- 
2.17.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] nolibc fixes marked for -stable
  2021-10-24 17:28 [PATCH 0/3] nolibc fixes marked for -stable Willy Tarreau
                   ` (2 preceding siblings ...)
  2021-10-24 17:28 ` [PATCH 3/3] tools/nolibc: fix incorrect truncation of exit code Willy Tarreau
@ 2021-10-24 22:52 ` Paul E. McKenney
  3 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2021-10-24 22:52 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes, linux-kernel

On Sun, Oct 24, 2021 at 07:28:13PM +0200, Willy Tarreau wrote:
> Hi Paul,
> 
> here are a few fixes for nolibc. Ammar Faizi figured that the stack was
> not properly aligned on x86_64, it was aligned for after the call instead
> of before due to my misunderstanding of the spec. This made me check i386
> and I got it wrong there as well. Others are OK as they do not push but
> switch pointers on a call. The problem is essentially detected when using
> SIMD instructions (either voluntarily or when the compiler does it on its
> own).
> 
> A second (less important) issue is that I thought that it was up to the
> userland code to truncate the code passed to exit() to 8 bits while it's
> the kernel that does it. The difference is subtle but is visible in strace,
> and this was reported by Ammar as well. This time it affected all supported
> archs.
> 
> This series is based on 5.15-rc6. I marked them for backport to stable,
> just in case anyone uses nolibc for bisecting bugs.
> 
> Thanks!
> Willy

Queued for v5.17, thank you both!

							Thanx, Paul

> Ammar Faizi (1):
>   tools/nolibc: x86-64: Fix startup code bug
> 
> Willy Tarreau (2):
>   tools/nolibc: i386: fix initial stack alignment
>   tools/nolibc: fix incorrect truncation of exit code
> 
>  tools/include/nolibc/nolibc.h | 33 ++++++++++++++++++++++-----------
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> -- 
> 2.17.5
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment
  2021-10-24 17:28 ` [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment Willy Tarreau
@ 2021-10-25  7:46   ` David Laight
  2021-10-25  8:06     ` Willy Tarreau
  0 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2021-10-25  7:46 UTC (permalink / raw)
  To: 'Willy Tarreau', Paul E . McKenney
  Cc: Bedirhan KURT, Louvian Lyndal, Ammar Faizi, Peter Cordes,
	linux-kernel, stable

From: Willy Tarreau
> Sent: 24 October 2021 18:28
> 
> After re-checking in the spec and comparing stack offsets with glibc,
> The last pushed argument must be 16-byte aligned (i.e. aligned before the
> call) so that in the callee esp+4 is multiple of 16, so the principle is
> the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that
> 32-bit code using SSE2 or MMX could have been affected. In addition the
> frame pointer ought to be zero at the deepest level.
> 
...
>  /* startup code */
> +/*
> + * i386 System V ABI mandates:
> + * 1) last pushed argument must be 16-byte aligned.
> + * 2) The deepest stack frame should be set to zero

I'm pretty sure that the historic SYSV i386 ABI only every required
4-byte alignment for the stack.

At some point it got 'randomly' changed to 16-byte.
I don't think this happened until after compiler support for SSE2
intrinsics was added.
ISTR the NetBSD found that it was gcc that moved the goalposts.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment
  2021-10-25  7:46   ` David Laight
@ 2021-10-25  8:06     ` Willy Tarreau
  2021-10-25 12:48       ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Willy Tarreau @ 2021-10-25  8:06 UTC (permalink / raw)
  To: David Laight
  Cc: Paul E . McKenney, Bedirhan KURT, Louvian Lyndal, Ammar Faizi,
	Peter Cordes, linux-kernel, stable

On Mon, Oct 25, 2021 at 07:46:11AM +0000, David Laight wrote:
> From: Willy Tarreau
> > Sent: 24 October 2021 18:28
> > 
> > After re-checking in the spec and comparing stack offsets with glibc,
> > The last pushed argument must be 16-byte aligned (i.e. aligned before the
> > call) so that in the callee esp+4 is multiple of 16, so the principle is
> > the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that
> > 32-bit code using SSE2 or MMX could have been affected. In addition the
> > frame pointer ought to be zero at the deepest level.
> > 
> ...
> >  /* startup code */
> > +/*
> > + * i386 System V ABI mandates:
> > + * 1) last pushed argument must be 16-byte aligned.
> > + * 2) The deepest stack frame should be set to zero
> 
> I'm pretty sure that the historic SYSV i386 ABI only every required
> 4-byte alignment for the stack.
> 
> At some point it got 'randomly' changed to 16-byte.
> I don't think this happened until after compiler support for SSE2
> intrinsics was added.

It's very possible because I've done a number of tests and noticed
that in some cases the called functions' stack doesn't seem to be
more than 4-aligned. However the deepest function in the stack starts
with an aligned stack so I prefer to follow this same rule.

Willy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment
  2021-10-25  8:06     ` Willy Tarreau
@ 2021-10-25 12:48       ` David Laight
  0 siblings, 0 replies; 8+ messages in thread
From: David Laight @ 2021-10-25 12:48 UTC (permalink / raw)
  To: 'Willy Tarreau'
  Cc: Paul E . McKenney, Bedirhan KURT, Louvian Lyndal, Ammar Faizi,
	Peter Cordes, linux-kernel, stable

From: Willy Tarreau
> Sent: 25 October 2021 09:06
> 
> On Mon, Oct 25, 2021 at 07:46:11AM +0000, David Laight wrote:
> > From: Willy Tarreau
> > > Sent: 24 October 2021 18:28
> > >
> > > After re-checking in the spec and comparing stack offsets with glibc,
> > > The last pushed argument must be 16-byte aligned (i.e. aligned before the
> > > call) so that in the callee esp+4 is multiple of 16, so the principle is
> > > the 32-bit equivalent to what Ammar fixed for x86_64. It's possible that
> > > 32-bit code using SSE2 or MMX could have been affected. In addition the
> > > frame pointer ought to be zero at the deepest level.
> > >
> > ...
> > >  /* startup code */
> > > +/*
> > > + * i386 System V ABI mandates:
> > > + * 1) last pushed argument must be 16-byte aligned.
> > > + * 2) The deepest stack frame should be set to zero
> >
> > I'm pretty sure that the historic SYSV i386 ABI only every required
> > 4-byte alignment for the stack.
> >
> > At some point it got 'randomly' changed to 16-byte.
> > I don't think this happened until after compiler support for SSE2
> > intrinsics was added.
> 
> It's very possible because I've done a number of tests and noticed
> that in some cases the called functions' stack doesn't seem to be
> more than 4-aligned. However the deepest function in the stack starts
> with an aligned stack so I prefer to follow this same rule.

Any call through asm is unlikely to maintain the 16 byte alignment.
But yes, starting off on the required (by gcc) alignment does no harm.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-25 12:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-24 17:28 [PATCH 0/3] nolibc fixes marked for -stable Willy Tarreau
2021-10-24 17:28 ` [PATCH 1/3] tools/nolibc: x86-64: Fix startup code bug Willy Tarreau
2021-10-24 17:28 ` [PATCH 2/3] tools/nolibc: i386: fix initial stack alignment Willy Tarreau
2021-10-25  7:46   ` David Laight
2021-10-25  8:06     ` Willy Tarreau
2021-10-25 12:48       ` David Laight
2021-10-24 17:28 ` [PATCH 3/3] tools/nolibc: fix incorrect truncation of exit code Willy Tarreau
2021-10-24 22:52 ` [PATCH 0/3] nolibc fixes marked for -stable Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.