* [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
@ 2021-10-11  4:03 Ammar Faizi
  2021-10-12  5:28 ` Willy Tarreau
  2021-10-13  9:03 ` [PATCH v2] " Ammar Faizi
  0 siblings, 2 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-11  4:03 UTC (permalink / raw)
  To: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List
  Cc: Ammar Faizi, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").

  - rax for the return value.
  - rcx to save the return address.
  - r11 to save the rflags.

Other registers are preserved.

Having r8, r9 and r10 in the syscall clobber list is harmless, but this
results in a missed optimization.

As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
their values after the syscall returns to userspace. But since they are
in the clobber list, GCC will always miss this opportunity.

Remove them from the x86-64 syscall clobber list to help GCC generate
better code and fix the comment.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 tools/include/nolibc/nolibc.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 3430667b0d24..8c9f2202d6b6 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -265,7 +265,7 @@ struct stat {
  *   - arguments are in rdi, rsi, rdx, r10, r8, r9 respectively
  *   - the system call is performed by calling the syscall instruction
  *   - syscall return comes in rax
- *   - rcx and r8..r11 may be clobbered, others are preserved.
+ *   - rcx and r11 are clobbered, others are preserved.
  *   - the arguments are cast to long and assigned into the target registers
  *     which are then simply passed as registers to the asm code, so that we
  *     don't have to experience issues with register constraints.
@@ -280,9 +280,9 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -295,10 +295,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1),                                                 \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -312,10 +312,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2),                                     \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -330,10 +330,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3),                         \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -349,10 +349,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4)                                    \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4),             \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r11", "memory", "cc"                    \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -369,10 +369,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a" (_ret)                                                 \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "0"(_num)                                                   \
-		: "rcx", "r9", "r11", "memory", "cc"                          \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -390,7 +390,7 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "r"(_arg6), "0"(_num)                                       \
 		: "rcx", "r11", "memory", "cc"                                \
-- 
2.30.2



* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-11  4:03 [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
@ 2021-10-12  5:28 ` Willy Tarreau
  2021-10-12  8:36   ` Ammar Faizi
  2021-10-13  9:03 ` [PATCH v2] " Ammar Faizi
  1 sibling, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-12  5:28 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

Hello Ammar,

First, thanks for your patch. I have a few questions below.

On Mon, Oct 11, 2021 at 11:03:44AM +0700, Ammar Faizi wrote:
> Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> 
>   - rax for the return value.
>   - rcx to save the return address.
>   - r11 to save the rflags.
> 
> Other registers are preserved.
> 
> Having r8, r9 and r10 in the syscall clobber list is harmless, but this
> results in a missed-optimization.
> 
> As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
> their value after the syscall returns to userspace. But since they are
> in the clobber list, GCC will always miss this opportunity.

I agree with your conclusion about this but need to be perfectly sure
about the exact clobber list since I got a different impression when
implementing the calls. I got the impression that these ones were
clobbered by reading entry_64.S below:

 * Registers on entry:
 * rax  system call number
 * rcx  return address
 * r11  saved rflags (note: r11 is callee-clobbered register in C ABI)
 * rdi  arg0
 * rsi  arg1
 * rdx  arg2
 * r10  arg3 (needs to be moved to rcx to conform to C ABI)
 * r8   arg4
 * r9   arg5
 * (note: r12-r15, rbp, rbx are callee-preserved in C ABI)

See this last note ? Indicating that r12-15, rbp and rbx are preserved
made me think it's not the case for the other ones, but that might of
course be a misunderstanding.

And calling.h says this:

 x86 function call convention, 64-bit:
 -------------------------------------
  arguments           |  callee-saved      | extra caller-saved | return
 [callee-clobbered]   |                    | [callee-clobbered] |
 ---------------------------------------------------------------------------
 rdi rsi rdx rcx r8-9 | rbx rbp [*] r12-15 | r10-11             | rax, rdx [**]

Note that it's indicated "function call convention", not "syscall",
leaving it open to have extra restrictions on syscalls. Later by
reading the POP_REGS macro called with pop_rdi=0 and skipr11rcx=1
on syscall leave, it seems to restore everything but r11, rcx, rax
and rdi (which is restored last with rsp).

As such, could you please add in your commit message a link to the
location where you found that authoritative information above, or to a
better place if there is one (i.e. one that doesn't require reading all
the macros)?
This would significantly help anyone facing the same doubts about this
in the future.

Thank you!
Willy


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12  5:28 ` Willy Tarreau
@ 2021-10-12  8:36   ` Ammar Faizi
  2021-10-12  9:06     ` Willy Tarreau
  0 siblings, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-12  8:36 UTC (permalink / raw)
  To: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin

On Tue, Oct 12, 2021 at 12:28 PM Willy Tarreau <w@1wt.eu> wrote:
>
> Hello Ammar,
>
> First, thanks for your patch. I have a few questions below.
>
> On Mon, Oct 11, 2021 at 11:03:44AM +0700, Ammar Faizi wrote:
> > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> >
> >   - rax for the return value.
> >   - rcx to save the return address.
> >   - r11 to save the rflags.
> >
> > Other registers are preserved.
> >
> > Having r8, r9 and r10 in the syscall clobber list is harmless, but this
> > results in a missed-optimization.
> >
> > As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
> > their value after the syscall returns to userspace. But since they are
> > in the clobber list, GCC will always miss this opportunity.
>
> I agree with your conclusion about this but need to be perfectly sure
> about the exact clobber list since I got a different impression when
> implementing the calls. I got the impression that these ones were
> clobbered by reading entry_64.S below:
>
>  * Registers on entry:
>  * rax  system call number
>  * rcx  return address
>  * r11  saved rflags (note: r11 is callee-clobbered register in C ABI)
>  * rdi  arg0
>  * rsi  arg1
>  * rdx  arg2
>  * r10  arg3 (needs to be moved to rcx to conform to C ABI)
>  * r8   arg4
>  * r9   arg5
>  * (note: r12-r15, rbp, rbx are callee-preserved in C ABI)
>
> See this last note ? Indicating that r12-15, rbp and rbx are preserved
> made me think it's not the case for the other ones, but that might of
> course be a misunderstanding.
>
> And calling.h says this:
>
>  x86 function call convention, 64-bit:
>  -------------------------------------
>   arguments           |  callee-saved      | extra caller-saved | return
>  [callee-clobbered]   |                    | [callee-clobbered] |
>  ---------------------------------------------------------------------------
>  rdi rsi rdx rcx r8-9 | rbx rbp [*] r12-15 | r10-11             | rax, rdx [**]
>
> Note that it's indicated "function call convention", not "syscall",
> leaving it open to have extra restrictions on syscalls. Later by
> reading the POP_REGS macro called with pop_rdi=0 and skipr11rcx=1
> on syscall leave, it seems to restore everything but r11, rcx, rax
> and rdi (which is restored last with rsp).
>
> As such, could you please add in your commit message a link to the
> location where you found that authoritative information above it there
> is a better place (i.e. one that doesn't require to read all the macros)?
> This would significantly help anyone facing the same doubts about this
> in the future.

Hi Willy,

I have tried to search for documentation about this, but I couldn't
find any. I checked `Documentation/x86/entry_64.rst`, but it doesn't
say anything relevant.

Background story: I browsed the nolibc.h file and found extra clobbers
for the Linux x86-64 syscall which I think are unnecessary (r8, r9
and r10).

This finding had me worried a bit, because I have written syscalls in
inline ASM myself, and based on my understanding the required clobbers
are "rcx", "r11" and "memory". So the question in my mind was "Is my
app safe?".

So I opened a discussion on Stack Overflow, yesterday:
https://stackoverflow.com/questions/69515893/when-does-linux-x86-64-syscall-clobber-r8-r9-and-r10

While waiting for answers on Stack Overflow, I also checked "entry_64.S"
and "calling.h". Now I strongly believe syscall only clobbers 3
registers: rax, rcx and r11. The answer and comments on SO support
this as well.

These two macros do the job:

	// saves all GPRs and zeroes some of them
	PUSH_AND_CLEAR_REGS

	// restore all GPRs, except rdi, r11, rcx
	POP_REGS pop_rdi=0 skip_r11rcx=1

	// later rdi is also restored.

My stance comes from SO, a Telegram group discussion, and reading the
source code. Therefore, I don't think I can attach a link to any of
that as "authoritative information". Or can I?

When I sent this patch, I also added entry_64.S's maintainers to the CC
list, in the hope they can help by at least acknowledging it, mainly
because I can't find any documentation in Linux that covers this.

Andy, Thomas, Ingo, Borislav, H. Peter.

Could one of you shed some light so that I can attach the link to your
message in the commit message?

Thanks,

-- 
Ammar Faizi


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12  8:36   ` Ammar Faizi
@ 2021-10-12  9:06     ` Willy Tarreau
  2021-10-12 20:29       ` Borislav Petkov
  2021-10-12 21:21       ` David Laight
  0 siblings, 2 replies; 45+ messages in thread
From: Willy Tarreau @ 2021-10-12  9:06 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Tue, Oct 12, 2021 at 03:36:44PM +0700, Ammar Faizi wrote:
> I have tried to search for the documentation about this one, but I
> couldn't find any. Checking at `Documentation/x86/entry_64.rst`, but
> it doesn't tell anything relevant.
(...)

OK thanks for the detailed story, thus I didn't miss any obvious
reference.

> My stance comes from SO, Telegram group discussion, and reading source
> code. Therefore, I don't think I can attach the link to it as
> "authoritative information". Or can I?

You're right, that's not exactly what we can call authoritative :-)

> When I sent this patch, I also added entry_64.S's maintainers to CC
> list. In hope they can help to at least acknowledge it. Mainly because
> I can't find the documentation from Linux that tells about this.
> 
> Andy, Thomas, Ingo, Borislav, H. Peter.
> 
> Could one of you shed some light so that I can attach the link to your
> message in the commit message?

Let's indeed wait for any of the x86 maintainers to confirm your
analysis.

Thanks!
Willy


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12  9:06     ` Willy Tarreau
@ 2021-10-12 20:29       ` Borislav Petkov
  2021-10-12 21:51         ` Borislav Petkov
  2021-10-12 22:23         ` Ammar Faizi
  2021-10-12 21:21       ` David Laight
  1 sibling, 2 replies; 45+ messages in thread
From: Borislav Petkov @ 2021-10-12 20:29 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin

On Tue, Oct 12, 2021 at 11:06:38AM +0200, Willy Tarreau wrote:
> Let's indeed wait for any of the x86 maintainers to confirm your
> analysis.

I guess the official doc you guys are looking for is the x86-64 ABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/

and recent pdfs are here:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI

and there you scroll to

"A.2 AMD64 Linux Kernel Conventions

...

A.2.1 Calling Conventions"

and that section explains which regs go where.

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* RE: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12  9:06     ` Willy Tarreau
  2021-10-12 20:29       ` Borislav Petkov
@ 2021-10-12 21:21       ` David Laight
  2021-10-12 23:02         ` Subject: " Ammar Faizi
  1 sibling, 1 reply; 45+ messages in thread
From: David Laight @ 2021-10-12 21:21 UTC (permalink / raw)
  To: 'Willy Tarreau', Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

From: Willy Tarreau
> Sent: 12 October 2021 10:07
> 
> On Tue, Oct 12, 2021 at 03:36:44PM +0700, Ammar Faizi wrote:
> > I have tried to search for the documentation about this one, but I
> > couldn't find any. Checking at `Documentation/x86/entry_64.rst`, but
> > it doesn't tell anything relevant.
> (...)
> 
> OK thanks for the detailed story, thus I didn't miss any obvious
> reference.
> 
> > My stance comes from SO, Telegram group discussion, and reading source
> > code. Therefore, I don't think I can attach the link to it as
> > "authoritative information". Or can I?
> 
> You're right, that's not exactly what we can call authoritative :-)

Given the cost of a system call, the code benefit from telling
gcc that r8 to r10 are preserved is likely to be noise.
Especially since most syscalls are made through C library stubs,
so the application calling code will assume they are trashed anyway.
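
For illustration, a hypothetical sketch of that caller-side view
(the stub name is made up, this is not any particular libc):

```
/* Out-of-line stub, the way a libc would provide it: */
extern long sys_write_stub(int fd, const void *buf, unsigned long len);

long caller(long x)
{
        /* x may be held in r8/r9/r10 here; across this *call* the
         * compiler must assume those registers are trashed per the
         * function-call ABI, whatever the kernel itself preserves. */
        sys_write_stub(1, "hi\n", 3);
        return x;
}
```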

There may even be a bigger gain from the syscall exit code just
setting the registers to zero (instead of restoring them).

There are probably even bigger gains from zeroing the AVX
registers (which, IIRC, are all caller-saved) somewhere
between syscall entry and the process sleeping.
(This can't be done for non-syscall kernel entry.)

	David




* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12 20:29       ` Borislav Petkov
@ 2021-10-12 21:51         ` Borislav Petkov
  2021-10-12 22:23         ` Ammar Faizi
  1 sibling, 0 replies; 45+ messages in thread
From: Borislav Petkov @ 2021-10-12 21:51 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin

On Tue, Oct 12, 2021 at 10:29:55PM +0200, Borislav Petkov wrote:
> On Tue, Oct 12, 2021 at 11:06:38AM +0200, Willy Tarreau wrote:
> > Let's indeed wait for any of the x86 maintainers to confirm your
> > analysis.
> 
> I guess the official doc you guys are looking for is the x86-64 ABI:
> 
> https://gitlab.com/x86-psABIs/x86-64-ABI/
> 
> and recent pdfs are here:
> 
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI
> 
> and there you scroll to
> 
> "A.2 AMD64 Linux Kernel Conventions
> 
> ...
> 
> A.2.1 Calling Conventions"
> 
> and that section explains which regs go where.

Ok, that didn't point to the exact answer - I realize now. I believe
what you're looking for in that doc is "Figure 3.4: Register Usage"
which has a column "callee saved" and says that the syscall arg
registers are all not callee-saved.

And I think that table is valid for the kernel too because the link
to it says "The Linux AMD64 kernel uses internally the same calling
conventions as user-level applications (see section 3.2.3 for details)."

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12 20:29       ` Borislav Petkov
  2021-10-12 21:51         ` Borislav Petkov
@ 2021-10-12 22:23         ` Ammar Faizi
  2021-10-13  3:01           ` Willy Tarreau
  2021-10-13 12:43           ` Borislav Petkov
  1 sibling, 2 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-12 22:23 UTC (permalink / raw)
  To: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 4:51 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Oct 12, 2021 at 10:29:55PM +0200, Borislav Petkov wrote:
> > On Tue, Oct 12, 2021 at 11:06:38AM +0200, Willy Tarreau wrote:
> > > Let's indeed wait for any of the x86 maintainers to confirm your
> > > analysis.
> >
> > I guess the official doc you guys are looking for is the x86-64 ABI:
> >
> > https://gitlab.com/x86-psABIs/x86-64-ABI/
> >
> > and recent pdfs are here:
> >
> > https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI
> >
> > and there you scroll to
> >
> > "A.2 AMD64 Linux Kernel Conventions
> >
> > ...
> >
> > A.2.1 Calling Conventions"
> >
> > and that section explains which regs go where.
>
> Ok, that didn't point to the exact answer - I realize now. I believe
> what you're looking for in that doc is "Figure 3.4: Register Usage"
> which has a column "callee saved" and says that the syscall arg
> registers are all not callee-saved.

No, you were right. A.2.1 is the part we are looking for, thanks for
pointing that out. That's the exact answer.

"Figure 3.4: Register Usage" is not the answer; if it were, nolibc.h
would be broken, as it is missing "rdi", "rsi" and "rdx" in the clobber
list.

More info:
I know for a fact that every "syscall" in the libc is wrapped in a
function call, hence those registers are not callee-saved and are thus
assumed clobbered.

However, that is not the case for nolibc.h, because we have the
potential to inline the "syscall" instruction (0f 05) into the user
functions. Thus A.2.1 applies, and the "syscall" instruction is not a
"call".

All syscalls in nolibc.h are written as static functions with inline
ASM and are almost always inlined when we use an optimization flag, so
it is a win not to have r8, r9 and r10 in the clobber list (currently
we have them).
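
For reference, each of those wrappers is roughly of this shape (a
simplified sketch, assuming the my_syscall3() macro and __NR_write as
used by nolibc.h; the real definitions live in that file):

```
static ssize_t sys_write(int fd, const void *buf, size_t count)
{
        /* my_syscall3() expands to the inline "syscall" asm shown in
         * the patch, so with -O this wrapper is inlined into the caller. */
        return my_syscall3(__NR_write, fd, buf, count);
}
```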

------------------------------------------------------------------

According to the x86-64 ABI, section A.2 AMD64 Linux Kernel
Conventions, A.2.1 Calling Conventions [1]:

1) User-level applications use as integer registers for passing the
   sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface
   uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.

2) A system-call is done via the syscall instruction. The kernel
   destroys registers %rcx and %r11.

3) The number of the syscall has to be passed in register %rax.

4) System-calls are limited to six arguments, no argument is passed
   directly on the stack.

5) Returning from the syscall, register %rax contains the result of
   the system-call. A value in the range between -4095 and -1
   indicates an error, it is -errno.

6) Only values of class INTEGER or class MEMORY are passed to the
   kernel.

From (2), (5) and (6), we can conclude that Linux x86-64 syscall only
clobbers rax, rcx and r11 (and "memory").

  - rax for the return value.
  - rcx to save the return address.
  - r11 to save the rflags.

Other registers are preserved. Right?
Am I missing something?

>
> And I think that table is valid for the kernel too because the link
> to it says "The Linux AMD64 kernel uses internally the same calling
> conventions as user-level applications (see section 3.2.3 for details)."
>
> HTH.
>

-- 
Ammar Faizi


* Subject: RE: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12 21:21       ` David Laight
@ 2021-10-12 23:02         ` Ammar Faizi
  0 siblings, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-12 23:02 UTC (permalink / raw)
  To: David Laight
  Cc: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 4:21 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Willy Tarreau
> > Sent: 12 October 2021 10:07
> >
> > On Tue, Oct 12, 2021 at 03:36:44PM +0700, Ammar Faizi wrote:
> > > I have tried to search for the documentation about this one, but I
> > > couldn't find any. Checking at `Documentation/x86/entry_64.rst`, but
> > > it doesn't tell anything relevant.
> > (...)
> >
> > OK thanks for the detailed story, thus I didn't miss any obvious
> > reference.
> >
> > > My stance comes from SO, Telegram group discussion, and reading source
> > > code. Therefore, I don't think I can attach the link to it as
> > > "authoritative information". Or can I?
> >
> > You're right, that's not exactly what we can call authoritative :-)
>
> Given the cost of a system call the code benefit from telling
> gcc that r8 to r10 are preserved is likely to be noise.
> Especially since most syscalls are made from C library stubs
> so the application calling code will assume they are trashed.
>
> There may even be a bigger gain from the syscall exit code just
> setting the registers to zero (instead of restoring them).

Setting those registers to zero in "syscall_return_via_sysret" would
require editing entry_64.S, and that apparently breaks userspace as it
results in an ABI change.

>
> There are probably even bigger gains from zeroing the AVX
> registers (which, IIRC, are all caller-saved) somewhere
> between syscall entry and the process sleeping.
> (This can't be done for non-syscall kernel entry.)
>

I am copying and pasting my earlier message just to clarify the
misunderstanding here. We don't intend to change the ABI, so we can
only try to optimize what we can within the current situation.

I know for a fact that every "syscall" in the libc is wrapped in a
function call.

However, that is not the case for nolibc.h, because we have the
potential to inline the "syscall" instruction (0f 05) into the user
functions.

All syscalls in nolibc.h are written as static functions with inline
ASM and are almost always inlined when we use an optimization flag, so
it is a win not to have r8, r9 and r10 in the clobber list (currently
we have them).

FWIW, I created an example where this matters.

```
#include "tools/include/nolibc/nolibc.h"

#define read_abc(a, b, c) __asm__ volatile(""::"r"(a),"r"(b),"r"(c))

int main(void)
{
    int a = 0xaa;
    int b = 0xbb;
    int c = 0xcc;

    read_abc(a, b, c);
    write(1, "test\n", 5);
    read_abc(a, b, c);

    return 0;
}
```

Compile with:
    gcc -Os test.c -o test -nostdlib


With r8, r9, r10 in the clobber list, GCC generates this:

0000000000001000 <main>:
    1000:	f3 0f 1e fa          	endbr64 
    1004:	41 54                	push   %r12
    1006:	41 bc cc 00 00 00    	mov    $0xcc,%r12d
    100c:	55                   	push   %rbp
    100d:	bd bb 00 00 00       	mov    $0xbb,%ebp
    1012:	53                   	push   %rbx
    1013:	bb aa 00 00 00       	mov    $0xaa,%ebx
    1018:	b8 01 00 00 00       	mov    $0x1,%eax
    101d:	bf 01 00 00 00       	mov    $0x1,%edi
    1022:	ba 05 00 00 00       	mov    $0x5,%edx
    1027:	48 8d 35 d2 0f 00 00 	lea    0xfd2(%rip),%rsi
    102e:	0f 05                	syscall 
    1030:	31 c0                	xor    %eax,%eax
    1032:	5b                   	pop    %rbx
    1033:	5d                   	pop    %rbp
    1034:	41 5c                	pop    %r12
    1036:	c3                   	ret 

GCC thinks that syscall will clobber r8, r9, r10. So it spills 0xaa,
0xbb and 0xcc to callee saved registers (r12, rbp and rbx). This is 
clearly extra memory access and extra stack size for preserving them.

But syscall does not actually clobber them, so this is a missed
optimization.

Now, without r8, r9, r10 in the clobber list, GCC generates better ASM code:

0000000000001000 <main>:
    1000:   f3 0f 1e fa             endbr64 
    1004:   41 b8 aa 00 00 00       mov    $0xaa,%r8d
    100a:   41 b9 bb 00 00 00       mov    $0xbb,%r9d
    1010:   41 ba cc 00 00 00       mov    $0xcc,%r10d
    1016:   b8 01 00 00 00          mov    $0x1,%eax
    101b:   bf 01 00 00 00          mov    $0x1,%edi
    1020:   ba 05 00 00 00          mov    $0x5,%edx
    1025:   48 8d 35 d4 0f 00 00    lea    0xfd4(%rip),%rsi
    102c:   0f 05                   syscall 
    102e:   31 c0                   xor    %eax,%eax
    1030:   c3                      ret  

Does that make sense?

-- 
Ammar Faizi


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12 22:23         ` Ammar Faizi
@ 2021-10-13  3:01           ` Willy Tarreau
  2021-10-13  3:32             ` Ammar Faizi
  2021-10-13 12:43           ` Borislav Petkov
  1 sibling, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-13  3:01 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> > > A.2.1 Calling Conventions"
> > >
> > > and that section explains which regs go where.
> >
> > Ok, that didn't point to the exact answer - I realize now. I believe
> > what you're looking for in that doc is "Figure 3.4: Register Usage"
> > which has a column "callee saved" and says that the syscall arg
> > registers are all not callee-saved.
> 
> No, you were right. A.2.1 was the part we are looking for, thanks for
> pointed that. That's the exact answer.

Yes I agree, that's exactly it, thank you Boris!

> According to x86-64 ABI about syscall section A.2 AMD64 Linux Kernel
> Conventions, A.2.1 Calling Conventions [1]:
> 
> 1) User-level applications use as integer registers for passing the
>    sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface
>    uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
> 
> 2) A system-call is done via the syscall instruction. The kernel
>    destroys registers %rcx and %r11.
> 
> 3) The number of the syscall has to be passed in register %rax.
> 
> 4) System-calls are limited to six arguments, no argument is passed
>    directly on the stack.
> 
> 5) Returning from the syscall, register %rax contains the result of
>    the system-call. A value in the range between -4095 and -1
>    indicates an error, it is -errno.
> 
> 6) Only values of class INTEGER or class MEMORY are passed to the
>    kernel.
> 
> From (2), (5) and (6), we can conclude that Linux x86-64 syscall only
> clobbers rax, rcx and r11 (and "memory").
> 
>   - rax for the return value.
>   - rcx to save the return address.
>   - r11 to save the rflags.
> 
> Other registers are preserved. Right?

I totally agree, and this doc is perfectly clear on this. I think it
would be worth updating the comments in calling.h to reference this
document and remind these rules, given that they're not trivial to
figure from the code itself.

Willy


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13  3:01           ` Willy Tarreau
@ 2021-10-13  3:32             ` Ammar Faizi
  2021-10-13  3:34               ` Ammar Faizi
  0 siblings, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-13  3:32 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 10:01 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> > According to x86-64 ABI about syscall section A.2 AMD64 Linux Kernel
> > Conventions, A.2.1 Calling Conventions [1]:
> >
> > 1) User-level applications use as integer registers for passing the
> >    sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface
> >    uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
> >
> > 2) A system-call is done via the syscall instruction. The kernel
> >    destroys registers %rcx and %r11.
> >
> > 3) The number of the syscall has to be passed in register %rax.
> >
> > 4) System-calls are limited to six arguments, no argument is passed
> >    directly on the stack.
> >
> > 5) Returning from the syscall, register %rax contains the result of
> >    the system-call. A value in the range between -4095 and -1
> >    indicates an error, it is -errno.
> >
> > 6) Only values of class INTEGER or class MEMORY are passed to the
> >    kernel.
> >
> > From (2), (5) and (6), we can conclude that Linux x86-64 syscall only
> > clobbers rax, rcx and r11 (and "memory").
> >
> >   - rax for the return value.
> >   - rcx to save the return address.
> >   - r11 to save the rflags.
> >
> > Other registers are preserved. Right?
>
> I totally agree, and this doc is perfectly clear on this. I think it
> would be worth updating the comments in calling.h to reference this
> document and remind these rules, given that they're not trivial to
> figure from the code itself.
>

Alright, I will wire up patch v2 for this :-)

-- 
Ammar Faizi


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13  3:32             ` Ammar Faizi
@ 2021-10-13  3:34               ` Ammar Faizi
  2021-10-13  3:37                 ` Ammar Faizi
  0 siblings, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-13  3:34 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 10:32 AM Ammar Faizi
<ammar.faizi@students.amikom.ac.id> wrote:
>
> On Wed, Oct 13, 2021 at 10:01 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> > > According to x86-64 ABI about syscall section A.2 AMD64 Linux Kernel
> > > Conventions, A.2.1 Calling Conventions [1]:
> > >
> > > 1) User-level applications use as integer registers for passing the
> > >    sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface
> > >    uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
> > >
> > > 2) A system-call is done via the syscall instruction. The kernel
> > >    destroys registers %rcx and %r11.
> > >
> > > 3) The number of the syscall has to be passed in register %rax.
> > >
> > > 4) System-calls are limited to six arguments, no argument is passed
> > >    directly on the stack.
> > >
> > > 5) Returning from the syscall, register %rax contains the result of
> > >    the system-call. A value in the range between -4095 and -1
> > >    indicates an error, it is -errno.
> > >
> > > 6) Only values of class INTEGER or class MEMORY are passed to the
> > >    kernel.
> > >
> > > From (2), (5) and (6), we can conclude that Linux x86-64 syscall only
> > > clobbers rax, rcx and r11 (and "memory").
> > >
> > >   - rax for the return value.
> > >   - rcx to save the return address.
> > >   - r11 to save the rflags.
> > >
> > > Other registers are preserved. Right?
> >
> > I totally agree, and this doc is perfectly clear on this. I think it
> > would be worth updating the comments in calling.h to reference this
> > document and remind these rules, given that they're not trivial to
> > figure from the code itself.
> >
>
> Alright, I will wire up patch v2 for this :-)
>

Not for calling.h, nolibc.h I mean.


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13  3:34               ` Ammar Faizi
@ 2021-10-13  3:37                 ` Ammar Faizi
  0 siblings, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-13  3:37 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 10:34 AM Ammar Faizi
<ammar.faizi@students.amikom.ac.id> wrote:
>
> On Wed, Oct 13, 2021 at 10:32 AM Ammar Faizi
> <ammar.faizi@students.amikom.ac.id> wrote:
> >
> > On Wed, Oct 13, 2021 at 10:01 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> > > > According to x86-64 ABI about syscall section A.2 AMD64 Linux Kernel
> > > > Conventions, A.2.1 Calling Conventions [1]:
> > > >
> > > > 1) User-level applications use as integer registers for passing the
> > > >    sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface
> > > >    uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
> > > >
> > > > 2) A system-call is done via the syscall instruction. The kernel
> > > >    destroys registers %rcx and %r11.
> > > >
> > > > 3) The number of the syscall has to be passed in register %rax.
> > > >
> > > > 4) System-calls are limited to six arguments, no argument is passed
> > > >    directly on the stack.
> > > >
> > > > 5) Returning from the syscall, register %rax contains the result of
> > > >    the system-call. A value in the range between -4095 and -1
> > > >    indicates an error, it is -errno.
> > > >
> > > > 6) Only values of class INTEGER or class MEMORY are passed to the
> > > >    kernel.
> > > >
> > > > From (2), (5) and (6), we can conclude that Linux x86-64 syscall only
> > > > clobbers rax, rcx and r11 (and "memory").
> > > >
> > > >   - rax for the return value.
> > > >   - rcx to save the return address.
> > > >   - r11 to save the rflags.
> > > >
> > > > Other registers are preserved. Right?
> > >
> > > I totally agree, and this doc is perfectly clear on this. I think it
> > > would be worth updating the comments in calling.h to reference this
> > > document and remind these rules, given that they're not trivial to
> > > figure from the code itself.
> > >
> >
> > Alright, I will wire up patch v2 for this :-)
> >
>
> Not for calling.h, nolibc.h I mean.

If Borislav or others agree with my reply above, would it be fine for
someone to do that for calling.h (or maybe better, in entry_64.S)?

-- 
Ammar Faizi


* [PATCH v2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-11  4:03 [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
  2021-10-12  5:28 ` Willy Tarreau
@ 2021-10-13  9:03 ` Ammar Faizi
  2021-10-15  8:25   ` [PATCH 0/2] Fix clobber list and startup code bug Ammar Faizi
  1 sibling, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-13  9:03 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: David Laight, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Ammar Faizi

Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").

  - rax for the return value.
  - rcx to save the return address.
  - r11 to save the rflags.

Other registers are preserved.

Having r8, r9 and r10 in the syscall clobber list is harmless, but this
results in a missed optimization.

As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
their values after the syscall returns to userspace. But since they are
in the clobber list, GCC will always miss this opportunity.

Remove them from the x86-64 syscall clobber list to help GCC generate
better code and fix the comment.

See also the x86-64 ABI, section A.2 AMD64 Linux Kernel Conventions,
A.2.1 Calling Conventions [1].

Extra note:
Some people may think it does not really give a benefit to remove r8,
r9 and r10 from the syscall clobber list because they view a syscall as
a C function call, and a function call always clobbers those 3
registers.

However, that is not the case for nolibc.h, because we have the
potential to inline the "syscall" instruction (whose opcode is "0f 05")
into the user functions.

All syscalls in nolibc.h are written as static functions with inline
ASM and are almost always inlined when we use an optimization flag, so
it is a win not to have r8, r9 and r10 in the clobber list.

Here is the example where this matters.

Consider the following C code:
```
  #include "tools/include/nolibc/nolibc.h"
  #define read_abc(a, b, c) __asm__ volatile("nop"::"r"(a),"r"(b),"r"(c))

  int main(void)
  {
  	int a = 0xaa;
  	int b = 0xbb;
  	int c = 0xcc;

  	read_abc(a, b, c);
  	write(1, "test\n", 5);
  	read_abc(a, b, c);

  	return 0;
  }
```

Compile with:
    gcc -Os test.c -o test -nostdlib

With r8, r9, r10 in the clobber list, GCC generates this:

0000000000001000 <main>:
    1000:	f3 0f 1e fa          	endbr64
    1004:	41 54                	push   %r12
    1006:	41 bc cc 00 00 00    	mov    $0xcc,%r12d
    100c:	55                   	push   %rbp
    100d:	bd bb 00 00 00       	mov    $0xbb,%ebp
    1012:	53                   	push   %rbx
    1013:	bb aa 00 00 00       	mov    $0xaa,%ebx
    1018:	90                   	nop
    1019:	b8 01 00 00 00       	mov    $0x1,%eax
    101e:	bf 01 00 00 00       	mov    $0x1,%edi
    1023:	ba 05 00 00 00       	mov    $0x5,%edx
    1028:	48 8d 35 d1 0f 00 00 	lea    0xfd1(%rip),%rsi
    102f:	0f 05                	syscall
    1031:	90                   	nop
    1032:	31 c0                	xor    %eax,%eax
    1034:	5b                   	pop    %rbx
    1035:	5d                   	pop    %rbp
    1036:	41 5c                	pop    %r12
    1038:	c3                   	ret

GCC thinks that syscall will clobber r8, r9, r10. So it spills 0xaa,
0xbb and 0xcc to callee saved registers (r12, rbp and rbx). This is
clearly extra memory access and extra stack size for preserving them.

But syscall does not actually clobber them, so this is a missed
optimization.

Now without r8, r9, r10 in the clobber list, GCC generates better code:

0000000000001000 <main>:
    1000:	f3 0f 1e fa          	endbr64
    1004:	41 b8 aa 00 00 00    	mov    $0xaa,%r8d
    100a:	41 b9 bb 00 00 00    	mov    $0xbb,%r9d
    1010:	41 ba cc 00 00 00    	mov    $0xcc,%r10d
    1016:	90                   	nop
    1017:	b8 01 00 00 00       	mov    $0x1,%eax
    101c:	bf 01 00 00 00       	mov    $0x1,%edi
    1021:	ba 05 00 00 00       	mov    $0x5,%edx
    1026:	48 8d 35 d3 0f 00 00 	lea    0xfd3(%rip),%rsi
    102d:	0f 05                	syscall
    102f:	90                   	nop
    1030:	31 c0                	xor    %eax,%eax
    1032:	c3                   	ret

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Link: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI [1]
---

v2: Add more detailed information and link to x86-64 ABI.

Link to v1: https://lore.kernel.org/lkml/20211011040344.437264-1-ammar.faizi@students.amikom.ac.id/

 tools/include/nolibc/nolibc.h | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 3430667b0d24..1483d95c8330 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -265,12 +265,17 @@ struct stat {
  *   - arguments are in rdi, rsi, rdx, r10, r8, r9 respectively
  *   - the system call is performed by calling the syscall instruction
  *   - syscall return comes in rax
- *   - rcx and r8..r11 may be clobbered, others are preserved.
+ *   - rcx and r11 are clobbered, others are preserved.
  *   - the arguments are cast to long and assigned into the target registers
  *     which are then simply passed as registers to the asm code, so that we
  *     don't have to experience issues with register constraints.
  *   - the syscall number is always specified last in order to allow to force
  *     some registers before (gcc refuses a %-register at the last position).
+ *   - see also x86-64 ABI section A.2 AMD64 Linux Kernel Conventions, A.2.1
+ *     Calling Conventions.
+ *
+ * Link x86-64 ABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI
+ *
  */
 
 #define my_syscall0(num)                                                      \
@@ -280,9 +285,9 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -295,10 +300,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1),                                                 \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -312,10 +317,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2),                                     \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -330,10 +335,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3),                         \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -349,10 +354,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4)                                    \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4),             \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r11", "memory", "cc"                    \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -369,10 +374,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "0"(_num)                                                   \
-		: "rcx", "r9", "r11", "memory", "cc"                          \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -390,7 +395,7 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "r"(_arg6), "0"(_num)                                       \
 		: "rcx", "r11", "memory", "cc"                                \
-- 
2.27.0



* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-12 22:23         ` Ammar Faizi
  2021-10-13  3:01           ` Willy Tarreau
@ 2021-10-13 12:43           ` Borislav Petkov
  2021-10-13 12:51             ` Willy Tarreau
  2021-10-23 20:40             ` H. Peter Anvin
  1 sibling, 2 replies; 45+ messages in thread
From: Borislav Petkov @ 2021-10-13 12:43 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> "Figure 3.4: Register Usage" is not the answer, if it were, nolibc.h
> would be broken as it is missing "rdi", "rsi", "rdx" in the clobber list.

It is not about what happens in practice but what the contract is:
syscall argument registers can potentially get clobbered and userspace
should treat them as such. Because if the kernel decides to actually
clobber them for whatever reason and some userspace thing thinks
otherwise, then it is the userspace thing's problem as it doesn't adhere
to the well known ABI.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 12:43           ` Borislav Petkov
@ 2021-10-13 12:51             ` Willy Tarreau
  2021-10-13 13:06               ` Borislav Petkov
  2021-10-23 20:40             ` H. Peter Anvin
  1 sibling, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-13 12:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin

Hi Boris,

On Wed, Oct 13, 2021 at 02:43:25PM +0200, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
> > "Figure 3.4: Register Usage" is not the answer, if it were, nolibc.h
> > would be broken as it is missing "rdi", "rsi", "rdx" in the clobber list.
> 
> It is not about what happens in practice but what the contract is:
> syscall argument registers can potentially get clobbered and userspace
> should treat them as such. Because if the kernel decides to actually
> clobber them for whatever reason and some userspace thing thinks
> otherwise, then it is the userspace thing's problem as it doesn't adhere
> to the well known ABI.

I agree, and that's why my question was about that authoritative doc, to
know the contract (since this one will not change under our feet). But
according to the doc you pointed to, the contract for syscalls here is
that only rcx and r11 are clobbered (and rax gets the result). If so,
I'm personally fine with this.

Thanks,
Willy


* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 12:51             ` Willy Tarreau
@ 2021-10-13 13:06               ` Borislav Petkov
  2021-10-13 14:07                 ` Willy Tarreau
  0 siblings, 1 reply; 45+ messages in thread
From: Borislav Petkov @ 2021-10-13 13:06 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, Michael Matz

On Wed, Oct 13, 2021 at 02:51:42PM +0200, Willy Tarreau wrote:
> > > "Figure 3.4: Register Usage" is not the answer, if it were, nolibc.h
> > > would be broken as it is missing "rdi", "rsi", "rdx" in the clobber list.
> > 
> > It is not about what happens in practice but what the contract is:
> > syscall argument registers can potentially get clobbered and userspace
> > should treat them as such. Because if the kernel decides to actually
> > clobber them for whatever reason and some userspace thing thinks
> > otherwise, then it is the userspace thing's problem as it doesn't adhere
> > to the well known ABI.
> 
> I agree and that's why my question was about that authoritative doc, to
> know the contract (since this one will not change under our feet). But
> according to the doc you pointed, here the contract for syscalls is that
> only rcx and r11 are clobbered (and rax gets the result). If so, I'm
> personally fine with this.

Well, I guess that doc could probably state explicitly that argument
registers are callee-clobbered.

The way I'm reading that doc is, as already pasted:

"The Linux AMD64 kernel uses internally the same calling conventions as
user-level applications (see section 3.2.3 for details).

...

The interface between the C library and the Linux kernel is the same as
for the user-level applications with the following differences:

The kernel interface uses %rdi , %rsi , %rdx , %r10 , %r8 and %r9."

So, calling conventions in the kernel are the same as for user-level
apps with the register difference. And argument registers are
callee-clobbered.

Now, when you look at section 3.2.3 Parameter Passing and scroll to
Figure 3.4: Register Usage, it'll give you the info that those argument
registers are not callee-saved.

In practice they appear to be preserved, but that's not in the contract.
Which means that they can *potentially* get clobbered. The stress being
on "potentially".

And just to make sure I'm reading this correctly, I've asked one of the
authors of that document (CCed) and he confirmed what I'm stating above.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 13:06               ` Borislav Petkov
@ 2021-10-13 14:07                 ` Willy Tarreau
  2021-10-13 14:20                   ` Borislav Petkov
  0 siblings, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-13 14:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, Michael Matz

On Wed, Oct 13, 2021 at 03:06:23PM +0200, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 02:51:42PM +0200, Willy Tarreau wrote:
> > > > "Figure 3.4: Register Usage" is not the answer, if it were, nolibc.h
> > > > would be broken as it is missing "rdi", "rsi", "rdx" in the clobber list.
> > > 
> > > It is not about what happens in practice but what the contract is:
> > > syscall argument registers can potentially get clobbered and userspace
> > > should treat them as such. Because if the kernel decides to actually
> > > clobber them for whatever reason and some userspace thing thinks
> > > otherwise, then it is the userspace thing's problem as it doesn't adhere
> > > to the well known ABI.
> > 
> > I agree and that's why my question was about that authoritative doc, to
> > know the contract (since this one will not change under our feet). But
> > according to the doc you pointed, here the contract for syscalls is that
> > only rcx and r11 are clobbered (and rax gets the result). If so, I'm
> > personally fine with this.
> 
> Well, I guess that doc could probably state explicitly that argument
> registers are callee-clobbered.
> 
> The way I'm reading that doc is, as already pasted:
> 
> "The Linux AMD64 kernel uses internally the same calling conventions as
> user-level applications (see section 3.2.3 for details).
> 
> ...
> 
> The interface between the C library and the Linux kernel is the same as
> for the user-level applications with the following differences:
> 
> The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9."
> 
> So, calling conventions in the kernel are the same as for user-level
> apps with the register difference. And argument registers are
> callee-clobbered.
> 
> Now, when you look at section 3.2.3 Parameter Passing and scroll to
> Figure 3.4: Register Usage, it'll give you the info that those argument
> registers are not callee-saved.
> 
> They appear to be but that's not in the contract. Which means that they
> can *potentially* get clobbered. The stress being on "potentially".

Yes I agree with the "potentially" here. If it can potentially be (i.e.
the kernel is allowed by contract to later change the way it's currently
done) then we have to save them even if it means lower code efficiency.

If, however, the kernel performs such savings on purpose because it is
willing to observe a stricter saving than the AMD64 ABI, we can follow
it but only once it's written down somewhere that it is by contract and
will not change.

> And just to make sure I'm reading this correctly, I've asked one of the
> authors of that document (CCed) and he confirmed what I'm stating above.

Thank you!

Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 14:07                 ` Willy Tarreau
@ 2021-10-13 14:20                   ` Borislav Petkov
  2021-10-13 14:24                     ` Willy Tarreau
  0 siblings, 1 reply; 45+ messages in thread
From: Borislav Petkov @ 2021-10-13 14:20 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, Michael Matz

On Wed, Oct 13, 2021 at 04:07:23PM +0200, Willy Tarreau wrote:
> Yes I agree with the "potentially" here. If it can potentially be (i.e.
> the kernel is allowed by contract to later change the way it's currently
> done) then we have to save them even if it means lower code efficiency.
>
> If, however, the kernel performs such savings on purpose because it is
> willing to observe a stricter saving than the AMD64 ABI, we can follow
> it but only once it's written down somewhere that it is by contract and
> will not change.

Right, and Micha noted that such a change to the document can be done.
And we're basically doing that register restoring anyway, in POP_REGS.

I'm not the least bit convinced it is worth enforcing that stricter
register saving, though.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 14:20                   ` Borislav Petkov
@ 2021-10-13 14:24                     ` Willy Tarreau
  2021-10-13 16:24                       ` Michael Matz
  0 siblings, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-13 14:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, Michael Matz

On Wed, Oct 13, 2021 at 04:20:55PM +0200, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 04:07:23PM +0200, Willy Tarreau wrote:
> > Yes I agree with the "potentially" here. If it can potentially be (i.e.
> > the kernel is allowed by contract to later change the way it's currently
> > done) then we have to save them even if it means lower code efficiency.
> >
> > If, however, the kernel performs such savings on purpose because it is
> > willing to observe a stricter saving than the AMD64 ABI, we can follow
> > it but only once it's written down somewhere that it is by contract and
> > will not change.
> 
> Right, and Micha noted that such a change to the document can be done.

great.

> And we're basically doing that registers restoring anyway, in POP_REGS.

That's what I based my analysis on when I wanted to verify Ammar's
finding. I would tend to think that if we're burning cycles popping
plenty of registers it's probably for a reason, maybe at least a good
one, which is that it's the only way to make sure we're not leaking
internal kernel data! This is not a concern for kernel->kernel nor
user->user calls but for user->kernel calls it definitely is one, and
I don't think we could relax that series of pop without causing leaks
anyway.

Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 14:24                     ` Willy Tarreau
@ 2021-10-13 16:24                       ` Michael Matz
  2021-10-13 16:30                         ` Willy Tarreau
  2021-10-19  9:06                         ` David Laight
  0 siblings, 2 replies; 45+ messages in thread
From: Michael Matz @ 2021-10-13 16:24 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Borislav Petkov, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

Hello,

On Wed, 13 Oct 2021, Willy Tarreau wrote:

> On Wed, Oct 13, 2021 at 04:20:55PM +0200, Borislav Petkov wrote:
> > On Wed, Oct 13, 2021 at 04:07:23PM +0200, Willy Tarreau wrote:
> > > Yes I agree with the "potentially" here. If it can potentially be (i.e.
> > > the kernel is allowed by contract to later change the way it's currently
> > > done) then we have to save them even if it means lower code efficiency.
> > >
> > > If, however, the kernel performs such savings on purpose because it is
> > > willing to observe a stricter saving than the AMD64 ABI, we can follow
> > > it but only once it's written down somewhere that it is by contract and
> > > will not change.
> > 
> > Right, and Micha noted that such a change to the document can be done.
> 
> great.
> 
> > And we're basically doing that registers restoring anyway, in POP_REGS.
> 
> That's what I based my analysis on when I wanted to verify Ammar's
> finding. I would tend to think that if we're burning cycles popping
> plenty of registers it's probably for a reason, maybe at least a good
> one, which is that it's the only way to make sure we're not leaking
> internal kernel data! This is not a concern for kernel->kernel nor
> user->user calls but for user->kernel calls it definitely is one, and
> I don't think we could relax that series of pop without causing leaks
> anyway.

It might also be interesting to know that, while the wording of the psABI
was indeed intended to imply that all argument registers are potentially
clobbered (like with normal calls), glibc's inline assembler for making
syscalls relies on most registers actually being preserved:

# define REGISTERS_CLOBBERED_BY_SYSCALL "cc", "r11", "cx"
...
#define internal_syscall6(number, arg1, arg2, arg3, arg4, arg5, arg6) \
({                                                                      \
    unsigned long int resultvar;                                        \
    TYPEFY (arg6, __arg6) = ARGIFY (arg6);                              \
    TYPEFY (arg5, __arg5) = ARGIFY (arg5);                              \
    TYPEFY (arg4, __arg4) = ARGIFY (arg4);                              \
    TYPEFY (arg3, __arg3) = ARGIFY (arg3);                              \
    TYPEFY (arg2, __arg2) = ARGIFY (arg2);                              \
    TYPEFY (arg1, __arg1) = ARGIFY (arg1);                              \
    register TYPEFY (arg6, _a6) asm ("r9") = __arg6;                    \
    register TYPEFY (arg5, _a5) asm ("r8") = __arg5;                    \
    register TYPEFY (arg4, _a4) asm ("r10") = __arg4;                   \
    register TYPEFY (arg3, _a3) asm ("rdx") = __arg3;                   \
    register TYPEFY (arg2, _a2) asm ("rsi") = __arg2;                   \
    register TYPEFY (arg1, _a1) asm ("rdi") = __arg1;                   \
    asm volatile (                                                      \
    "syscall\n\t"                                                       \
    : "=a" (resultvar)                                                  \
    : "0" (number), "r" (_a1), "r" (_a2), "r" (_a3), "r" (_a4),         \
      "r" (_a5), "r" (_a6)                                              \
    : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);                        \
    (long int) resultvar;                                               \
})


Note in particular the missing clobbers or outputs of any of the argument 
regs.

So, even though the psABI (might have) meant something else, as glibc is
doing the above we in fact have a de-facto standard that the kernel can't
clobber any of the argument regs.  The wording and the Linux x86-64
syscall implementation (and its use in glibc) all come from the same time
in 2001, so there never was a time when the kernel was not
saving/restoring the arg registers, so it can't stop now.

In effect this means the psABI should be clarified to explicitly say that
the arg registers aren't clobbered, i.e. that the mentioned list of
clobbered regs isn't inclusive but exclusive.  I will do that.

When I was discussing this with Boris earlier I hadn't yet looked at glibc 
use but only gave my interpretation from memory and reading.  Obviously 
reality trumps anything like that :-)

In short: Ammar's initial claim:

> Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> 
>   - rax for the return value.
>   - rcx to save the return address.
>   - r11 to save the rflags.
> 
> Other registers are preserved.

is accurate and I will clarify the psABI to make that explicit.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:24                       ` Michael Matz
@ 2021-10-13 16:30                         ` Willy Tarreau
  2021-10-13 16:51                           ` Andy Lutomirski
  2021-10-13 16:52                           ` Borislav Petkov
  2021-10-19  9:06                         ` David Laight
  1 sibling, 2 replies; 45+ messages in thread
From: Willy Tarreau @ 2021-10-13 16:30 UTC (permalink / raw)
  To: Michael Matz
  Cc: Borislav Petkov, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

Hello Michael,

On Wed, Oct 13, 2021 at 04:24:28PM +0000, Michael Matz wrote:
(...)
> In short:  Ammars initial claim:
> 
> > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> > 
> >   - rax for the return value.
> >   - rcx to save the return address.
> >   - r11 to save the rflags.
> > 
> > Other registers are preserved.
> 
> is accurate and I will clarify the psABI to make that explicit.

Many thanks for this very detailed explanation! Ammar, I'll take your
patch.

Thanks all,
Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:30                         ` Willy Tarreau
@ 2021-10-13 16:51                           ` Andy Lutomirski
  2021-10-13 16:52                           ` Borislav Petkov
  1 sibling, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2021-10-13 16:51 UTC (permalink / raw)
  To: Willy Tarreau, Michael Matz
  Cc: Borislav Petkov, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Thomas Gleixner,
	Ingo Molnar, the arch/x86 maintainers, H. Peter Anvin



On Wed, Oct 13, 2021, at 9:30 AM, Willy Tarreau wrote:
> Hello Michael,
>
> On Wed, Oct 13, 2021 at 04:24:28PM +0000, Michael Matz wrote:
> (...)
>> In short:  Ammars initial claim:
>> 
>> > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
>> > 
>> >   - rax for the return value.
>> >   - rcx to save the return address.
>> >   - r11 to save the rflags.
>> > 
>> > Other registers are preserved.
>> 
>> is accurate and I will clarify the psABI to make that explicit.
>
> Many thanks for this very detailed explanation! Ammar, I'll take your
> patch.

Acked-by: Andy Lutomirski <luto@kernel.org>

>
> Thanks all,
> Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:30                         ` Willy Tarreau
  2021-10-13 16:51                           ` Andy Lutomirski
@ 2021-10-13 16:52                           ` Borislav Petkov
  2021-10-14  8:44                             ` Ammar Faizi
  2021-10-14 12:44                             ` Michael Matz
  1 sibling, 2 replies; 45+ messages in thread
From: Borislav Petkov @ 2021-10-13 16:52 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Michael Matz, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

On Wed, Oct 13, 2021 at 06:30:23PM +0200, Willy Tarreau wrote:
> Hello Michael,
> 
> On Wed, Oct 13, 2021 at 04:24:28PM +0000, Michael Matz wrote:
> (...)
> > In short:  Ammars initial claim:
> > 
> > > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> > > 
> > >   - rax for the return value.
> > >   - rcx to save the return address.
> > >   - r11 to save the rflags.
> > > 
> > > Other registers are preserved.
> > 
> > is accurate and I will clarify the psABI to make that explicit.
> 
> Many thanks for this very detailed explanation! Ammar, I'll take your
> patch.

Great, why are we dealing with some funky document when the law is in
glibc sources?!

:-)))

Ammar, if you wanna fixup the comment in entry_64.S too - make sure you
explain that glibc expects argument registers to be restored - I'll take
that patch too.

:-)

Thx, that was fun.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:52                           ` Borislav Petkov
@ 2021-10-14  8:44                             ` Ammar Faizi
  2021-10-14 12:44                             ` Michael Matz
  1 sibling, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-14  8:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ammar Faizi, Michael Matz, Willy Tarreau, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Linux Kernel Mailing List,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin

On Wed, Oct 13, 2021 at 11:52 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Oct 13, 2021 at 06:30:23PM +0200, Willy Tarreau wrote:
> > Hello Michael,
> >
> > On Wed, Oct 13, 2021 at 04:24:28PM +0000, Michael Matz wrote:
> > (...)
> > > In short:  Ammars initial claim:
> > >
> > > > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> > > >
> > > >   - rax for the return value.
> > > >   - rcx to save the return address.
> > > >   - r11 to save the rflags.
> > > >
> > > > Other registers are preserved.
> > >
> > > is accurate and I will clarify the psABI to make that explicit.
> >
> > Many thanks for this very detailed explanation! Ammar, I'll take your
> > patch.

Thanks all.

>
> Great, why are we dealing with some funky document when the law is in
> glibc sources?!
>
> :-)))
>
> Ammar, if you wanna fixup the comment in entry_64.S too - make sure you
> explain that glibc expects argument registers to be restored - I'll take
> that patch too.
>
> :-)

Yay! I will send a patch for it this week or next.

>
> Thx, that was fun.
>

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:52                           ` Borislav Petkov
  2021-10-14  8:44                             ` Ammar Faizi
@ 2021-10-14 12:44                             ` Michael Matz
  2021-10-14 14:31                               ` Borislav Petkov
  1 sibling, 1 reply; 45+ messages in thread
From: Michael Matz @ 2021-10-14 12:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Willy Tarreau, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

Hello,

On Wed, 13 Oct 2021, Borislav Petkov wrote:

> > > In short:  Ammars initial claim:
> > > 
> > > > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> > > > 
> > > >   - rax for the return value.
> > > >   - rcx to save the return address.
> > > >   - r11 to save the rflags.
> > > > 
> > > > Other registers are preserved.
> > > 
> > > is accurate and I will clarify the psABI to make that explicit.
> > 
> > Many thanks for this very detailed explanation! Ammar, I'll take your
> > patch.
> 
> Great, why are we dealing with some funky document when the law is in
> glibc sources?!

In theory, theory and practice are the same; in practice, they are not.

Usually it's good to resolve a conflict towards what the document says, or 
intended to say.  But glibc of course provides a huge amount of pressure 
to resolve toward it ;-)

(laws are also changed toward practice when the latter overtakes :) )


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-14 12:44                             ` Michael Matz
@ 2021-10-14 14:31                               ` Borislav Petkov
  0 siblings, 0 replies; 45+ messages in thread
From: Borislav Petkov @ 2021-10-14 14:31 UTC (permalink / raw)
  To: Michael Matz
  Cc: Willy Tarreau, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

On Thu, Oct 14, 2021 at 12:44:27PM +0000, Michael Matz wrote:
> Usually it's good to resolve a conflict towards what the document says, or 
> intended to say.  But glibc of course provides a huge amount of pressure 
> to resolve toward it ;-)

I call that "wagging the dog" or in this case "wagging the doc". :-P

> (laws are also changed toward practice when the latter overtakes :) )

Hohumm.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 0/2] Fix clobber list and startup code bug
  2021-10-13  9:03 ` [PATCH v2] " Ammar Faizi
@ 2021-10-15  8:25   ` Ammar Faizi
  2021-10-15  8:25     ` [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
  2021-10-15  8:25     ` [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
  0 siblings, 2 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-15  8:25 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

This is a patchset; there are 2 patches in this series.

[PATCH 1/2] is a resend of the last patch I sent, with the Acked-by tag
from Andy appended.

[PATCH 2/2] is based on [PATCH 1/2]; it's a new patch to fix a startup
code bug.

The detailed story is in the commit message. Thanks to Peter, who
reported the bug; it is fixed in [PATCH 2/2].

Please review!

Ref: https://lore.kernel.org/lkml/20211011040344.437264-1-ammar.faizi@students.amikom.ac.id/
----------------------------------------------------------------
Ammar Faizi (2):
      tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
      tools/nolibc: x86-64: Fix startup code bug

 tools/include/nolibc/nolibc.h | 94 +++++++++++++++++++++++----------
 1 file changed, 66 insertions(+), 28 deletions(-)

-- 
Ammar Faizi



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-15  8:25   ` [PATCH 0/2] Fix clobber list and startup code bug Ammar Faizi
@ 2021-10-15  8:25     ` Ammar Faizi
  2021-10-18  5:52       ` Willy Tarreau
  2021-10-15  8:25     ` [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
  1 sibling, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-15  8:25 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal, Borislav Petkov

Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").

  - rax for the return value.
  - rcx to save the return address.
  - r11 to save the rflags.

Other registers are preserved.

Having r8, r9 and r10 in the syscall clobber list is harmless, but this
results in a missed-optimization.

As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
their value after the syscall returns to userspace. But since they are
in the clobber list, GCC will always miss this opportunity.

Remove them from the x86-64 syscall clobber list to help GCC generate
better code and fix the comment.

See also the x86-64 ABI, section A.2 AMD64 Linux Kernel Conventions,
A.2.1 Calling Conventions [1].

Extra note:
Some people may think there is no real benefit in removing r8, r9 and
r10 from the syscall clobber list, because a syscall looks like a C
function call, and a function call always clobbers those 3 registers.

However, that is not the case for nolibc.h, because we can potentially
inline the "syscall" instruction (whose opcode is "0f 05") into the
user functions.

All syscalls in nolibc.h are written as static functions with inline
ASM and are very likely always inlined when an optimization flag is
used, so it is a win not to have r8, r9 and r10 in the clobber list.

Here is the example where this matters.

Consider the following C code:
```
  #include "tools/include/nolibc/nolibc.h"
  #define read_abc(a, b, c) __asm__ volatile("nop"::"r"(a),"r"(b),"r"(c))

  int main(void)
  {
  	int a = 0xaa;
  	int b = 0xbb;
  	int c = 0xcc;

  	read_abc(a, b, c);
  	write(1, "test\n", 5);
  	read_abc(a, b, c);

  	return 0;
  }
```

Compile with:
    gcc -Os test.c -o test -nostdlib

With r8, r9, r10 in the clobber list, GCC generates this:

0000000000001000 <main>:
    1000:	f3 0f 1e fa          	endbr64
    1004:	41 54                	push   %r12
    1006:	41 bc cc 00 00 00    	mov    $0xcc,%r12d
    100c:	55                   	push   %rbp
    100d:	bd bb 00 00 00       	mov    $0xbb,%ebp
    1012:	53                   	push   %rbx
    1013:	bb aa 00 00 00       	mov    $0xaa,%ebx
    1018:	90                   	nop
    1019:	b8 01 00 00 00       	mov    $0x1,%eax
    101e:	bf 01 00 00 00       	mov    $0x1,%edi
    1023:	ba 05 00 00 00       	mov    $0x5,%edx
    1028:	48 8d 35 d1 0f 00 00 	lea    0xfd1(%rip),%rsi
    102f:	0f 05                	syscall
    1031:	90                   	nop
    1032:	31 c0                	xor    %eax,%eax
    1034:	5b                   	pop    %rbx
    1035:	5d                   	pop    %rbp
    1036:	41 5c                	pop    %r12
    1038:	c3                   	ret

GCC thinks that the syscall will clobber r8, r9 and r10, so it keeps
0xaa, 0xbb and 0xcc in callee-saved registers (r12, rbp and rbx)
instead. This clearly costs extra memory accesses and extra stack space
to preserve them.

But the syscall does not actually clobber them, so this is a missed
optimization.

Now without r8, r9, r10 in the clobber list, GCC generates better code:

0000000000001000 <main>:
    1000:	f3 0f 1e fa          	endbr64
    1004:	41 b8 aa 00 00 00    	mov    $0xaa,%r8d
    100a:	41 b9 bb 00 00 00    	mov    $0xbb,%r9d
    1010:	41 ba cc 00 00 00    	mov    $0xcc,%r10d
    1016:	90                   	nop
    1017:	b8 01 00 00 00       	mov    $0x1,%eax
    101c:	bf 01 00 00 00       	mov    $0x1,%edi
    1021:	ba 05 00 00 00       	mov    $0x5,%edx
    1026:	48 8d 35 d3 0f 00 00 	lea    0xfd3(%rip),%rsi
    102d:	0f 05                	syscall
    102f:	90                   	nop
    1030:	31 c0                	xor    %eax,%eax
    1032:	c3                   	ret

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Link: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI [1]
---
 tools/include/nolibc/nolibc.h | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 3430667b0d24..1483d95c8330 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -265,12 +265,17 @@ struct stat {
  *   - arguments are in rdi, rsi, rdx, r10, r8, r9 respectively
  *   - the system call is performed by calling the syscall instruction
  *   - syscall return comes in rax
- *   - rcx and r8..r11 may be clobbered, others are preserved.
+ *   - rcx and r11 are clobbered, others are preserved.
  *   - the arguments are cast to long and assigned into the target registers
  *     which are then simply passed as registers to the asm code, so that we
  *     don't have to experience issues with register constraints.
  *   - the syscall number is always specified last in order to allow to force
  *     some registers before (gcc refuses a %-register at the last position).
+ *   - see also x86-64 ABI section A.2 AMD64 Linux Kernel Conventions, A.2.1
+ *     Calling Conventions.
+ *
+ * Link x86-64 ABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI
+ *
  */
 
 #define my_syscall0(num)                                                      \
@@ -280,9 +285,9 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -295,10 +300,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1),                                                 \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -312,10 +317,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2),                                     \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -330,10 +335,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret)                                                 \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3),                         \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r10", "r11", "memory", "cc"             \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -349,10 +354,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4)                                    \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4),             \
 		  "0"(_num)                                                   \
-		: "rcx", "r8", "r9", "r11", "memory", "cc"                    \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -369,10 +374,10 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "0"(_num)                                                   \
-		: "rcx", "r9", "r11", "memory", "cc"                          \
+		: "rcx", "r11", "memory", "cc"                                \
 	);                                                                    \
 	_ret;                                                                 \
 })
@@ -390,7 +395,7 @@ struct stat {
 									      \
 	asm volatile (                                                        \
 		"syscall\n"                                                   \
-		: "=a" (_ret), "=r"(_arg4), "=r"(_arg5)                       \
+		: "=a"(_ret)                                                  \
 		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
 		  "r"(_arg6), "0"(_num)                                       \
 		: "rcx", "r11", "memory", "cc"                                \
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  8:25   ` [PATCH 0/2] Fix clobber list and startup code bug Ammar Faizi
  2021-10-15  8:25     ` [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
@ 2021-10-15  8:25     ` Ammar Faizi
  2021-10-15  8:57       ` Ammar Faizi
  2021-10-18  4:58       ` Willy Tarreau
  1 sibling, 2 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-15  8:25 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Before this patch, the _start function looks like this:

0000000000001170 <_start>:
    1170:	pop    %rdi
    1171:	mov    %rsp,%rsi
    1174:	lea    0x8(%rsi,%rdi,8),%rdx
    1179:	and    $0xfffffffffffffff0,%rsp
    117d:	sub    $0x8,%rsp
    1181:	call   1000 <main>
    1186:	movzbq %al,%rdi
    118a:	mov    $0x3c,%rax
    1191:	syscall
    1193:	hlt
    1194:	data16 cs nopw 0x0(%rax,%rax,1)
    119f:	nop

Note the "and" on %rsp: it makes %rsp 16-byte aligned, but then the
"sub" of $0x8 makes %rsp no longer 16-byte aligned before it calls
main. That's the bug!

Right before "call", the %rsp must be 16-byte aligned. So the "sub"
here breaks the alignment. Remove it.

Also, the content of %rbp may be unspecified at process initialization
time. For example, if _start gets called by an interpreter, the
interpreter may not zero %rbp, so we should zero %rbp in _start.

In this patch, we recode the _start function, and now it looks like
this:

0000000000001170 <_start>:
    1170:	pop    %rdi                       # argc
    1171:	mov    %rsp,%rsi                  # argv
    1174:	lea    0x8(%rsi,%rdi,8),%rdx      # envp
    1179:	xor    %ebp,%ebp                  # %rbp = 0
    117b:	and    $0xfffffffffffffff0,%rsp   # %rsp &= -16
    117f:	call   1000 <main>
    1184:	mov    %eax,%edi
    1186:	mov    $0xe7,%eax
    118b:	syscall                           # sys_exit_group(%rdi)
    118d:	hlt
    118e:	xchg   %ax,%ax

Extra fixes:
  - Use NR_exit_group instead of NR_exit.

  - Use `mov %eax,%edi` instead of `movzbq %al,%rdi`. This makes the
    exit code more observable from strace. While the exit status is
    truncated to its low 8 bits, the kernel takes care of that, so
    there is no need to worry about it (see the short illustration
    below).
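
A short standalone illustration of that point (just an example program,
not part of the patch): whatever value userspace passes to exit_group(),
the parent only ever observes the low 8 bits of it.

```
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	int status;

	if (fork() == 0)
		_exit(0x1ff);	/* child asks to exit with 511 */

	wait(&status);
	/* the kernel keeps only the low 8 bits: this prints 255 */
	printf("child exit code: %d\n", WEXITSTATUS(status));
	return 0;
}
```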

Cc: Bedirhan KURT <windowz414@gnuweeb.org>
Cc: Louvian Lyndal <louvianlyndal@gmail.com>
Reported-by: Peter Cordes <peter@cordes.ca>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 tools/include/nolibc/nolibc.h | 61 +++++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 1483d95c8330..f502b645d363 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -403,21 +403,54 @@ struct stat {
 	_ret;                                                                 \
 })
 
+
 /* startup code */
-asm(".section .text\n"
-    ".global _start\n"
-    "_start:\n"
-    "pop %rdi\n"                // argc   (first arg, %rdi)
-    "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
-    "lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
-    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned when
-    "sub $8, %rsp\n"            // entering the callee
-    "call main\n"               // main() returns the status code, we'll exit with it.
-    "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
-    "mov $60, %rax\n"           // NR_exit == 60
-    "syscall\n"                 // really exit
-    "hlt\n"                     // ensure it does not return
-    "");
+asm(
+	".section .text\n"
+	".global _start\n"
+
+	"_start:\n\t"
+	"popq %rdi\n\t"			// argc   (first arg, %rdi)
+	"movq %rsp, %rsi\n\t"		// argv[] (second arg, %rsi)
+	"leaq 8(%rsi,%rdi,8), %rdx\n\t"	// then a NULL, then envp (third arg, %rdx)
+
+	/*
+	 * The System V ABI mandates the deepest stack frame should be zero.
+	 * Thus we zero the %rbp here.
+	 */
+	"xorl %ebp, %ebp\n\t"
+
+	/*
+	 * The System V ABI mandates the %rsp needs to be aligned at 16-byte
+	 * before performing a function call.
+	 */
+	"andq $-16, %rsp\n\t"
+
+	/*
+	 * main() returns the status code, we will exit with it.
+	 */
+	"callq main\n\t"
+
+	/*
+	 * Move the return value to the first argument of exit_group.
+	 */
+	"movl %eax, %edi\n\t"
+
+	/*
+	 * NR_exit_group == 231
+	 */
+	"movl $231, %eax\n\t"
+
+	/*
+	 * Really exit.
+	 */
+	"syscall\n\t"
+
+	/*
+	 * Ensure it does not return.
+	 */
+	"hlt\n\t"
+);
 
 /* fcntl / open */
 #define O_RDONLY            0
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  8:25     ` [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
@ 2021-10-15  8:57       ` Ammar Faizi
  2021-10-15  9:26         ` Bedirhan KURT
  2021-10-15  9:41         ` Louvian Lyndal
  2021-10-18  4:58       ` Willy Tarreau
  1 sibling, 2 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-15  8:57 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Hi,

Here is some code to test with.

Compile with:
  gcc -O3 -ggdb3 -nostdlib -o test test.c

Technical explanation:
The System V ABI mandates that %rsp be 16-byte aligned before
performing a function call, but the current nolibc.h violates that.

This %rsp alignment violation means the callee can't align its own stack
properly. Note that the callee may be in a situation where it needs an
aligned vector move. For example, `movaps` with a memory operand on xmm
registers requires the src/dst address to be 16-byte aligned.

Since the callee can't align its stack properly, it will segfault when
executing `movaps`. The following C code is the reproducer and a test
to ensure the bug really exists and that this patch fixes it.

```
/* SPDX-License-Identifier: LGPL-2.1 OR MIT */

/*
 * Bug reproducer and test for Willy's nolibc (x86-64) by Ammar.
 *
 * Compile with:
 *  gcc -O3 -ggdb3 -nostdlib -o test test.c
 */

#include "tools/include/nolibc/nolibc.h"

static void dump_argv(int argc, char *argv[])
{
	int i;
	const char str[] = "\nDumping argv...\n";

	write(1, str, strlen(str));
	for (i = 0; i < argc; i++) {
		write(1, argv[i], strlen(argv[i]));
		write(1, "\n", 1);
	}
}

static void dump_envp(char *envp[])
{
	int i = 0;
	const char str[] = "\nDumping envp...\n";

	write(1, str, strlen(str));
	while (envp[i] != NULL) {
		write(1, envp[i], strlen(envp[i]));
		write(1, "\n", 1);
		i++;
	}
}

#define read_barrier(PTR) __asm__ volatile(""::"r"(PTR):"memory")

__attribute__((__target__("sse2")))
static void test_sse_move_aligned(void)
{
	int i;
	int arr[32] __attribute__((__aligned__(16)));

	/*
	 * The assignment inside this loop is very likely
	 * performed with an aligned move, thus if we don't
	 * align %rsp properly, it will fault!
	 *
	 * If we fault due to misalignment here, then there
	 * must be a caller below us that violates the SysV
	 * ABI w.r.t. %rsp alignment before a function call.
	 */
	for (i = 0; i < 32; i++)
		arr[i] = 1;

	read_barrier(arr);
}

int main(int argc, char *argv[], char *envp[])
{
	dump_argv(argc, argv);
	dump_envp(envp);
	test_sse_move_aligned();
	return 0;
}
```

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  8:57       ` Ammar Faizi
@ 2021-10-15  9:26         ` Bedirhan KURT
  2021-10-15  9:58           ` Ammar Faizi
  2021-10-15  9:41         ` Louvian Lyndal
  1 sibling, 1 reply; 45+ messages in thread
From: Bedirhan KURT @ 2021-10-15  9:26 UTC (permalink / raw)
  To: Ammar Faizi, Willy Tarreau
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Louvian Lyndal

(Sending again as previous email had my Ubuntu username as sender and
Thunderbird attached my GPG key on it. Hope I cancelled on time.)

Hi Ammar,

I've tested your patchset on my local clone of the Linux kernel with an
up-to-date fetch of the master branch, and this is the output I got
after executing the compiled test binary:

```
# I have replaced Powerline-specific characters to avoid font rendering
# issue.

  windowzytch@WindowZUbuntu > ~/linux > master > ./test

Dumping argv...
./test

Dumping envp...
SYSTEMD_EXEC_PID=6168
SSH_AUTH_SOCK=/run/user/1001/keyring/ssh
LANGUAGE=en_US:en
LANG=en_US.UTF-8
PAPERSIZE=letter
SESSION_MANAGER=local/WindowZUbuntu:@/tmp/.ICE-unix/3472,unix/WindowZUbuntu:/tmp/.ICE-unix/3472
XDG_CURRENT_DESKTOP=ubuntu:GNOME
LC_IDENTIFICATION=en_US.UTF-8
DEFAULTS_PATH=/usr/share/gconf/ubuntu-xorg.default.path
XDG_SESSION_CLASS=user
_=/home/windowzytch/linux/./test
COLORTERM=truecolor
GPG_AGENT_INFO=/run/user/1001/gnupg/S.gpg-agent:0:1
QT_IM_MODULE=ibus
DESKTOP_SESSION=ubuntu-xorg
USER=windowzytch
XDG_MENU_PREFIX=gnome-
LC_MEASUREMENT=en_US.UTF-8
G_ENABLE_DIAGNOSTIC=0
HOME=/home/windowzytch
GNOME_TERMINAL_SCREEN=/org/gnome/Terminal/screen/f89f0e46_b4c6_4c0a_8434_4fd73845d5aa
JOURNAL_STREAM=8:92304
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1001/bus
XMODIFIERS=@im=ibus
LC_NUMERIC=en_US.UTF-8
SSH_AGENT_LAUNCHER=gnome-keyring
GTK_MODULES=gail:atk-bridge
XDG_DATA_DIRS=/usr/share/ubuntu-xorg:/home/windowzytch/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share/:/usr/share/:/var/lib/snapd/desktop
WINDOWPATH=2
XDG_SESSION_DESKTOP=ubuntu-xorg
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu-xorg:/etc/xdg
QT_ACCESSIBILITY=1
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
MANAGERPID=3189
LC_TIME=en_US.UTF-8
MANDATORY_PATH=/usr/share/gconf/ubuntu-xorg.mandatory.path
LOGNAME=windowzytch
GNOME_TERMINAL_SERVICE=:1.115
LC_PAPER=en_US.UTF-8
GNOME_SHELL_SESSION_MODE=ubuntu
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin
XDG_RUNTIME_DIR=/run/user/1001
SHELL=/bin/zsh
XDG_SESSION_TYPE=x11
LC_MONETARY=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
USERNAME=windowzytch
VTE_VERSION=6402
INVOCATION_ID=6052906e58884b5487d3c88314961722
PWD=/home/windowzytch/linux
SHLVL=1
XAUTHORITY=/run/user/1001/gdm/Xauthority
LC_NAME=en_US.UTF-8
IM_CONFIG_PHASE=1
GDMSESSION=ubuntu-xorg
LC_ADDRESS=en_US.UTF-8
DISPLAY=:0
TERM=xterm-256color
OLDPWD=/home/windowzytch/twrprebase/recovery_device_vestel_teos
ZSH=/home/windowzytch/.oh-my-zsh
PAGER=less
LESS=-R
LSCOLORS=Gxfxcxdxbxegedabagacad
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
```

I hope this is helpful and that I could help throughout this patchset.
I didn't get any segfaults compared to my tests with the same code on
the pure state either, so I think everything works just fine. You can
add me in a Tested-by tag if you want.

--
Bedirhan KURT

On 10/15/21 11:57, Ammar Faizi wrote:
> Hi,
> 
> This is a code to test.
> 
> Compile with:
>    gcc -O3 -ggdb3 -nostdlib -o test test.c
> 
> Technical explanation:
> The System V ABI mandates the %rsp must be 16-byte aligned before
> performing a function call, but the current nolibc.h violates it.
> 
> This %rsp alignment violation makes the callee can't align its stack
> properly. Note that the callee may have a situation where it requires
> vector aligned move. For example, `movaps` with memory operand w.r.t.
> xmm registers, it requires the src/dst address be 16-byte aligned.
> 
> Since the callee can't align its stack properly, it will segfault when
> executing `movaps`. The following C code is the reproducer and test
> to ensure the bug really exists and this patch fixes it.
> 
> ```
> /* SPDX-License-Identifier: LGPL-2.1 OR MIT */
> 
> /*
>   * Bug reproducer and test for Willy's nolibc (x86-64) by Ammar.
>   *
>   * Compile with:
>   *  gcc -O3 -ggdb3 -nostdlib -o test test.c
>   */
> 
> #include "tools/include/nolibc/nolibc.h"
> 
> static void dump_argv(int argc, char *argv[])
> {
> 	int i;
> 	const char str[] = "\nDumping argv...\n";
> 
> 	write(1, str, strlen(str));
> 	for (i = 0; i < argc; i++) {
> 		write(1, argv[i], strlen(argv[i]));
> 		write(1, "\n", 1);
> 	}
> }
> 
> static void dump_envp(char *envp[])
> {
> 	int i = 0;
> 	const char str[] = "\nDumping envp...\n";
> 
> 	write(1, str, strlen(str));
> 	while (envp[i] != NULL) {
> 		write(1, envp[i], strlen(envp[i]));
> 		write(1, "\n", 1);
> 		i++;
> 	}
> }
> 
> #define read_barrier(PTR) __asm__ volatile(""::"r"(PTR):"memory")
> 
> __attribute__((__target__("sse2")))
> static void test_sse_move_aligned(void)
> {
> 	int i;
> 	int arr[32] __attribute__((__aligned__(16)));
> 
> 	/*
> 	 * The assignment inside this loop is very likely
> 	 * performed with aligned move, thus if we don't
> 	 * align the %rsp properly, it will fault!
> 	 *
> 	 * If we fault due to misaligned here, then there
> 	 * must be a caller below us that violates SysV
> 	 * ABI w.r.t. to %rsp alignment before func call.
> 	 */
> 	for (i = 0; i < 32; i++)
> 		arr[i] = 1;
> 
> 	read_barrier(arr);
> }
> 
> int main(int argc, char *argv[], char *envp[])
> {
> 	dump_argv(argc, argv);
> 	dump_envp(envp);
> 	test_sse_move_aligned();
> 	return 0;
> }
> ```
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  8:57       ` Ammar Faizi
  2021-10-15  9:26         ` Bedirhan KURT
@ 2021-10-15  9:41         ` Louvian Lyndal
  1 sibling, 0 replies; 45+ messages in thread
From: Louvian Lyndal @ 2021-10-15  9:41 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT

On Fri, Oct 15, 2021 at 3:57 PM Ammar Faizi wrote:
>
> Hi,
>
> This is a code to test.
>
> Compile with:
>   gcc -O3 -ggdb3 -nostdlib -o test test.c
>
> Technical explanation:
> The System V ABI mandates the %rsp must be 16-byte aligned before
> performing a function call, but the current nolibc.h violates it.
>
> This %rsp alignment violation makes the callee can't align its stack
> properly. Note that the callee may have a situation where it requires
> vector aligned move. For example, `movaps` with memory operand w.r.t.
> xmm registers, it requires the src/dst address be 16-byte aligned.
>
> Since the callee can't align its stack properly, it will segfault when
> executing `movaps`. The following C code is the reproducer and test
> to ensure the bug really exists and this patch fixes it.

Hello,
With the current nolibc.h, the program segfaults on movaps:
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555032 in dump_argv (argv=0x7fffffffe288, argc=1) at test.c:15
15        const char str[] = "\nDumping argv...\n";
(gdb) x/20i main
   0x555555555000 <main>:    endbr64
   0x555555555004 <main+4>:    push   %r14
   0x555555555006 <main+6>:    push   %r13
   0x555555555008 <main+8>:    mov    %edi,%r13d
   0x55555555500b <main+11>:    push   %r12
   0x55555555500d <main+13>:    push   %rbp
   0x55555555500e <main+14>:    mov    %rdx,%rbp
   0x555555555011 <main+17>:    mov    $0xa,%edx
   0x555555555016 <main+22>:    push   %rbx
   0x555555555017 <main+23>:    mov    %rsi,%rbx
   0x55555555501a <main+26>:    sub    $0x8,%rsp
   0x55555555501e <main+30>:    movdqa 0xffa(%rip),%xmm0        # 0x555555556020
   0x555555555026 <main+38>:    mov    %dx,-0x68(%rsp)
   0x55555555502b <main+43>:    lea    -0x78(%rsp),%r12
   0x555555555030 <main+48>:    xor    %edx,%edx
=> 0x555555555032 <main+50>:    movaps %xmm0,-0x78(%rsp)
   0x555555555037 <main+55>:    nopw   0x0(%rax,%rax,1)
   0x555555555040 <main+64>:    add    $0x1,%rdx
   0x555555555044 <main+68>:    cmpb   $0x0,(%r12,%rdx,1)
   0x555555555049 <main+73>:    jne    0x555555555040 <main+64>
(gdb) p $rsp-0x78
$1 = (void *) 0x7fffffffe1c8
(gdb)

Apparently it's because $rsp-0x78 is not a multiple of 16
(0x7fffffffe1c8 % 16 == 8). After this patchset, it works fine.
GCC version 11.1.0.

Tested-by: Louvian Lyndal <louvianlyndal@gmail.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  9:26         ` Bedirhan KURT
@ 2021-10-15  9:58           ` Ammar Faizi
  0 siblings, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-15  9:58 UTC (permalink / raw)
  To: Bedirhan KURT
  Cc: Willy Tarreau, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin, David Laight,
	Peter Cordes, Louvian Lyndal

On Fri, Oct 15, 2021 at 4:26 PM Bedirhan KURT <windowz414@gnuweeb.org> wrote:
>
> (Sending again as previous email had my Ubuntu username as sender and
> Thunderbird attached my GPG key on it. Hope I cancelled on time.)
>
> Hi Ammar,
>
> I've tested your patchset on my local clone of Linux kernel with up to
> date fetch of master branch and this is the output I've gotten after
> executing the test binary compiled;
>
[...]
>
> I hope these are helpful and I could help throughout this patchset. I
> didn't get any SegFaults compared to my tests with same code on pure
> state either so I think everything works just fine. You can append me in
> Tested-by tag if you want.
>

Thanks for testing!

Tested-by: Bedirhan KURT <windowz414@gnuweeb.org>

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-15  8:25     ` [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
  2021-10-15  8:57       ` Ammar Faizi
@ 2021-10-18  4:58       ` Willy Tarreau
  2021-10-18  6:53         ` Ammar Faizi
  1 sibling, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-18  4:58 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Hi Ammar,

sorry for the delay, I needed to check a few things first.

On Fri, Oct 15, 2021 at 03:25:07PM +0700, Ammar Faizi wrote:
> Before this patch, the _start function looks like this:
> 
> 0000000000001170 <_start>:
>     1170:	pop    %rdi
>     1171:	mov    %rsp,%rsi
>     1174:	lea    0x8(%rsi,%rdi,8),%rdx
>     1179:	and    $0xfffffffffffffff0,%rsp
>     117d:	sub    $0x8,%rsp
>     1181:	call   1000 <main>
>     1186:	movzbq %al,%rdi
>     118a:	mov    $0x3c,%rax
>     1191:	syscall
>     1193:	hlt
>     1194:	data16 cs nopw 0x0(%rax,%rax,1)
>     119f:	nop
> 
> Note the "and" to %rsp, it makes the %rsp be 16-byte aligned, but then
> there is a "sub" with $0x8 which makes the %rsp no longer 16-byte
> aligned, then it calls main. That's the bug!
> 
> Right before "call", the %rsp must be 16-byte aligned. So the "sub"
> here breaks the alignment. Remove it.

That's very interesting because my understanding till now was that
the stack had to be 16-aligned in the callee, not the caller. But I've
checked the psABI doc, and it indeed says in section 3.2.2 that it's
rsp+8 which is 16-aligned. Of course, when pushing a frame pointer
onto the stack, it becomes the same. Thanks for spotting this one!
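
Just to spell out the arithmetic for anyone following along (hypothetical
addresses, purely illustrative):

    rsp = 0x7fffffffe1a0   # 16-byte aligned at the call site (what "and $-16" gives)
    call main              # pushes the 8-byte return address
    rsp = 0x7fffffffe198   # at main's entry: rsp % 16 == 8, i.e. rsp+8 is 16-aligned
    push %rbp              # typical prologue
    rsp = 0x7fffffffe190   # 16-byte aligned again for aligned local spills

With the extra "sub $8, %rsp" before the call, every value above is off
by 8, and any 16-byte aligned spill in main() can then fault.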

> Also the content of %rbp may be unspecified at process initialization
> time. For example, if the _start gets called by an interpreter, the
> interpreter may not zero the %rbp, so we should zero the %rbp on _start.

OK.

> Extra fixes:
>   - Use NR_exit_group instead of NR_exit.

Please, this is independent of the fix above, so it must be the subject
of a different patch with its own justification. Also, it should cover
all supported architectures, otherwise programs will start to behave
differently on different targets.

>   - Use `mov %eax,%edi` instead of `movzbq %al,%rdi`. This makes the
>     exit code more observable from strace. While the exit code is
>     only 8-bit, the kernel has taken care of that, so no need to
>     worry about it.

I'm fine with this one as well, but similarly, it should be in its own
patch and applied to all architectures.

>  /* startup code */
> -asm(".section .text\n"
> -    ".global _start\n"
> -    "_start:\n"
> -    "pop %rdi\n"                // argc   (first arg, %rdi)
> -    "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
> -    "lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
> -    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned when
> -    "sub $8, %rsp\n"            // entering the callee
> -    "call main\n"               // main() returns the status code, we'll exit with it.
> -    "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
> -    "mov $60, %rax\n"           // NR_exit == 60
> -    "syscall\n"                 // really exit
> -    "hlt\n"                     // ensure it does not return
> -    "");
> +asm(
> +	".section .text\n"
> +	".global _start\n"
> +
> +	"_start:\n\t"
> +	"popq %rdi\n\t"			// argc   (first arg, %rdi)
> +	"movq %rsp, %rsi\n\t"		// argv[] (second arg, %rsi)
> +	"leaq 8(%rsi,%rdi,8), %rdx\n\t"	// then a NULL, then envp (third arg, %rdx)
> +
> +	/*
> +	 * The System V ABI mandates the deepest stack frame should be zero.
> +	 * Thus we zero the %rbp here.
> +	 */
> +	"xorl %ebp, %ebp\n\t"
> +
> +	/*
> +	 * The System V ABI mandates the %rsp needs to be aligned at 16-byte
> +	 * before performing a function call.
> +	 */
> +	"andq $-16, %rsp\n\t"
> +
> +	/*
> +	 * main() returns the status code, we will exit with it.
> +	 */
> +	"callq main\n\t"
> +
> +	/*
> +	 * Move the return value to the first argument of exit_group.
> +	 */
> +	"movl %eax, %edi\n\t"
> +
> +	/*
> +	 * NR_exit_group == 231
> +	 */
> +	"movl $231, %eax\n\t"
> +
> +	/*
> +	 * Really exit.
> +	 */
> +	"syscall\n\t"
> +
> +	/*
> +	 * Ensure it does not return.
> +	 */
> +	"hlt\n\t"
> +);

I find the whole thing much less readable here, as asm code tends to
be read as visual groups of patterns. I'm suggesting that you place a
multi-line comment before the asm statement indicating the general
rules (e.g. lowest stack frame must be zero, rsp+8 must be multiple of
16 etc), then only comment each instruction on the same line as it was
before.

Thank you!
Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-15  8:25     ` [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
@ 2021-10-18  5:52       ` Willy Tarreau
  0 siblings, 0 replies; 45+ messages in thread
From: Willy Tarreau @ 2021-10-18  5:52 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal, Borislav Petkov

On Fri, Oct 15, 2021 at 03:25:06PM +0700, Ammar Faizi wrote:
> Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> 
>   - rax for the return value.
>   - rcx to save the return address.
>   - r11 to save the rflags.
(...)

Now queued, thank you Ammar. I'll pass the series to Paul once we're
done with the remaining patches.

Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-18  4:58       ` Willy Tarreau
@ 2021-10-18  6:53         ` Ammar Faizi
  2021-10-23 13:27           ` Ammar Faizi
  0 siblings, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-18  6:53 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

On Mon, Oct 18, 2021 at 11:59 AM Willy Tarreau <w@1wt.eu> wrote:
> > Extra fixes:
> >   - Use NR_exit_group instead of NR_exit.
>
> Please, this is independent of the fix above, so it must be the subject
> of a different patch with its own justification. Also it should cover
> all supported architectures, otherwise programs will start to behave
> differently on different targets.
>
> >   - Use `mov %eax,%edi` instead of `movzbq %al,%rdi`. This makes the
> >     exit code more observable from strace. While the exit code is
> >     only 8-bit, the kernel has taken care of that, so no need to
> >     worry about it.
>
> I'm fine with this one as well, but similarly, it should be in its own
> patch and applied to all architectures.
>
[...]
>
> I find the whole thing much less readable here, as asm code tends to
> be read as visual groups of patterns. I'm suggesting that you place a
> multi-line comment before the asm statement indicating the general
> rules (e.g. lowest stack frame must be zero, rsp+8 must be multiple of
> 16 etc), then only comment each instruction on the same line as it was
> before.

Got it, agree with that. I will address your review and resend this as a
patchset v2 soon.

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 16:24                       ` Michael Matz
  2021-10-13 16:30                         ` Willy Tarreau
@ 2021-10-19  9:06                         ` David Laight
  1 sibling, 0 replies; 45+ messages in thread
From: David Laight @ 2021-10-19  9:06 UTC (permalink / raw)
  To: 'Michael Matz', Willy Tarreau
  Cc: Borislav Petkov, Ammar Faizi, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Linux Kernel Mailing List, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin

From: Michael Matz
> Sent: 13 October 2021 17:24
> 
> Hello,
> 
> On Wed, 13 Oct 2021, Willy Tarreau wrote:
> 
> > On Wed, Oct 13, 2021 at 04:20:55PM +0200, Borislav Petkov wrote:
> > > On Wed, Oct 13, 2021 at 04:07:23PM +0200, Willy Tarreau wrote:
> > > > Yes I agree with the "potentially" here. If it can potentially be (i.e.
> > > > the kernel is allowed by contract to later change the way it's currently
> > > > done) then we have to save them even if it means lower code efficiency.
> > > >
> > > > If, however, the kernel performs such savings on purpose because it is
> > > > willing to observe a stricter saving than the AMD64 ABI, we can follow
> > > > it but only once it's written down somewhere that it is by contract and
> > > > will not change.
> > >
> > > Right, and Micha noted that such a change to the document can be done.
> >
> > great.
> >
> > > And we're basically doing that registers restoring anyway, in POP_REGS.
> >
> > That's what I based my analysis on when I wanted to verify Ammar's
> > finding. I would tend to think that if we're burning cycles popping
> > plenty of registers it's probably for a reason, maybe at least a good
> > one, which is that it's the only way to make sure we're not leaking
> > internal kernel data! This is not a concern for kernel->kernel nor
> > user->user calls but for user->kernel calls it definitely is one, and
> > I don't think we could relax that series of pop without causing leaks
> > anyway.
> 
> It might also be interesting to know that while the wording of the psABI
> was indeed intended to imply that all argument registers are potentially
> clobbered (like with normal calls), glibc's inline assembler for calling
> syscalls relies on most registers actually being preserved:
> 
> # define REGISTERS_CLOBBERED_BY_SYSCALL "cc", "r11", "cx"
> ...
> #define internal_syscall6(number, arg1, arg2, arg3, arg4, arg5, arg6) \
> ({                                                                      \
>     unsigned long int resultvar;                                        \
>     TYPEFY (arg6, __arg6) = ARGIFY (arg6);                              \
>     TYPEFY (arg5, __arg5) = ARGIFY (arg5);                              \
>     TYPEFY (arg4, __arg4) = ARGIFY (arg4);                              \
>     TYPEFY (arg3, __arg3) = ARGIFY (arg3);                              \
>     TYPEFY (arg2, __arg2) = ARGIFY (arg2);                              \
>     TYPEFY (arg1, __arg1) = ARGIFY (arg1);                              \
>     register TYPEFY (arg6, _a6) asm ("r9") = __arg6;                    \
>     register TYPEFY (arg5, _a5) asm ("r8") = __arg5;                    \
>     register TYPEFY (arg4, _a4) asm ("r10") = __arg4;                   \
>     register TYPEFY (arg3, _a3) asm ("rdx") = __arg3;                   \
>     register TYPEFY (arg2, _a2) asm ("rsi") = __arg2;                   \
>     register TYPEFY (arg1, _a1) asm ("rdi") = __arg1;                   \
>     asm volatile (                                                      \
>     "syscall\n\t"                                                       \
>     : "=a" (resultvar)                                                  \
>     : "0" (number), "r" (_a1), "r" (_a2), "r" (_a3), "r" (_a4),         \
>       "r" (_a5), "r" (_a6)                                              \
>     : "memory", REGISTERS_CLOBBERED_BY_SYSCALL);                        \
>     (long int) resultvar;                                               \
> })
> 
> 
> Note in particular the missing clobbers or outputs of any of the argument
> regs.

What about all the AVX registers?
These are normally caller-saved - so are unlikely to be live in a gcc stub.
But glibc is unlikely to keep the clobber list up to date - even if they
were ever added.

While the kernel can't return 'junk' in the AVX registers, it may be
significantly cheaper to zero the registers on at least some code paths.

The same is true for the rxx registers.
Executing 'xor %rxx,%rxx' is faster than 'pop %rxx',
especially since the xors can all be executed in parallel.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-18  6:53         ` Ammar Faizi
@ 2021-10-23 13:27           ` Ammar Faizi
  2021-10-23 13:43             ` Willy Tarreau
  0 siblings, 1 reply; 45+ messages in thread
From: Ammar Faizi @ 2021-10-23 13:27 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal


On Sat, Oct 23, 2021 at 4:02 PM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Ammar,
>
> On Mon, Oct 18, 2021 at 01:53:29PM +0700, Ammar Faizi wrote:
> > Got it, agree with that. I will address your review and resend this as a
> > patchset v2 soon.
>
> Just checking if you have any update on this or if you're busy. No
> stress, it's just that I prefer to send batches to Paul since he
> rebuilds and retests everything each time, so I'm keeping your first
> patch and another one on hold for now.
>
> Do not hesitate to let me know if you don't have time and if you want
> me to rework your patches myself.
>
> Thanks!
> Willy

Hi Willy,

Sorry for the delay, I had extra activities this week and haven't been
able to give any updates lately.

  1) I can send the %rsp alignment fix patch. I will send it today or
     tomorrow (GMT+07 time).

  2) I can't send the syscall change used for exit, because I only
     have an x86 machine, so I can't apply the changes to other arch(s).

For (2), basically sys_exit doesn't terminate the entire process; it
only terminates the specific thread that calls it. The libc uses
sys_exit_group to terminate the process and all of its threads.

^ It's not really an urgent thing, because nolibc.h may not be used
for multithreaded apps. Even so, I don't see anything dangerous.
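
To illustrate the difference (a hedged sketch, not from the thread;
compile with -pthread, and the raw syscall numbers come from
<sys/syscall.h>):

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

static void *worker(void *arg)
{
	(void)arg;
	/* Terminates only this thread; main() keeps running. */
	syscall(SYS_exit, 0);
	return NULL;  /* never reached */
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, worker, NULL);
	pthread_join(t, NULL);
	puts("still alive after the worker's SYS_exit");

	/* Terminates every thread in the process, like libc's exit(). */
	syscall(SYS_exit_group, 0);
	return 0;  /* never reached */
}
```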

For (1), it's urgent, because the alignment violation causes a segfault
if the compiler generates aligned moves. That often happens when we
compile with -O3, usually with SSE instructions like `movdqa` and
`movaps`.
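
For reference, a hedged sketch (not from the thread) of the kind of code
that can trip over this. Whether the compiler really picks an aligned SSE
store depends on the compiler version and tuning, but it is allowed to,
because the ABI guarantees the stack alignment that _start is supposed
to establish:

```c
#include <string.h>
#include <unistd.h>

int main(void)
{
	char buf[16];

	/* At -O3, GCC may clear a 16-byte stack buffer with a single
	 * 16-byte SSE store (e.g. movaps), assuming the psABI stack
	 * alignment. On a misaligned stack such a store faults. */
	memset(buf, 'A', sizeof(buf));
	write(1, buf, sizeof(buf));
	return 0;
}
```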

Preparing the patch...

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-23 13:27           ` Ammar Faizi
@ 2021-10-23 13:43             ` Willy Tarreau
  2021-10-24  2:11               ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Ammar Faizi
  0 siblings, 1 reply; 45+ messages in thread
From: Willy Tarreau @ 2021-10-23 13:43 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

On Sat, Oct 23, 2021 at 08:27:15PM +0700, Ammar Faizi wrote:
> Sorry for the delay, I had extra activities this week and haven't been
> able to give any updates lately.

No, really, don't be sorry, I'm quite busy myself, so I understand; I
was just inquiring to arrange my time, nothing more.

>   1) I can send the %rsp alignment fix patch. I will send it today or
>      tomorrow (GMT+07 time).

OK, no rush anyway. Even early next week is okay for me.

>   2) I can't send the syscall change used for exit, because I only
>      have an x86 machine, so I can't apply the changes to other arch(s).

I see. I can do it for the various archs then, as the ones that are
supported are essentially the ones I can test.

> For (2), basically sys_exit doesn't terminate the entire process; it
> only terminates the specific thread that calls it. The libc uses
> sys_exit_group to terminate the process and all of its threads.
> 
> ^ It's not really an urgent thing, because nolibc.h may not be used
> for multithreaded apps. Even so, I don't see anything dangerous.

Yep that's what I understood as well, so we may easily postpone this.

> For (1), it's urgent, because the alignment violation causes a segfault
> if the compiler generates aligned moves. That often happens when we
> compile with -O3, usually with SSE instructions like `movdqa` and
> `movaps`.

Yes, that's what I saw from the other reports; I didn't notice it myself,
but I probably faced it and attributed it to something else.

> Preparing the patch...

Great, thank you!
Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list
  2021-10-13 12:43           ` Borislav Petkov
  2021-10-13 12:51             ` Willy Tarreau
@ 2021-10-23 20:40             ` H. Peter Anvin
  1 sibling, 0 replies; 45+ messages in thread
From: H. Peter Anvin @ 2021-10-23 20:40 UTC (permalink / raw)
  To: Borislav Petkov, Ammar Faizi
  Cc: Willy Tarreau, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86



On 10/13/21 05:43, Borislav Petkov wrote:
> On Wed, Oct 13, 2021 at 05:23:11AM +0700, Ammar Faizi wrote:
>> "Figure 3.4: Register Usage" is not the answer, if it were, nolibc.h
>> would be broken as it is missing "rdi", "rsi", "rdx" in the clobber list.
> 
> It is not about what happens in practice but what the contract is:
> syscall argument registers can potentially get clobbered and userspace
> should treat them as such. Because if the kernel decides to actually
> clobber them for whatever reason and some userspace thing thinks
> otherwise, then it is the userspace thing's problem as it doesn't adhere
> to the well known ABI.
> 

Currently the kernel doesn't, but some past kernels have zeroed some of 
these registers rather than preserving them.

	-hpa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement
  2021-10-23 13:43             ` Willy Tarreau
@ 2021-10-24  2:11               ` Ammar Faizi
  2021-10-24  2:11                 ` [PATCH 1/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
                                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-24  2:11 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86-ml, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Hi Willy,
This is a patchset v2, there are 2 patches in this series.

[PATCH 1/2] is a bug fix. Thanks to Peter who reported the bug, fixed
in [PATCH 1/2] by me.

[PATCH 2/2] is just a small improvement to minimize code size, no
functional changes.

Detailed explanation in the commit message.
Please review!

Link v1: https://lore.kernel.org/lkml/dRLArKzRMqajy1jA86k0vg-ammarfaizi2@gnuweeb.org/
----------------------------------------------------------------
Ammar Faizi (2):
      tools/nolibc: x86-64: Fix startup code bug
      tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax`

 tools/include/nolibc/nolibc.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Thanks!
-- 
Ammar Faizi



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/2] tools/nolibc: x86-64: Fix startup code bug
  2021-10-24  2:11               ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Ammar Faizi
@ 2021-10-24  2:11                 ` Ammar Faizi
  2021-10-24  2:11                 ` [PATCH 2/2] tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` Ammar Faizi
  2021-10-24 11:41                 ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Willy Tarreau
  2 siblings, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-24  2:11 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86-ml, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Before this patch, the `_start` function looks like this:
```
0000000000001170 <_start>:
    1170:	pop    %rdi
    1171:	mov    %rsp,%rsi
    1174:	lea    0x8(%rsi,%rdi,8),%rdx
    1179:	and    $0xfffffffffffffff0,%rsp
    117d:	sub    $0x8,%rsp
    1181:	call   1000 <main>
    1186:	movzbq %al,%rdi
    118a:	mov    $0x3c,%rax
    1191:	syscall
    1193:	hlt
    1194:	data16 cs nopw 0x0(%rax,%rax,1)
    119f:	nop
```
Note the "and" to %rsp with $-16, it makes the %rsp be 16-byte aligned,
but then there is a "sub" with $0x8 which makes the %rsp no longer
16-byte aligned, then it calls main. That's the bug!

What the x86-64 System V ABI actually mandates is that %rsp must be
16-byte aligned right before the "call", not after it. So the "sub"
with $0x8 here breaks the alignment. Remove it.

An example where this rule matters is when the callee needs its stack
16-byte aligned for aligned move instructions like `movdqa` and
`movaps`. If the callee can't align its stack properly, the result is
a segmentation fault.

The x86-64 System V ABI also mandates that the deepest stack frame be
marked by a zero frame pointer. Just to be safe, let's zero %rbp on
startup, as its content may be unspecified when the program starts.
Now it looks like this:
```
0000000000001170 <_start>:
    1170:	pop    %rdi
    1171:	mov    %rsp,%rsi
    1174:	lea    0x8(%rsi,%rdi,8),%rdx
    1179:	xor    %ebp,%ebp                # zero the %rbp
    117b:	and    $0xfffffffffffffff0,%rsp # align the %rsp
    117f:	call   1000 <main>
    1184:	movzbq %al,%rdi
    1188:	mov    $0x3c,%rax
    118f:	syscall
    1191:	hlt
    1192:	data16 cs nopw 0x0(%rax,%rax,1)
    119d:	nopl   (%rax)
```

Cc: Bedirhan KURT <windowz414@gnuweeb.org>
Cc: Louvian Lyndal <louvianlyndal@gmail.com>
Reported-by: Peter Cordes <peter@cordes.ca>
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 tools/include/nolibc/nolibc.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index 1483d95c8330..ea38d6f356a1 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -404,14 +404,20 @@ struct stat {
 })
 
 /* startup code */
+/*
+ * x86-64 System V ABI mandates:
+ * 1) %rsp must be 16-byte aligned right before the function call.
+ * 2) The deepest stack frame should be zero (the %rbp).
+ *
+ */
 asm(".section .text\n"
     ".global _start\n"
     "_start:\n"
     "pop %rdi\n"                // argc   (first arg, %rdi)
     "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
     "lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
-    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned when
-    "sub $8, %rsp\n"            // entering the callee
+    "xor %ebp, %ebp\n"          // zero the stack frame
+    "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned before call
     "call main\n"               // main() returns the status code, we'll exit with it.
     "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
     "mov $60, %rax\n"           // NR_exit == 60
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/2] tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax`
  2021-10-24  2:11               ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Ammar Faizi
  2021-10-24  2:11                 ` [PATCH 1/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
@ 2021-10-24  2:11                 ` Ammar Faizi
  2021-10-24 11:41                 ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Willy Tarreau
  2 siblings, 0 replies; 45+ messages in thread
From: Ammar Faizi @ 2021-10-24  2:11 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ammar Faizi, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86-ml, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

Note that a mov to a 32-bit register zero-extends into the full 64-bit
register. Thus `mov $60,%eax` has the same effect as `mov $60,%rax`.
Use the shorter encoding to achieve the same thing.
```
  b8 3c 00 00 00       	mov    $60,%eax (5 bytes) [1]
  48 c7 c0 3c 00 00 00 	mov    $60,%rax (7 bytes) [2]
```
Currently, we use [2]. Change it to [1] for shorter code.
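
A hedged sketch (not part of the patch) that can be used to observe the
zero-extension; the "%k0" operand modifier selects the 32-bit name of
the operand's register, and the variable is only named rax for
illustration:

```c
#include <stdio.h>

int main(void)
{
	unsigned long rax = 0xffffffffffffffffUL;

	/* The 32-bit write clears bits 63:32 of whichever register the
	 * compiler picked for "rax". */
	asm ("movl $60, %k0" : "+r" (rax));
	printf("rax = %lu\n", rax);	/* prints 60 */
	return 0;
}
```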

Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 tools/include/nolibc/nolibc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/include/nolibc/nolibc.h b/tools/include/nolibc/nolibc.h
index ea38d6f356a1..b1a81f711327 100644
--- a/tools/include/nolibc/nolibc.h
+++ b/tools/include/nolibc/nolibc.h
@@ -420,7 +420,7 @@ asm(".section .text\n"
     "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned before call
     "call main\n"               // main() returns the status code, we'll exit with it.
     "movzb %al, %rdi\n"         // retrieve exit code from 8 lower bits
-    "mov $60, %rax\n"           // NR_exit == 60
+    "mov $60, %eax\n"           // NR_exit == 60
     "syscall\n"                 // really exit
     "hlt\n"                     // ensure it does not return
     "");
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement
  2021-10-24  2:11               ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Ammar Faizi
  2021-10-24  2:11                 ` [PATCH 1/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
  2021-10-24  2:11                 ` [PATCH 2/2] tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` Ammar Faizi
@ 2021-10-24 11:41                 ` Willy Tarreau
  2 siblings, 0 replies; 45+ messages in thread
From: Willy Tarreau @ 2021-10-24 11:41 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Linux Kernel Mailing List, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86-ml, H. Peter Anvin, David Laight, Peter Cordes,
	Bedirhan KURT, Louvian Lyndal

On Sun, Oct 24, 2021 at 09:11:30AM +0700, Ammar Faizi wrote:
> Hi Willy,
> This is a patchset v2, there are 2 patches in this series.
> 
> [PATCH 1/2] is a bug fix. Thanks to Peter who reported the bug, fixed
> in [PATCH 1/2] by me.
> 
> [PATCH 2/2] is just a small improvement to minimize code size, no
> functional changes.
> 
> Detailed explanation in the commit message.
> Please review!

Many thanks Ammar, both look good. I've queued them and am now reviewing
the ABI and code for the other archs in case I made the same mistake with
the alignment in other places (i386 comes to mind).

I'll also have a look at the exit calls you mentioned.

Thanks!
Willy

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2021-10-24 11:42 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-11  4:03 [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
2021-10-12  5:28 ` Willy Tarreau
2021-10-12  8:36   ` Ammar Faizi
2021-10-12  9:06     ` Willy Tarreau
2021-10-12 20:29       ` Borislav Petkov
2021-10-12 21:51         ` Borislav Petkov
2021-10-12 22:23         ` Ammar Faizi
2021-10-13  3:01           ` Willy Tarreau
2021-10-13  3:32             ` Ammar Faizi
2021-10-13  3:34               ` Ammar Faizi
2021-10-13  3:37                 ` Ammar Faizi
2021-10-13 12:43           ` Borislav Petkov
2021-10-13 12:51             ` Willy Tarreau
2021-10-13 13:06               ` Borislav Petkov
2021-10-13 14:07                 ` Willy Tarreau
2021-10-13 14:20                   ` Borislav Petkov
2021-10-13 14:24                     ` Willy Tarreau
2021-10-13 16:24                       ` Michael Matz
2021-10-13 16:30                         ` Willy Tarreau
2021-10-13 16:51                           ` Andy Lutomirski
2021-10-13 16:52                           ` Borislav Petkov
2021-10-14  8:44                             ` Ammar Faizi
2021-10-14 12:44                             ` Michael Matz
2021-10-14 14:31                               ` Borislav Petkov
2021-10-19  9:06                         ` David Laight
2021-10-23 20:40             ` H. Peter Anvin
2021-10-12 21:21       ` David Laight
2021-10-12 23:02         ` Subject: " Ammar Faizi
2021-10-13  9:03 ` [PATCH v2] " Ammar Faizi
2021-10-15  8:25   ` [PATCH 0/2] Fix clobber list and startup code bug Ammar Faizi
2021-10-15  8:25     ` [PATCH 1/2] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list Ammar Faizi
2021-10-18  5:52       ` Willy Tarreau
2021-10-15  8:25     ` [PATCH 2/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
2021-10-15  8:57       ` Ammar Faizi
2021-10-15  9:26         ` Bedirhan KURT
2021-10-15  9:58           ` Ammar Faizi
2021-10-15  9:41         ` Louvian Lyndal
2021-10-18  4:58       ` Willy Tarreau
2021-10-18  6:53         ` Ammar Faizi
2021-10-23 13:27           ` Ammar Faizi
2021-10-23 13:43             ` Willy Tarreau
2021-10-24  2:11               ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Ammar Faizi
2021-10-24  2:11                 ` [PATCH 1/2] tools/nolibc: x86-64: Fix startup code bug Ammar Faizi
2021-10-24  2:11                 ` [PATCH 2/2] tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` Ammar Faizi
2021-10-24 11:41                 ` [PATCHSET v2 0/2] tools/nolibc: Fix startup code bug and small improvement Willy Tarreau
