* [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc
@ 2022-03-22 10:21 Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 1/8] tools/nolibc: x86-64: Update System V ABI document link Ammar Faizi
                   ` (8 more replies)
  0 siblings, 9 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

Hi,

This is v2 of the RFC that adds dynamic memory allocator support to
nolibc.

## Background
C programs often need to allocate memory dynamically, mainly when the
allocation size is only known at runtime. It is also common when an
object is long-lived and its memory needs to be recycled at runtime.
Currently, the nolibc header doesn't support this type of allocation.
This series adds it.

## Implementation
Add basic functions to manage dynamic memory allocation:
  - malloc()
  - calloc()
  - realloc()
  - free()

The allocator uses the mmap() syscall to allocate memory and the
munmap() syscall to free it. The metadata that keeps track of the
length for munmap() is simply defined as the struct below:
```
struct nolibc_heap {
        size_t  len;
        char    user_p[] __attribute__((__aligned__));
};
```
malloc(), realloc() and calloc() return a pointer to `user_p`.
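
As an illustration only, the header scheme above can be sketched in
userspace on top of the libc mmap()/munmap() wrappers. The sketch_*
names are hypothetical and are not part of this series; the struct
layout is the one shown above, and the overflow check on the requested
size is an addition made for the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Same layout as struct nolibc_heap; the name is hypothetical. */
struct sketch_heap {
	size_t	len;	/* total mapping length, header included */
	char	user_p[] __attribute__((__aligned__));
};

/* Step back from the user pointer to the header that precedes it. */
static struct sketch_heap *sketch_header(void *ptr)
{
	return (struct sketch_heap *)((char *)ptr -
				      offsetof(struct sketch_heap, user_p));
}

static void *sketch_malloc(size_t len)
{
	struct sketch_heap *heap;

	/* Refuse sizes for which adding the header would wrap around. */
	if (len > (size_t)-1 - sizeof(*heap))
		return NULL;

	heap = mmap(NULL, sizeof(*heap) + len, PROT_READ | PROT_WRITE,
		    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (heap == MAP_FAILED)
		return NULL;

	heap->len = sizeof(*heap) + len;
	return heap->user_p;
}

static int sketch_free(void *ptr)
{
	struct sketch_heap *heap;

	if (!ptr)
		return 0;

	heap = sketch_header(ptr);
	assert(heap->len >= sizeof(*heap));	/* header must be intact */
	return munmap(heap, heap->len);
}
```

Every allocation costs at least one page and one syscall in this
scheme, which keeps the code small at the price of speed.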

## Add my_syscall6() support for x86 32-bit.
mmap() needs 6 arguments to work with. Not all architectures that
nolibc supports have the my_syscall6() wrapper. This series also
adds my_syscall6() wrapper support for i386.
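
To make the six-argument requirement concrete, here is a rough
userspace sketch that issues mmap as a raw syscall through the libc
syscall() function instead of nolibc's my_syscall6(). The
sketch_raw_mmap() name is hypothetical, and the SYS_mmap2 selection is
an assumption that matches 32-bit ABIs such as i386.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Map `length` bytes of anonymous memory with one raw syscall. */
static void *sketch_raw_mmap(size_t length)
{
#if defined(SYS_mmap2)
	long nr = SYS_mmap2;	/* 32-bit ABIs: offset in 4096-byte units */
#else
	long nr = SYS_mmap;
#endif
	/* The six arguments: addr, length, prot, flags, fd, offset. */
	long ret = syscall(nr, (long)0, (long)length,
			   (long)(PROT_READ | PROT_WRITE),
			   (long)(MAP_ANONYMOUS | MAP_PRIVATE),
			   (long)-1, (long)0);

	/* syscall() already converts kernel errors to -1 plus errno. */
	return ret == -1 ? MAP_FAILED : (void *)ret;
}
```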

Notes:
On i386, the 6th argument of a syscall goes in %ebp. However, neither
Clang nor GCC can use %ebp in the clobber list or in the "r" constraint
without -fomit-frame-pointer. To make it always available for any kind
of compilation, the workaround below is implemented.

For clang (the Assembly statement can't clobber %ebp):
  1) Push the 6-th argument.
  2) Push %ebp.
  3) Load the 6-th argument from 4(%esp) to %ebp.
  4) Do the syscall (int $0x80).
  5) Pop %ebp (restore the old value of %ebp).
  6) Add 4 to %esp (drop the pushed 6-th argument).

For GCC, fortunately it has a #pragma that can force a specific function
to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
var is a variable bound to %ebp.

## Test
The following simple program can be used to test this series:

  https://gist.github.com/ammarfaizi2/db0af6aa0b95a0c7478bce64e349f021

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/lkml/20220320093750.159991-1-ammarfaizi2@gnuweeb.org/
   RFC v1 -> RFC v2:
    - Add 2 new patches [PATCH 5/8] and [PATCH 7/8].

    [PATCH 2/8]
    - Remove all `.global _start` for all builds (GCC and Clang) instead
      of removing all `.weak _start` for clang builds (comment from Willy).

    [PATCH 3/8]
    - Fix %ebp saving method. Don't use redzone, i386 doesn't have a redzone
      (comment from David and Alviro).

    [PATCH 6/8]
    - Move container_of() and offsetof() macro to types.h with a
      separate patch (comment from Willy).

    [PATCH 8/8]
    - Update strdup and strndup implementation, use strlen and strnlen to get
      the string length first (comment from Willy and Alviro).
    - Fix the subject line prefix, it was "tools/include/string: ", it should be
      "tools/nolibc/string: ".
    - Update the commit message.

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
Ammar Faizi (8):
  tools/nolibc: x86-64: Update System V ABI document link
  tools/nolibc: Remove .global _start from the entry point code
  tools/nolibc: i386: Implement syscall with 6 arguments
  tools/nolibc/sys: Implement `mmap()` and `munmap()`
  tools/nolibc/types: Implement `offsetof()` and `container_of()` macro
  tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  tools/nolibc/string: Implement `strnlen()`
  tools/include/string: Implement `strdup()` and `strndup()`

 tools/include/nolibc/arch-aarch64.h |  1 -
 tools/include/nolibc/arch-arm.h     |  1 -
 tools/include/nolibc/arch-i386.h    | 67 +++++++++++++++++++++++++++-
 tools/include/nolibc/arch-mips.h    |  1 -
 tools/include/nolibc/arch-riscv.h   |  1 -
 tools/include/nolibc/arch-x86_64.h  |  3 +-
 tools/include/nolibc/stdlib.h       | 68 +++++++++++++++++++++++++++++
 tools/include/nolibc/string.h       | 41 +++++++++++++++++
 tools/include/nolibc/sys.h          | 62 ++++++++++++++++++++++++++
 tools/include/nolibc/types.h        | 11 +++++
 10 files changed, 249 insertions(+), 7 deletions(-)


base-commit: 45abaed6ed05856556c60863f4cac429c92c431f
-- 
Ammar Faizi



* [RFC PATCH v2 1/8] tools/nolibc: x86-64: Update System V ABI document link
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

The old link no longer works, update it.

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/lkml/20220320093750.159991-2-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    * No changes *
---
 tools/include/nolibc/arch-x86_64.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
index fe517c16cd4d..a7b70ea51b68 100644
--- a/tools/include/nolibc/arch-x86_64.h
+++ b/tools/include/nolibc/arch-x86_64.h
@@ -61,7 +61,7 @@ struct sys_stat_struct {
  *   - see also x86-64 ABI section A.2 AMD64 Linux Kernel Conventions, A.2.1
  *     Calling Conventions.
  *
- * Link x86-64 ABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI
+ * Link x86-64 ABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/home
  *
  */
 
-- 
Ammar Faizi



* [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 1/8] tools/nolibc: x86-64: Update System V ABI document link Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 17:09   ` Nick Desaulniers
  2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
	llvm, Nick Desaulniers

Building with clang yields the following error:
```
  <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
  .global _start
  ^
  1 error generated.
```
Specify only one of `.global _start` and `.weak _start`, not both.
Remove `.global _start`.

Cc: llvm@lists.linux.dev
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    - Remove all `.global _start` for all builds (GCC and Clang) instead
      of removing all `.weak _start` for clang builds (comment from Willy).
---
 tools/include/nolibc/arch-aarch64.h | 1 -
 tools/include/nolibc/arch-arm.h     | 1 -
 tools/include/nolibc/arch-i386.h    | 1 -
 tools/include/nolibc/arch-mips.h    | 1 -
 tools/include/nolibc/arch-riscv.h   | 1 -
 tools/include/nolibc/arch-x86_64.h  | 1 -
 6 files changed, 6 deletions(-)

diff --git a/tools/include/nolibc/arch-aarch64.h b/tools/include/nolibc/arch-aarch64.h
index 87d9e434820c..2dbd80d633cb 100644
--- a/tools/include/nolibc/arch-aarch64.h
+++ b/tools/include/nolibc/arch-aarch64.h
@@ -184,7 +184,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "ldr x0, [sp]\n"              // argc (x0) was in the stack
     "add x1, sp, 8\n"             // argv (x1) = sp
diff --git a/tools/include/nolibc/arch-arm.h b/tools/include/nolibc/arch-arm.h
index 001a3c8c9ad5..1191395b5acd 100644
--- a/tools/include/nolibc/arch-arm.h
+++ b/tools/include/nolibc/arch-arm.h
@@ -177,7 +177,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
 #if defined(__THUMBEB__) || defined(__THUMBEL__)
     /* We enter here in 32-bit mode but if some previous functions were in
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index d7e4d53325a3..125a691fc631 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -176,7 +176,6 @@ struct sys_stat_struct {
  */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "pop %eax\n"                // argc   (first arg, %eax)
     "mov %esp, %ebx\n"          // argv[] (second arg, %ebx)
diff --git a/tools/include/nolibc/arch-mips.h b/tools/include/nolibc/arch-mips.h
index c9a6aac87c6d..1a124790c99f 100644
--- a/tools/include/nolibc/arch-mips.h
+++ b/tools/include/nolibc/arch-mips.h
@@ -192,7 +192,6 @@ struct sys_stat_struct {
 asm(".section .text\n"
     ".weak __start\n"
     ".set nomips16\n"
-    ".global __start\n"
     ".set    noreorder\n"
     ".option pic0\n"
     ".ent __start\n"
diff --git a/tools/include/nolibc/arch-riscv.h b/tools/include/nolibc/arch-riscv.h
index bc10b7b5706d..511d67fc534e 100644
--- a/tools/include/nolibc/arch-riscv.h
+++ b/tools/include/nolibc/arch-riscv.h
@@ -185,7 +185,6 @@ struct sys_stat_struct {
 /* startup code */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     ".option push\n"
     ".option norelax\n"
diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
index a7b70ea51b68..84c174181425 100644
--- a/tools/include/nolibc/arch-x86_64.h
+++ b/tools/include/nolibc/arch-x86_64.h
@@ -199,7 +199,6 @@ struct sys_stat_struct {
  */
 asm(".section .text\n"
     ".weak _start\n"
-    ".global _start\n"
     "_start:\n"
     "pop %rdi\n"                // argc   (first arg, %rdi)
     "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
-- 
Ammar Faizi



* [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 1/8] tools/nolibc: x86-64: Update System V ABI document link Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 10:57   ` David Laight
  2022-03-22 11:39   ` David Laight
  2022-03-22 10:21 ` [RFC PATCH v2 4/8] tools/nolibc/sys: Implement `mmap()` and `munmap()` Ammar Faizi
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi,
	x86, llvm, David Laight

On i386, the 6th argument of a syscall goes in %ebp. However, neither
Clang nor GCC can use %ebp in the clobber list or in the "r" constraint
without -fomit-frame-pointer. To make it always available for any kind
of compilation, the workaround below is implemented.

For clang (the Assembly statement can't clobber %ebp):
  1) Push the 6-th argument.
  2) Push %ebp.
  3) Load the 6-th argument from 4(%esp) to %ebp.
  4) Do the syscall (int $0x80).
  5) Pop %ebp (restore the old value of %ebp).
  6) Add 4 to %esp (drop the pushed 6-th argument).

For GCC, fortunately it has a #pragma that can force a specific function
to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
var is a variable bound to %ebp.

Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-4-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    - Fix %ebp saving method. Don't use redzone, i386 doesn't have a redzone
      (comment from David and Alviro).
---
 tools/include/nolibc/arch-i386.h | 66 ++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 125a691fc631..9f4dc36e6ac2 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,6 +167,72 @@ struct sys_stat_struct {
 	_ret;                                                                 \
 })
 
+
+/*
+ * Both Clang and GCC cannot use %ebp in the clobber list and in the "r"
+ * constraint without using -fomit-frame-pointer. To make it always
+ * available for any kind of compilation, the below workaround is
+ * implemented.
+ *
+ * For clang (the Assembly statement can't clobber %ebp):
+ *   1) Push the 6-th argument.
+ *   2) Push %ebp.
+ *   3) Load the 6-th argument from 4(%esp) to %ebp.
+ *   4) Do the syscall (int $0x80).
+ *   5) Pop %ebp (restore the old value of %ebp).
+ *   6) Add %esp by 4 (undo the stack pointer).
+ *
+ * For GCC, fortunately it has a #pragma that can force a specific function
+ * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
+ * var is a variable bound to %ebp.
+ *
+ */
+#if defined(__clang__)
+static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
+				   long esi, long edi, long ebp)
+{
+	__asm__ volatile (
+		"pushl	%[arg6]\n\t"
+		"pushl	%%ebp\n\t"
+		"movl	4(%%esp), %%ebp\n\t"
+		"int	$0x80\n\t"
+		"popl	%%ebp\n\t"
+		"addl	$4,%%esp\n\t"
+		: "=a"(eax)
+		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+		  [arg6]"m"(ebp)
+		: "memory", "cc"
+	);
+	return eax;
+}
+
+#else /* #if defined(__clang__) */
+#pragma GCC push_options
+#pragma GCC optimize "-fomit-frame-pointer"
+static long ____do_syscall6(long eax, long ebx, long ecx, long edx, long esi,
+			    long edi, long ebp)
+{
+	register long __ebp __asm__("ebp") = ebp;
+	__asm__ volatile (
+		"int	$0x80"
+		: "=a"(eax)
+		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
+		  "r"(__ebp)
+		: "memory", "cc"
+	);
+	return eax;
+}
+#pragma GCC pop_options
+#endif /* #if defined(__clang__) */
+
+#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) (   \
+	____do_syscall6((long)(num), (long)(arg1),               \
+			(long)(arg2), (long)(arg3),              \
+			(long)(arg4), (long)(arg5),              \
+			(long)(arg6))                            \
+)
+
+
 /* startup code */
 /*
  * i386 System V ABI mandates:
-- 
Ammar Faizi



* [RFC PATCH v2 4/8] tools/nolibc/sys: Implement `mmap()` and `munmap()`
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (2 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 5/8] tools/nolibc/types: Implement `offsetof()` and `container_of()` macro Ammar Faizi
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

Implement mmap() and munmap(). Currently, they are only available on
architectures that have the my_syscall6() macro. On architectures that
don't have it, mmap() returns MAP_FAILED with errno set to ENOSYS
(Function not implemented).

This has been tested on x86 and i386.

Notes for i386:
 1) The common mmap() syscall implementation uses __NR_mmap2 instead
    of __NR_mmap.

 2) The offset must be shifted right by 12 bits, because mmap2 takes
    it in 4096-byte units.
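
As a small illustration of note 2: mmap2 takes its offset in 4096-byte
units, so a byte offset has to be divided by 4096 (shifted right by 12)
before the call. The helper below is hypothetical and is not part of
this patch; unlike the patch, it also rejects byte offsets that are not
4096-byte aligned rather than silently dropping the low bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MMAP2_SHIFT 12	/* log2(4096) */

/*
 * Convert a byte offset into the page offset expected by mmap2.
 * Returns 0 on success, -1 if the byte offset is not 4096-aligned.
 */
static int sketch_mmap2_pgoff(uint64_t byte_off, unsigned long *pgoff)
{
	if (byte_off & ((UINT64_C(1) << SKETCH_MMAP2_SHIFT) - 1))
		return -1;

	*pgoff = (unsigned long)(byte_off >> SKETCH_MMAP2_SHIFT);
	return 0;
}
```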

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/lkml/20220320093750.159991-5-ammarfaizi2@gnuweeb.org/
   RFC v1 -> RFC v2:
    * No changes *
---
 tools/include/nolibc/sys.h | 62 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
index 4d4308d5d111..08491070387b 100644
--- a/tools/include/nolibc/sys.h
+++ b/tools/include/nolibc/sys.h
@@ -14,6 +14,7 @@
 #include <asm/unistd.h>
 #include <asm/signal.h>  // for SIGCHLD
 #include <asm/ioctls.h>
+#include <asm/mman.h>
 #include <linux/fs.h>
 #include <linux/loop.h>
 #include <linux/time.h>
@@ -675,6 +676,67 @@ int mknod(const char *path, mode_t mode, dev_t dev)
 	return ret;
 }
 
+#ifndef MAP_SHARED
+#define MAP_SHARED		0x01	/* Share changes */
+#define MAP_PRIVATE		0x02	/* Changes are private */
+#define MAP_SHARED_VALIDATE	0x03	/* share + validate extension flags */
+#endif
+
+#ifndef MAP_FAILED
+#define MAP_FAILED ((void *)-1)
+#endif
+
+static __attribute__((unused))
+void *sys_mmap(void *addr, size_t length, int prot, int flags, int fd,
+	       off_t offset)
+{
+#ifndef my_syscall6
+	/* Function not implemented. */
+	return (void *)-ENOSYS;
+#else
+
+	int n;
+
+#if defined(__i386__)
+	n = __NR_mmap2;
+	offset >>= 12;
+#else
+	n = __NR_mmap;
+#endif
+
+	return (void *)my_syscall6(n, addr, length, prot, flags, fd, offset);
+#endif
+}
+
+static __attribute__((unused))
+void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
+{
+	void *ret = sys_mmap(addr, length, prot, flags, fd, offset);
+
+	if ((unsigned long)ret >= -4095UL) {
+		SET_ERRNO(-(long)ret);
+		ret = MAP_FAILED;
+	}
+	return ret;
+}
+
+static __attribute__((unused))
+int sys_munmap(void *addr, size_t length)
+{
+	return my_syscall2(__NR_munmap, addr, length);
+}
+
+static __attribute__((unused))
+int munmap(void *addr, size_t length)
+{
+	int ret = sys_munmap(addr, length);
+
+	if (ret < 0) {
+		SET_ERRNO(-ret);
+		ret = -1;
+	}
+	return ret;
+}
 
 /*
  * int mount(const char *source, const char *target,
-- 
Ammar Faizi



* [RFC PATCH v2 5/8] tools/nolibc/types: Implement `offsetof()` and `container_of()` macro
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (3 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 4/8] tools/nolibc/sys: Implement `mmap()` and `munmap()` Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()` Ammar Faizi
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

Implement the `offsetof()` and `container_of()` macros. The first use
case of these macros is in `malloc()`, `realloc()` and `free()`.
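
For readers unfamiliar with the idiom, a minimal standalone sketch of
how container_of() recovers the enclosing struct from a pointer to one
of its members. The macro is a renamed local copy of the one added by
this patch, and struct sketch_node is a made-up example type. It relies
on GNU statement expressions and __typeof__, so it assumes GCC or
Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Renamed local copy of the macro so it cannot clash with other headers. */
#define sketch_container_of(PTR, TYPE, FIELD) ({			\
	__typeof__(((TYPE *)0)->FIELD) *field_ptr_ = (PTR);		\
	(TYPE *)((char *)field_ptr_ - offsetof(TYPE, FIELD));		\
})

/* Hypothetical example type. */
struct sketch_node {
	int	id;
	long	payload;
};

/* Given a pointer to the payload member, return the node that holds it. */
static struct sketch_node *sketch_node_from_payload(long *payload)
{
	return sketch_container_of(payload, struct sketch_node, payload);
}
```

The typed temporary makes the compiler complain if the pointer passed
in does not match the member's type.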

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
 tools/include/nolibc/types.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/include/nolibc/types.h b/tools/include/nolibc/types.h
index 357e60ad38a8..959997034e55 100644
--- a/tools/include/nolibc/types.h
+++ b/tools/include/nolibc/types.h
@@ -191,4 +191,15 @@ struct stat {
 #define major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
 #define minor(dev) ((unsigned int)(((dev) & 0xff))
 
+#ifndef offsetof
+#define offsetof(TYPE, FIELD) ((size_t) &((TYPE *)0)->FIELD)
+#endif
+
+#ifndef container_of
+#define container_of(PTR, TYPE, FIELD) ({			\
+	__typeof__(((TYPE *)0)->FIELD) *__FIELD_PTR = (PTR);	\
+	(TYPE *)((char *) __FIELD_PTR - offsetof(TYPE, FIELD));	\
+})
+#endif
+
 #endif /* _NOLIBC_TYPES_H */
-- 
Ammar Faizi



* [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (4 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 5/8] tools/nolibc/types: Implement `offsetof()` and `container_of()` macro Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 11:52   ` David Laight
  2022-03-22 10:21 ` [RFC PATCH v2 7/8] tools/nolibc/string: Implement `strnlen()` Ammar Faizi
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

Implement basic dynamic allocator functions. These functions are
currently only available on architectures that have the nolibc mmap()
syscall implemented. This is not a super-fast memory allocator, but
it can at least satisfy the basic need for a heap without libc.
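
One detail worth spelling out for realloc() in a scheme like this: the
stored length covers the header as well as the user area, so a
copy-based realloc() only needs to move the smaller of the old user
size and the new size. The following is a userspace sketch under those
assumptions, using the libc mmap()/munmap() wrappers and hypothetical
sketch_* names; it is not the code from this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

struct sketch_heap {
	size_t	len;	/* total mapping length, header included */
	char	user_p[] __attribute__((__aligned__));
};

static struct sketch_heap *sketch_hdr(void *ptr)
{
	return (struct sketch_heap *)((char *)ptr -
				      offsetof(struct sketch_heap, user_p));
}

static void *sketch_alloc(size_t len)
{
	struct sketch_heap *heap;

	heap = mmap(NULL, sizeof(*heap) + len, PROT_READ | PROT_WRITE,
		    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (heap == MAP_FAILED)
		return NULL;

	heap->len = sizeof(*heap) + len;
	return heap->user_p;
}

static void *sketch_realloc(void *old_ptr, size_t new_size)
{
	struct sketch_heap *heap;
	size_t old_size;
	void *ret;

	ret = sketch_alloc(new_size);
	if (!ret)
		return NULL;	/* the old block stays valid on failure */

	if (!old_ptr)
		return ret;

	heap = sketch_hdr(old_ptr);
	assert(heap->len >= sizeof(*heap));
	old_size = heap->len - sizeof(*heap);	/* user bytes only */

	/* Copy no more than fits in either block. */
	memcpy(ret, heap->user_p, old_size < new_size ? old_size : new_size);
	munmap(heap, heap->len);
	return ret;
}
```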

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link: https://lore.kernel.org/lkml/20220320093750.159991-6-ammarfaizi2@gnuweeb.org
   RFC v1 -> RFC v2:
    - Move container_of() and offsetof() macro to types.h with a
      separate patch (comment from Willy).
---
 tools/include/nolibc/stdlib.h | 68 +++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h
index aca8616335e3..a0ed75431e0a 100644
--- a/tools/include/nolibc/stdlib.h
+++ b/tools/include/nolibc/stdlib.h
@@ -11,7 +11,12 @@
 #include "arch.h"
 #include "types.h"
 #include "sys.h"
+#include "string.h"
 
+struct nolibc_heap {
+	size_t	len;
+	char	user_p[] __attribute__((__aligned__));
+};
 
 /* Buffer used to store int-to-ASCII conversions. Will only be implemented if
  * any of the related functions is implemented. The area is large enough to
@@ -60,6 +65,18 @@ int atoi(const char *s)
 	return atol(s);
 }
 
+static __attribute__((unused))
+void free(void *ptr)
+{
+	struct nolibc_heap *heap;
+
+	if (!ptr)
+		return;
+
+	heap = container_of(ptr, struct nolibc_heap, user_p);
+	munmap(heap, heap->len);
+}
+
 /* Tries to find the environment variable named <name> in the environment array
  * pointed to by global variable "environ" which must be declared as a char **,
  * and must be terminated by a NULL (it is recommended to set this variable to
@@ -205,6 +222,57 @@ char *ltoa(long in)
 	return itoa_buffer;
 }
 
+static __attribute__((unused))
+void *malloc(size_t len)
+{
+	struct nolibc_heap *heap;
+
+	heap = mmap(NULL, sizeof(*heap) + len, PROT_READ|PROT_WRITE,
+		    MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	if (__builtin_expect(heap == MAP_FAILED, 0))
+		return NULL;
+
+	heap->len = sizeof(*heap) + len;
+	return heap->user_p;
+}
+
+static __attribute__((unused))
+void *calloc(size_t size, size_t nmemb)
+{
+	void *orig;
+	size_t res = 0;
+
+	if (__builtin_expect(__builtin_mul_overflow(nmemb, size, &res), 0)) {
+		SET_ERRNO(ENOMEM);
+		return NULL;
+	}
+
+	/*
+	 * No need to zero the heap, the MAP_ANONYMOUS in malloc()
+	 * already does it.
+	 */
+	return malloc(res);
+}
+
+static __attribute__((unused))
+void *realloc(void *old_ptr, size_t new_size)
+{
+	struct nolibc_heap *heap;
+	void *ret;
+
+	ret = malloc(new_size);
+	if (__builtin_expect(!ret, 0))
+		return NULL;
+
+	if (!old_ptr)
+		return ret;
+
+	heap = container_of(old_ptr, struct nolibc_heap, user_p);
+	memcpy(ret, heap->user_p, heap->len);
+	munmap(heap, heap->len);
+	return ret;
+}
+
 /* converts unsigned long integer <in> to a string using the static itoa_buffer
  * and returns the pointer to that string.
  */
-- 
Ammar Faizi



* [RFC PATCH v2 7/8] tools/nolibc/string: Implement `strnlen()`
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (5 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()` Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 10:21 ` [RFC PATCH v2 8/8] tools/include/string: Implement `strdup()` and `strndup()` Ammar Faizi
  2022-03-22 11:27 ` [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Willy Tarreau
  8 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

  size_t strnlen(const char *str, size_t maxlen);

The strnlen() function returns the number of bytes in the string
pointed to by str, excluding the terminating null byte ('\0'), but at
most maxlen. In doing this, strnlen() looks only at the first maxlen
characters in the string pointed to by str and never beyond str[maxlen-1].

The first use case of this function is for determining the memory
allocation size in the strndup() function.
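
A short example of the bounded behaviour described above, using the
POSIX strnlen() from libc rather than the nolibc version. The
fixed-size record type is hypothetical; it stands for a field that may
legitimately lack a terminating null byte.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-size field that is not always null-terminated. */
struct sketch_record {
	char name[8];
};

/* Length of the name, never reading past the end of the field. */
static size_t sketch_name_len(const struct sketch_record *rec)
{
	return strnlen(rec->name, sizeof(rec->name));
}
```

With a plain strlen() the fully populated 8-byte name would be read
past its end; strnlen() stops at the bound instead.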

Link: https://lore.kernel.org/lkml/CAOG64qMpEMh+EkOfjNdAoueC+uQyT2Uv3689_sOr37-JxdJf4g@mail.gmail.com
Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---
 tools/include/nolibc/string.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
index 0d5e870c7c0b..1426eefc1ef2 100644
--- a/tools/include/nolibc/string.h
+++ b/tools/include/nolibc/string.h
@@ -138,6 +138,15 @@ size_t nolibc_strlen(const char *str)
 		nolibc_strlen((str));           \
 })
 
+static __attribute__((unused))
+size_t strnlen(const char *str, size_t maxlen)
+{
+	size_t len;
+
+	for (len = 0; (len < maxlen) && str[len]; len++);
+	return len;
+}
+
 static __attribute__((unused))
 size_t strlcat(char *dst, const char *src, size_t size)
 {
-- 
Ammar Faizi



* [RFC PATCH v2 8/8] tools/include/string: Implement `strdup()` and `strndup()`
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (6 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 7/8] tools/nolibc/string: Implement `strnlen()` Ammar Faizi
@ 2022-03-22 10:21 ` Ammar Faizi
  2022-03-22 11:27 ` [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Willy Tarreau
  8 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 10:21 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, Ammar Faizi

These functions are currently only available on architectures that
implement the my_syscall6() macro, because they use malloc(), malloc()
uses mmap(), and mmap() depends on my_syscall6().

On architectures that don't support my_syscall6(), these functions will
always return NULL with errno set to ENOSYS.

Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

@@ Changelog:

   Link RFC v1: https://lore.kernel.org/lkml/20220320093750.159991-7-ammarfaizi2@gnuweeb.org/
   RFC v1 -> RFC v2:
    - Update strdup and strndup implementation, use strlen and strnlen to get
      the string length first (comment from Willy and Alviro).
    - Fix the subject line prefix, it was "tools/include/string: ", it should be
      "tools/nolibc/string: ".
    - Update the commit message.
---
 tools/include/nolibc/string.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
index 1426eefc1ef2..bcc76f89199e 100644
--- a/tools/include/nolibc/string.h
+++ b/tools/include/nolibc/string.h
@@ -9,6 +9,8 @@
 
 #include "std.h"
 
+static void *malloc(size_t len);
+
 /*
  * As much as possible, please keep functions alphabetically sorted.
  */
@@ -147,6 +149,36 @@ size_t strnlen(const char *str, size_t maxlen)
 	return len;
 }
 
+static __attribute__((unused))
+char *strdup(const char *str)
+{
+	size_t len;
+	char *ret;
+
+	len = strlen(str);
+	ret = malloc(len + 1);
+	if (__builtin_expect(ret != NULL, 1))
+		memcpy(ret, str, len + 1);
+
+	return ret;
+}
+
+static __attribute__((unused))
+char *strndup(const char *str, size_t maxlen)
+{
+	size_t len;
+	char *ret;
+
+	len = strnlen(str, maxlen);
+	ret = malloc(len + 1);
+	if (__builtin_expect(ret != NULL, 1)) {
+		memcpy(ret, str, len);
+		ret[len] = '\0';
+	}
+
+	return ret;
+}
+
 static __attribute__((unused))
 size_t strlcat(char *dst, const char *src, size_t size)
 {
-- 
Ammar Faizi



* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
@ 2022-03-22 10:57   ` David Laight
  2022-03-22 11:23     ` Willy Tarreau
  2022-03-22 11:39   ` David Laight
  1 sibling, 1 reply; 44+ messages in thread
From: David Laight @ 2022-03-22 10:57 UTC (permalink / raw)
  To: 'Ammar Faizi', Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

From: Ammar Faizi
> Sent: 22 March 2022 10:21
> 
> On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> and GCC cannot use %ebp in the clobber list and in the "r" constraint
> without using -fomit-frame-pointer. To make it always available for
> any kind of compilation, the below workaround is implemented.
> 
> For clang (the Assembly statement can't clobber %ebp):
>   1) Push the 6-th argument.
>   2) Push %ebp.
>   3) Load the 6-th argument from 4(%esp) to %ebp.
>   4) Do the syscall (int $0x80).
>   5) Pop %ebp (restore the old value of %ebp).
>   6) Add %esp by 4 (undo the stack pointer).
> 
> For GCC, fortunately it has a #pragma that can force a specific function
> to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> var is a variable bound to %ebp.

You need to use the 'clang' pattern for gcc.
#pragma optimise is fundamentally broken.
What actually happens here is the 'inline' gets lost
(because of the implied -O0) and you get far worse code
than you might expect.

Since you need the 'clang' version, use it all the time.

	David

> 
> Cc: x86@kernel.org
> Cc: llvm@lists.linux.dev
> Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
> Suggested-by: David Laight <David.Laight@ACULAB.COM>
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
> 
> @@ Changelog:
> 
>    Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-4-ammarfaizi2@gnuweeb.org
>    RFC v1 -> RFC v2:
>     - Fix %ebp saving method. Don't use redzone, i386 doesn't have a redzone
>       (comment from David and Alviro).
> ---
>  tools/include/nolibc/arch-i386.h | 66 ++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index 125a691fc631..9f4dc36e6ac2 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -167,6 +167,72 @@ struct sys_stat_struct {
>  	_ret;                                                                 \
>  })
> 
> +
> +/*
> + * Both Clang and GCC cannot use %ebp in the clobber list and in the "r"
> + * constraint without using -fomit-frame-pointer. To make it always
> + * available for any kind of compilation, the below workaround is
> + * implemented.
> + *
> + * For clang (the Assembly statement can't clobber %ebp):
> + *   1) Push the 6-th argument.
> + *   2) Push %ebp.
> + *   3) Load the 6-th argument from 4(%esp) to %ebp.
> + *   4) Do the syscall (int $0x80).
> + *   5) Pop %ebp (restore the old value of %ebp).
> + *   6) Add %esp by 4 (undo the stack pointer).
> + *
> + * For GCC, fortunately it has a #pragma that can force a specific function
> + * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> + * var is a variable bound to %ebp.
> + *
> + */
> +#if defined(__clang__)
> +static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
> +				   long esi, long edi, long ebp)
> +{
> +	__asm__ volatile (
> +		"pushl	%[arg6]\n\t"
> +		"pushl	%%ebp\n\t"
> +		"movl	4(%%esp), %%ebp\n\t"
> +		"int	$0x80\n\t"
> +		"popl	%%ebp\n\t"
> +		"addl	$4,%%esp\n\t"
> +		: "=a"(eax)
> +		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
> +		  [arg6]"m"(ebp)
> +		: "memory", "cc"
> +	);
> +	return eax;
> +}
> +
> +#else /* #if defined(__clang__) */
> +#pragma GCC push_options
> +#pragma GCC optimize "-fomit-frame-pointer"
> +static long ____do_syscall6(long eax, long ebx, long ecx, long edx, long esi,
> +			    long edi, long ebp)
> +{
> +	register long __ebp __asm__("ebp") = ebp;
> +	__asm__ volatile (
> +		"int	$0x80"
> +		: "=a"(eax)
> +		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
> +		  "r"(__ebp)
> +		: "memory", "cc"
> +	);
> +	return eax;
> +}
> +#pragma GCC pop_options
> +#endif /* #if defined(__clang__) */
> +
> +#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) (   \
> +	____do_syscall6((long)(num), (long)(arg1),               \
> +			(long)(arg2), (long)(arg3),              \
> +			(long)(arg4), (long)(arg5),              \
> +			(long)(arg6))                            \
> +)
> +
> +
>  /* startup code */
>  /*
>   * i386 System V ABI mandates:
> --
> Ammar Faizi

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 10:57   ` David Laight
@ 2022-03-22 11:23     ` Willy Tarreau
  0 siblings, 0 replies; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 11:23 UTC (permalink / raw)
  To: David Laight
  Cc: 'Ammar Faizi',
	Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On Tue, Mar 22, 2022 at 10:57:01AM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 22 March 2022 10:21
> > 
> > On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> > and GCC cannot use %ebp in the clobber list and in the "r" constraint
> > without using -fomit-frame-pointer. To make it always available for
> > any kind of compilation, the below workaround is implemented.
> > 
> > For clang (the Assembly statement can't clobber %ebp):
> >   1) Push the 6-th argument.
> >   2) Push %ebp.
> >   3) Load the 6-th argument from 4(%esp) to %ebp.
> >   4) Do the syscall (int $0x80).
> >   5) Pop %ebp (restore the old value of %ebp).
> >   6) Add %esp by 4 (undo the stack pointer).
> > 
> > For GCC, fortunately it has a #pragma that can force a specific function
> > to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> > var is a variable bound to %ebp.
> 
> You need to use the 'clang' pattern for gcc.
> #pragma optimise is fundamentally broken.
> What actually happens here is the 'inline' gets lost
> (because of the implied -O0) and you get far worse code
> than you might expect.
> 
> Since you need the 'clang' version, use it all the time.

I clearly prefer it as well, it looks much cleaner!

Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc
  2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
                   ` (7 preceding siblings ...)
  2022-03-22 10:21 ` [RFC PATCH v2 8/8] tools/include/string: Implement `strdup()` and `strndup()` Ammar Faizi
@ 2022-03-22 11:27 ` Willy Tarreau
  2022-03-22 12:43   ` Ammar Faizi
  8 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 11:27 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

Hi Ammar,

On Tue, Mar 22, 2022 at 05:21:07PM +0700, Ammar Faizi wrote:
> Hi,
> 
> This is the v2 of RFC to add dynamic memory allocator support for
> nolibc.

So overall, except for the syscall6 implementation, where I agree with
David that it would be better to always use the "push" variant, I'm
fine with the rest of the series.

Thank you!
Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
  2022-03-22 10:57   ` David Laight
@ 2022-03-22 11:39   ` David Laight
  2022-03-22 12:02     ` Ammar Faizi
  1 sibling, 1 reply; 44+ messages in thread
From: David Laight @ 2022-03-22 11:39 UTC (permalink / raw)
  To: 'Ammar Faizi', Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

From: Ammar Faizi
> Sent: 22 March 2022 10:21
> On i386, the 6th argument of syscall goes in %ebp. However, both Clang
> and GCC cannot use %ebp in the clobber list and in the "r" constraint
> without using -fomit-frame-pointer. To make it always available for
> any kind of compilation, the below workaround is implemented.
> 
...
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index 125a691fc631..9f4dc36e6ac2 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -167,6 +167,72 @@ struct sys_stat_struct {
>  	_ret;                                                                 \
>  })
> 
> +
> +/*
> + * Both Clang and GCC cannot use %ebp in the clobber list and in the "r"
> + * constraint without using -fomit-frame-pointer. To make it always
> + * available for any kind of compilation, the below workaround is
> + * implemented.
> + *
> + * For clang (the Assembly statement can't clobber %ebp):
> + *   1) Push the 6-th argument.
> + *   2) Push %ebp.
> + *   3) Load the 6-th argument from 4(%esp) to %ebp.
> + *   4) Do the syscall (int $0x80).
> + *   5) Pop %ebp (restore the old value of %ebp).
> + *   6) Add %esp by 4 (undo the stack pointer).
> + *
> + * For GCC, fortunately it has a #pragma that can force a specific function
> + * to be compiled with -fomit-frame-pointer, so it can use "r"(var) where
> + * var is a variable bound to %ebp.
> + *
> + */
> +#if defined(__clang__)
> +static inline long ____do_syscall6(long eax, long ebx, long ecx, long edx,
> +				   long esi, long edi, long ebp)

That should probably be:
static inline long ____do_syscall6(long nr, long arg1, long arg2, long arg3,
				   long arg4, long arg5, long arg6)
and the input constraints changed to match.

> +{
> +	__asm__ volatile (
> +		"pushl	%[arg6]\n\t"
> +		"pushl	%%ebp\n\t"
> +		"movl	4(%%esp), %%ebp\n\t"
> +		"int	$0x80\n\t"
> +		"popl	%%ebp\n\t"
> +		"addl	$4,%%esp\n\t"
> +		: "=a"(eax)
> +		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),

Does having "=a" for an output constraint and "a" for an input
constraint actually DTRT?
There is a special syntax for tying input and output to
the same register.
Or you could use "+a"(nr_rval) and 'return nr_rval'.

	David

> +		  [arg6]"m"(ebp)
> +		: "memory", "cc"
> +	);
> +	return eax;
> +}
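
The two spellings mentioned above ("+a", and an output tied to an input
with a matching constraint) can be sketched on a harmless instruction
instead of a syscall. This is only an illustration: add_plus_a(),
add_tied() and constraint_demo() are hypothetical helpers, not nolibc
code, and the non-x86 branch exists only so the sketch builds elsewhere.

```c
#include <assert.h>

/* "+a": a single read-write operand that lives in %eax. */
static __attribute__((unused))
int add_plus_a(int a, int b)
{
	int acc = a;	/* read by the asm, then overwritten with the sum */

#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile (
		"addl	%[b], %[acc]"
		: [acc] "+a"(acc)
		: [b] "r"(b)
		: "cc"
	);
#else
	acc += b;	/* portable fallback, no inline asm */
#endif
	return acc;
}

/* "=a" output plus a "0" input: the input shares operand 0's register. */
static __attribute__((unused))
int add_tied(int a, int b)
{
	int ret;

#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile (
		"addl	%[b], %[ret]"
		: [ret] "=a"(ret)
		: "0"(a), [b] "r"(b)
		: "cc"
	);
#else
	ret = a + b;	/* portable fallback, no inline asm */
#endif
	return ret;
}

static __attribute__((unused))
void constraint_demo(void)
{
	assert(add_plus_a(2, 3) == 5);
	assert(add_tied(40, 2) == 42);
}
```

With "+a" there is one variable for both directions, which is why the
suggestion above ends with 'return nr_rval'.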


^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 10:21 ` [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()` Ammar Faizi
@ 2022-03-22 11:52   ` David Laight
  2022-03-22 12:18     ` Ammar Faizi
  2022-03-22 12:21     ` Willy Tarreau
  0 siblings, 2 replies; 44+ messages in thread
From: David Laight @ 2022-03-22 11:52 UTC (permalink / raw)
  To: 'Ammar Faizi', Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

From: Ammar Faizi
> Sent: 22 March 2022 10:21
> 
> Implement basic dynamic allocator functions. These functions are
> currently only available on architectures that have nolibc mmap()
> syscall implemented. These are not a super-fast memory allocator,
> but at least they can satisfy basic needs for having heap without
> libc.
> 
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
> 
> @@ Changelog:
> 
>    Link: https://lore.kernel.org/lkml/20220320093750.159991-6-ammarfaizi2@gnuweeb.org
>    RFC v1 -> RFC v2:
>     - Move container_of() and offsetof() macro to types.h with a
>       separate patch (comment from Willy).
> ---
>  tools/include/nolibc/stdlib.h | 68 +++++++++++++++++++++++++++++++++++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h
> index aca8616335e3..a0ed75431e0a 100644
> --- a/tools/include/nolibc/stdlib.h
> +++ b/tools/include/nolibc/stdlib.h
> @@ -11,7 +11,12 @@
>  #include "arch.h"
>  #include "types.h"
>  #include "sys.h"
> +#include "string.h"
> 
> +struct nolibc_heap {
> +	size_t	len;
> +	char	user_p[] __attribute__((__aligned__));

Doesn't that need (number) in the attribute?

> +};
> 
>  /* Buffer used to store int-to-ASCII conversions. Will only be implemented if
>   * any of the related functions is implemented. The area is large enough to
> @@ -60,6 +65,18 @@ int atoi(const char *s)
>  	return atol(s);
>  }
> 
> +static __attribute__((unused))
> +void free(void *ptr)
> +{
> +	struct nolibc_heap *heap;
> +
> +	if (!ptr)
> +		return;
> +
> +	heap = container_of(ptr, struct nolibc_heap, user_p);
> +	munmap(heap, heap->len);
> +}
> +
>  /* Tries to find the environment variable named <name> in the environment array
>   * pointed to by global variable "environ" which must be declared as a char **,
>   * and must be terminated by a NULL (it is recommended to set this variable to
> @@ -205,6 +222,57 @@ char *ltoa(long in)
>  	return itoa_buffer;
>  }
> 
> +static __attribute__((unused))
> +void *malloc(size_t len)
> +{
> +	struct nolibc_heap *heap;

If you do (say):
	len = ROUNDUP(len + sizeof *heap, 4096)
you can optimise a lot of the realloc() calls.

I actually wonder if compiling a mini-libc.a
and then linking the programs against it might
be better than all these static functions?
-ffunction-sections can help a bit (where supported).

	David
	
> +
> +	heap = mmap(NULL, sizeof(*heap) + len, PROT_READ|PROT_WRITE,
> +		    MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
> +	if (__builtin_expect(heap == MAP_FAILED, 0))
> +		return NULL;
> +
> +	heap->len = sizeof(*heap) + len;
> +	return heap->user_p;
> +}
> +
> +static __attribute__((unused))
> +void *calloc(size_t size, size_t nmemb)
> +{
> +	void *orig;
> +	size_t res = 0;
> +
> +	if (__builtin_expect(__builtin_mul_overflow(nmemb, size, &res), 0)) {
> +		SET_ERRNO(ENOMEM);
> +		return NULL;
> +	}
> +
> +	/*
> +	 * No need to zero the heap, the MAP_ANONYMOUS in malloc()
> +	 * already does it.
> +	 */
> +	return malloc(res);
> +}
> +
> +static __attribute__((unused))
> +void *realloc(void *old_ptr, size_t new_size)
> +{
> +	struct nolibc_heap *heap;
> +	void *ret;
> +
> +	ret = malloc(new_size);
> +	if (__builtin_expect(!ret, 0))
> +		return NULL;
> +
> +	if (!old_ptr)
> +		return ret;
> +
> +	heap = container_of(old_ptr, struct nolibc_heap, user_p);
> +	memcpy(ret, heap->user_p, heap->len);
> +	munmap(heap, heap->len);
> +	return ret;
> +}
> +
>  /* converts unsigned long integer <in> to a string using the static itoa_buffer
>   * and returns the pointer to that string.
>   */
> --
> Ammar Faizi


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 11:39   ` David Laight
@ 2022-03-22 12:02     ` Ammar Faizi
  2022-03-22 12:07       ` Ammar Faizi
  2022-03-22 12:13       ` Willy Tarreau
  0 siblings, 2 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:02 UTC (permalink / raw)
  To: David Laight, Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On 3/22/22 6:39 PM, David Laight wrote:
>> +	__asm__ volatile (
>> +		"pushl	%[arg6]\n\t"
>> +		"pushl	%%ebp\n\t"
>> +		"movl	4(%%esp), %%ebp\n\t"
>> +		"int	$0x80\n\t"
>> +		"popl	%%ebp\n\t"
>> +		"addl	$4,%%esp\n\t"
>> +		: "=a"(eax)
>> +		: "a"(eax), "b"(ebx), "c"(ecx), "d"(edx), "S"(esi), "D"(edi),
> 
> Does having "=a" for an output constraint and "a" for an input
> constraint actually DTRT?
> There is a special syntax for tying input and output to
> the same register.
> Or you could use "+a"(nr_rval) and 'return nr_rval'.

Well, I agree with your previous email. Since we no longer use #pragma GCC
optimize with -fomit-frame-pointer, the function is no longer needed. I propose
the following macro (it is not much different from the other my_syscall macros),
except that the 6th argument can be in a register or in memory.

The "rm" constraint here gives the opportunity for the compiler to use %ebp
instead of memory if -fomit-frame-pointer is turned on.

#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
({                                                         \
     long _ret;                                             \
     register long _num asm("eax") = (num);                 \
     register long _arg1 asm("ebx") = (long)(arg1);         \
     register long _arg2 asm("ecx") = (long)(arg2);         \
     register long _arg3 asm("edx") = (long)(arg3);         \
     register long _arg4 asm("esi") = (long)(arg4);         \
     register long _arg5 asm("edi") = (long)(arg5);         \
     long _arg6 = (long)(arg6); /* Might be in memory */    \
                                                            \
     asm volatile (                                         \
         "pushl  %[_arg6]\n\t"                              \
         "pushl  %%ebp\n\t"                                 \
         "movl   4(%%esp), %%ebp\n\t"                       \
         "int    $0x80\n\t"                                 \
         "popl   %%ebp\n\t"                                 \
         "addl   $4,%%esp\n\t"                              \
         : "=a"(_ret)                                       \
         : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
           "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
         : "memory", "cc"                                   \
     );                                                     \
     _ret;                                                  \
})

What do you think?

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 12:02     ` Ammar Faizi
@ 2022-03-22 12:07       ` Ammar Faizi
  2022-03-22 12:13       ` Willy Tarreau
  1 sibling, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:07 UTC (permalink / raw)
  To: David Laight, Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On 3/22/22 7:02 PM, Ammar Faizi wrote:
> Well, I agree with your previous email. Since we no longer use #pragma GCC
> optimize with -fomit-frame-pointer, the function is no longer needed. I propose
> the following macro (it is not much different from the other my_syscall macros),
> except that the 6th argument can be in a register or in memory.
> 
> The "rm" constraint here gives the opportunity for the compiler to use %ebp
> instead of memory if -fomit-frame-pointer is turned on.
> 
> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> ({                                                         \
>      long _ret;                                             \
>      register long _num asm("eax") = (num);                 \
>      register long _arg1 asm("ebx") = (long)(arg1);         \
>      register long _arg2 asm("ecx") = (long)(arg2);         \
>      register long _arg3 asm("edx") = (long)(arg3);         \
>      register long _arg4 asm("esi") = (long)(arg4);         \
>      register long _arg5 asm("edi") = (long)(arg5);         \
>      long _arg6 = (long)(arg6); /* Might be in memory */    \
>                                                             \
>      asm volatile (                                         \
>          "pushl  %[_arg6]\n\t"                              \
>          "pushl  %%ebp\n\t"                                 \
>          "movl   4(%%esp), %%ebp\n\t"                       \
>          "int    $0x80\n\t"                                 \
>          "popl   %%ebp\n\t"                                 \
>          "addl   $4,%%esp\n\t"                              \
>          : "=a"(_ret)                                       \
>          : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
>            "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
>          : "memory", "cc"                                   \
>      );                                                     \
>      _ret;                                                  \
> })
> 
> What do you think?
> 

For the following code:

   int main()
   {
     mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
     return 0;
   }

GCC generates this:

   00001000 <main>:
     1000: push   %ebp
     1001: mov    $0xc0,%eax
     1006: mov    $0x1000,%ecx
     100b: mov    $0x3,%edx
     1010: push   %edi
     1011: xor    %ebp,%ebp
     1013: mov    $0xffffffff,%edi
     1018: push   %esi
     1019: mov    $0x22,%esi
     101e: push   %ebx
     101f: xor    %ebx,%ebx
     1021: push   %ebp        <--- arg6 here
     1022: push   %ebp
     1023: mov    0x4(%esp),%ebp
     1027: int    $0x80
     1029: pop    %ebp
     102a: add    $0x4,%esp
     102d: xor    %eax,%eax
     102f: pop    %ebx
     1030: pop    %esi
     1031: pop    %edi
     1032: pop    %ebp
     1033: ret

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 12:02     ` Ammar Faizi
  2022-03-22 12:07       ` Ammar Faizi
@ 2022-03-22 12:13       ` Willy Tarreau
  2022-03-22 13:26         ` Ammar Faizi
  2022-03-22 13:37         ` David Laight
  1 sibling, 2 replies; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 12:13 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> I propose the
> following macro (it is not much different from the other my_syscall macros),
> except that the 6th argument can be in a register or in memory.
> 
> The "rm" constraint here gives the opportunity for the compiler to use %ebp
> instead of memory if -fomit-frame-pointer is turned on.
> 
> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> ({                                                         \
>     long _ret;                                             \
>     register long _num asm("eax") = (num);                 \
>     register long _arg1 asm("ebx") = (long)(arg1);         \
>     register long _arg2 asm("ecx") = (long)(arg2);         \
>     register long _arg3 asm("edx") = (long)(arg3);         \
>     register long _arg4 asm("esi") = (long)(arg4);         \
>     register long _arg5 asm("edi") = (long)(arg5);         \
>     long _arg6 = (long)(arg6); /* Might be in memory */    \
>                                                            \
>     asm volatile (                                         \
>         "pushl  %[_arg6]\n\t"                              \
>         "pushl  %%ebp\n\t"                                 \
>         "movl   4(%%esp), %%ebp\n\t"                       \
>         "int    $0x80\n\t"                                 \
>         "popl   %%ebp\n\t"                                 \
>         "addl   $4,%%esp\n\t"                              \
>         : "=a"(_ret)                                       \
>         : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
>           "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
>         : "memory", "cc"                                   \
>     );                                                     \
>     _ret;                                                  \
> })
> 
> What do you think?

Hmmm indeed that comes back to the existing constructs and is certainly
more in line with the rest of the code (plus it will not be affected by
-O0).

I seem to remember a register allocation issue which kept me away from
implementing it this way on i386 back then, but given that my focus was
not as much on i386 as it was on other platforms, it's likely that I have
not insisted too much and not tried this one which looks like the way to
go to me.

Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 11:52   ` David Laight
@ 2022-03-22 12:18     ` Ammar Faizi
  2022-03-22 12:36       ` Alviro Iskandar Setiawan
  2022-03-22 12:21     ` Willy Tarreau
  1 sibling, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:18 UTC (permalink / raw)
  To: David Laight, Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

On 3/22/22 6:52 PM, David Laight wrote:

[...]
>> +struct nolibc_heap {
>> +	size_t	len;
>> +	char	user_p[] __attribute__((__aligned__));
> 
> Doesn't that need (number) in the attribute?

The number is not mandatory.

Specifying no alignment argument implies the maximum alignment for
the target, which is often, but by no means always, 8 or 16 bytes.

This has been discussed in the RFC v1, see the full message here:

   https://lore.kernel.org/lkml/c7129520-5e9a-f9d1-cc12-5af9456c917f@gnuweeb.org/
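
A small sketch of what the bare attribute buys us, assuming GCC or Clang
(it relies on __alignof__ and on offsetof() applied to a flexible array
member). The struct is the one from the patch; heap_user_offset() and
heap_alignment() are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct nolibc_heap {
	size_t	len;
	char	user_p[] __attribute__((__aligned__));
};

/*
 * The compiler pads the header so that user_p starts on a boundary that
 * is a multiple of the struct's (raised) alignment.
 */
static_assert(offsetof(struct nolibc_heap, user_p) %
	      __alignof__(struct nolibc_heap) == 0,
	      "user_p must start on an aligned boundary");

/* Distance from the start of the mapping to the pointer given to users. */
static __attribute__((unused))
size_t heap_user_offset(void)
{
	return offsetof(struct nolibc_heap, user_p);
}

/* Alignment chosen by the compiler when no number is given. */
static __attribute__((unused))
size_t heap_alignment(void)
{
	return __alignof__(struct nolibc_heap);
}
```

Since mmap() returns page-aligned memory, a user_p that sits at an
aligned offset from the start of the mapping is itself aligned.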


>> +static __attribute__((unused))
>> +void *malloc(size_t len)
>> +{
>> +	struct nolibc_heap *heap;
> 
> If you do (say):
> 	len = ROUNDUP(len + sizeof *heap, 4096)
> you can optimise a lot of the realloc() calls.
> 
> I actually wonder if compiling a mini-libc.a
> and then linking the programs against it might
> be better than all these static functions?
> -ffunction-sections can help a bit (where supported).

Rounding up is not useful here, because we don't have a free list to keep
track of unused blocks of memory. I mean, even if the size is rounded up, the
extra space cannot be used with this design. There is no book-keeping that
tracks it.

That said, the kernel still allocates memory in multiples of the page size.
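
For reference, the rounding being discussed is just the following
arithmetic. This is a sketch under assumptions: SKETCH_PAGE_SIZE is
hard-coded to the 4096 from David's example (the real page size may
differ), and SKETCH_ROUNDUP() and sketch_mapping_len() are hypothetical
names, not part of the series.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed, taken from the example above */

/* Round x up to the next multiple of a; a must be a power of two. */
#define SKETCH_ROUNDUP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * Length a page-granular mapping really covers for a request of user_len
 * bytes plus a header of hdr_len bytes. Overflow is not checked here.
 */
static __attribute__((unused))
size_t sketch_mapping_len(size_t user_len, size_t hdr_len)
{
	return SKETCH_ROUNDUP(user_len + hdr_len, SKETCH_PAGE_SIZE);
}

static __attribute__((unused))
void sketch_roundup_demo(void)
{
	assert(sketch_mapping_len(1, 16) == 4096);	/* tiny request: one page */
	assert(sketch_mapping_len(4080, 16) == 4096);	/* exactly one page */
	assert(sketch_mapping_len(4081, 16) == 8192);	/* one byte over: two pages */
}
```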

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 11:52   ` David Laight
  2022-03-22 12:18     ` Ammar Faizi
@ 2022-03-22 12:21     ` Willy Tarreau
  1 sibling, 0 replies; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 12:21 UTC (permalink / raw)
  To: David Laight
  Cc: 'Ammar Faizi',
	Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

On Tue, Mar 22, 2022 at 11:52:43AM +0000, David Laight wrote:
> > +struct nolibc_heap {
> > +	size_t	len;
> > +	char	user_p[] __attribute__((__aligned__));
> 
> Doesn't that need (number) in the attribute?

That was my question in the previous review but Ammar pointed me to
the doc indicating that without value it's "large enough for any type"
(i.e. the usual double-long stuff). So that's fine.

> > +static __attribute__((unused))
> > +void *malloc(size_t len)
> > +{
> > +	struct nolibc_heap *heap;
> 
> If you do (say):
> 	len = ROUNDUP(len + sizeof *heap, 4096)
> you can optimise a lot of the realloc() calls.

Could be, but do we *really* care? Again, I didn't even intend to
implement dynamic allocation at all for the targeted use cases.

> I actually wonder if compiling a mini-libc.a
> and then linking the programs against it might
> be better than all these static functions?
> -ffunction-sections can help a bit (where supported).

That was really not the intent when I started this project
a few years ago. Instead the purpose precisely was *not* to have
to depend on any pre-compiled stuff and it seems a few of us find
this lack of dependency convenient. Right now using bare-metal
compilers from kernel.org/pub/tools/crosstool works out of the
box and is very convenient for testing and for pre-init stuff; if
the compiler can build the kernel, it can also build your userland
code.

Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 12:18     ` Ammar Faizi
@ 2022-03-22 12:36       ` Alviro Iskandar Setiawan
  2022-03-22 12:42         ` Ammar Faizi
  0 siblings, 1 reply; 44+ messages in thread
From: Alviro Iskandar Setiawan @ 2022-03-22 12:36 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: David Laight, Willy Tarreau, Paul E. McKenney, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

On Tue, Mar 22, 2022 at 7:18 PM Ammar Faizi wrote:
> Rounding up is not useful here, because we don't have a free list to keep
> track of unused blocks of memory. I mean, even if the size is rounded up, the
> extra space cannot be used with this design. There is no book-keeping that
> tracks it.
>
> That said, the kernel still allocates memory in multiples of the page size.

BTW, what David probably meant is: don't call mmap() again if heap->len
is already large enough for new_len. Isn't that simple enough to give it a go?
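
Something like the sketch below, maybe. It is written against the libc
headers rather than nolibc so it can be tried standalone, and every
sketch_* name is hypothetical. It compares against the usable size
(heap->len minus the header), since heap->len in the patch counts the
header too.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

struct sketch_heap {
	size_t	len;	/* total mapping length, header included */
	char	user_p[] __attribute__((__aligned__));
};

#define SKETCH_HEAP_OF(p) \
	((struct sketch_heap *)((char *)(p) - offsetof(struct sketch_heap, user_p)))

static __attribute__((unused))
void *sketch_malloc(size_t len)
{
	struct sketch_heap *heap;
	size_t total = sizeof(*heap) + len;

	if (total < len)	/* size_t overflow */
		return NULL;

	heap = mmap(NULL, total, PROT_READ|PROT_WRITE,
		    MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	if (heap == MAP_FAILED)
		return NULL;

	heap->len = total;
	return heap->user_p;
}

static __attribute__((unused))
void sketch_free(void *ptr)
{
	if (ptr)
		munmap(SKETCH_HEAP_OF(ptr), SKETCH_HEAP_OF(ptr)->len);
}

static __attribute__((unused))
void *sketch_realloc(void *old_ptr, size_t new_size)
{
	struct sketch_heap *heap;
	size_t user_len;
	void *ret;

	if (!old_ptr)
		return sketch_malloc(new_size);

	heap = SKETCH_HEAP_OF(old_ptr);
	user_len = heap->len - sizeof(*heap);
	if (user_len >= new_size)
		return old_ptr;	/* still fits: no mmap(), no copy */

	ret = sketch_malloc(new_size);
	if (!ret)
		return NULL;

	memcpy(ret, old_ptr, user_len);	/* copy only the old user bytes */
	munmap(heap, heap->len);
	return ret;
}

static __attribute__((unused))
void sketch_realloc_demo(void)
{
	char *p = sketch_malloc(100);
	char *q;

	assert(p != NULL);
	strcpy(p, "nolibc");
	assert(sketch_realloc(p, 50) == p);	/* shrink: same block */
	q = sketch_realloc(p, 8192);		/* grow: new block */
	assert(q != NULL && strcmp(q, "nolibc") == 0);
	sketch_free(q);
}
```

Combined with rounding the length up in malloc(), more of the growing
realloc() calls would take the early return.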

-- Viro

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
  2022-03-22 12:36       ` Alviro Iskandar Setiawan
@ 2022-03-22 12:42         ` Ammar Faizi
  0 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:42 UTC (permalink / raw)
  To: Alviro Iskandar Setiawan
  Cc: David Laight, Willy Tarreau, Paul E. McKenney, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

On 3/22/22 7:36 PM, Alviro Iskandar Setiawan wrote:
> On Tue, Mar 22, 2022 at 7:18 PM Ammar Faizi wrote:
>> Rounding up is not useful here, because we don't have a free list to keep
>> track of unused blocks of memory. I mean, even if the size is rounded up, the
>> extra space cannot be used with this design. There is no book-keeping that
>> tracks it.
>>
>> That said, the kernel still allocates memory in multiples of the page size.
> 
> BTW, what David probably meant is: don't call mmap() again if heap->len
> is already large enough for new_len. Isn't that simple enough to give it a go?

Ah yes, I get the idea now, it shouldn't be a burden for this series. I
think that small improvement is not overkill, so I will take that
suggestion for the next version of the series.

Thanks!

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc
  2022-03-22 11:27 ` [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Willy Tarreau
@ 2022-03-22 12:43   ` Ammar Faizi
  0 siblings, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 12:43 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List

On 3/22/22 6:27 PM, Willy Tarreau wrote:
> Hi Ammar,
> 
> On Tue, Mar 22, 2022 at 05:21:07PM +0700, Ammar Faizi wrote:
>> Hi,
>>
>> This is the v2 of RFC to add dynamic memory allocator support for
>> nolibc.
> So overall, except for the syscall6 implementation, where I agree with
> David that it would be better to always use the "push" variant, I'm
> fine with the rest of the series.

Will send a non-RFC patchset for this...

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 12:13       ` Willy Tarreau
@ 2022-03-22 13:26         ` Ammar Faizi
  2022-03-22 13:34           ` Willy Tarreau
  2022-03-22 13:37         ` David Laight
  1 sibling, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:26 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On 3/22/22 7:13 PM, Willy Tarreau wrote:
> On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
>> I propose the
>> following macro (it is not much different from the other my_syscall macros),
>> except that the 6th argument can be in a register or in memory.
>>
>> The "rm" constraint here gives the opportunity for the compiler to use %ebp
>> instead of memory if -fomit-frame-pointer is turned on.
>>
>> #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
>> ({                                                         \
>>      long _ret;                                             \
>>      register long _num asm("eax") = (num);                 \
>>      register long _arg1 asm("ebx") = (long)(arg1);         \
>>      register long _arg2 asm("ecx") = (long)(arg2);         \
>>      register long _arg3 asm("edx") = (long)(arg3);         \
>>      register long _arg4 asm("esi") = (long)(arg4);         \
>>      register long _arg5 asm("edi") = (long)(arg5);         \
>>      long _arg6 = (long)(arg6); /* Might be in memory */    \
>>                                                             \
>>      asm volatile (                                         \
>>          "pushl  %[_arg6]\n\t"                              \
>>          "pushl  %%ebp\n\t"                                 \
>>          "movl   4(%%esp), %%ebp\n\t"                       \
>>          "int    $0x80\n\t"                                 \
>>          "popl   %%ebp\n\t"                                 \
>>          "addl   $4,%%esp\n\t"                              \
>>          : "=a"(_ret)                                       \
>>          : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
>>            "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
>>          : "memory", "cc"                                   \
>>      );                                                     \
>>      _ret;                                                  \
>> })
>>
>> What do you think?
> 
> Hmmm indeed that comes back to the existing constructs and is certainly
> more in line with the rest of the code (plus it will not be affected by
> -O0).
> 
> I seem to remember a register allocation issue which kept me away from
> implementing it this way on i386 back then, but given that my focus was
> not as much on i386 as it was on other platforms, it's likely that I have
> not insisted too much and not tried this one which looks like the way to
> go to me.

It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
(e.g. without optimization / -O0). So I will still use "m" here.

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:26         ` Ammar Faizi
@ 2022-03-22 13:34           ` Willy Tarreau
  2022-03-22 13:37             ` Ammar Faizi
  0 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 13:34 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On Tue, Mar 22, 2022 at 08:26:37PM +0700, Ammar Faizi wrote:
> On 3/22/22 7:13 PM, Willy Tarreau wrote:
> > On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> > > I propose the
> > > following macro (this is not much different from the other my_syscall macros),
> > > except the 6th argument can be in reg or mem.
> > > 
> > > The "rm" constraint here gives the opportunity for the compiler to use %ebp
> > > instead of memory if -fomit-frame-pointer is turned on.
> > > 
> > > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> > > ({                                                         \
> > >      long _ret;                                             \
> > >      register long _num asm("eax") = (num);                 \
> > >      register long _arg1 asm("ebx") = (long)(arg1);         \
> > >      register long _arg2 asm("ecx") = (long)(arg2);         \
> > >      register long _arg3 asm("edx") = (long)(arg3);         \
> > >      register long _arg4 asm("esi") = (long)(arg4);         \
> > >      register long _arg5 asm("edi") = (long)(arg5);         \
> > >      long _arg6 = (long)(arg6); /* Might be in memory */    \
> > >                                                             \
> > >      asm volatile (                                         \
> > >          "pushl  %[_arg6]\n\t"                              \
> > >          "pushl  %%ebp\n\t"                                 \
> > >          "movl   4(%%esp), %%ebp\n\t"                       \
> > >          "int    $0x80\n\t"                                 \
> > >          "popl   %%ebp\n\t"                                 \
> > >          "addl   $4,%%esp\n\t"                              \
> > >          : "=a"(_ret)                                       \
> > >          : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
> > >            "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
> > >          : "memory", "cc"                                   \
> > >      );                                                     \
> > >      _ret;                                                  \
> > > })
> > > 
> > > What do you think?
> > 
> > Hmmm indeed that comes back to the existing constructs and is certainly
> > more in line with the rest of the code (plus it will not be affected by
> > -O0).
> > 
> > I seem to remember a register allocation issue which kept me away from
> > implementing it this way on i386 back then, but given that my focus was
> > not as much on i386 as it was on other platforms, it's likely that I have
> > not insisted too much and not tried this one which looks like the way to
> > go to me.
> 
> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> (e.g. without optimization / -O0). So I will still use "m" here.

OK that's fine. then you can probably simplify it like this:

      long _arg6 = (long)(arg6); /* Might be in memory */    \
                                                             \
      asm volatile (                                         \
          "pushl  %%ebp\n\t"                                 \
          "movl   %[_arg6], %%ebp\n\t"                       \
          "int    $0x80\n\t"                                 \
          "popl   %%ebp\n\t"                                 \
          : "=a"(_ret)                                       \
          : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
            "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6)        \
          : "memory", "cc"                                   \
      );                                                     \

See ? no more push, no more addl, direct load from memory.

Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:34           ` Willy Tarreau
@ 2022-03-22 13:37             ` Ammar Faizi
  2022-03-22 13:39               ` David Laight
  0 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:37 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On 3/22/22 8:34 PM, Willy Tarreau wrote:
>> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
>> (e.g. without optimization / -O0). So I will still use "m" here.
> 
> OK that's fine. then you can probably simplify it like this:
> 
>        long _arg6 = (long)(arg6); /* Might be in memory */    \
>                                                               \
>        asm volatile (                                         \
>            "pushl  %%ebp\n\t"                                 \
>            "movl   %[_arg6], %%ebp\n\t"                       \
>            "int    $0x80\n\t"                                 \
>            "popl   %%ebp\n\t"                                 \
>            : "=a"(_ret)                                       \
>            : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
>              "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6)        \
>            : "memory", "cc"                                   \
>        );                                                     \
> 
> See ? no more push, no more addl, direct load from memory.

Uggh... I crafted the same code as you suggested before, but then
I realized it's buggy because %[_arg6] may live in N(%esp).

When you pushl %ebp, %esp changes, so N(%esp) no longer points to the
6th argument.
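
A toy model may help show the off-by-one-slot problem. This is plain C,
not inline asm: `struct demo_cpu`, `demo_push()` and friends are made-up
names, and it assumes the compiler placed _arg6 two words above %esp.

```c
#include <stddef.h>

/* Toy model of a downward-growing stack of machine words.
 * Every name here is hypothetical; it only mimics the addressing. */
enum { DEMO_STACK_WORDS = 16 };

struct demo_cpu {
	long stack[DEMO_STACK_WORDS];
	size_t esp;	/* index of the top-of-stack word */
	long ebp;
};

static void demo_push(struct demo_cpu *c, long v) { c->stack[--c->esp] = v; }
static long demo_pop(struct demo_cpu *c) { return c->stack[c->esp++]; }

/* Read the word at N(%esp), with N counted in words, not bytes. */
static long demo_load(const struct demo_cpu *c, size_t n)
{
	return c->stack[c->esp + n];
}

/* Build a frame where _arg6 lives at 2(%esp) (in words). */
static void demo_init(struct demo_cpu *c, long arg6, long frame_ptr)
{
	size_t i;

	for (i = 0; i < DEMO_STACK_WORDS; i++)
		c->stack[i] = 0;
	c->esp = DEMO_STACK_WORDS - 4;
	c->stack[c->esp + 2] = arg6;
	c->ebp = frame_ptr;
}

/* pushl %ebp; movl N(%esp), %ebp; int $0x80; popl %ebp */
static long demo_buggy(struct demo_cpu *c, size_t n)
{
	long seen;

	demo_push(c, c->ebp);
	c->ebp = demo_load(c, n);	/* %esp moved: reads the word below _arg6 */
	seen = c->ebp;			/* what the kernel would get as arg 6 */
	c->ebp = demo_pop(c);
	return seen;
}

/* pushl N(%esp); pushl %ebp; movl 4(%esp), %ebp; int $0x80;
 * popl %ebp; addl $4, %esp */
static long demo_fixed(struct demo_cpu *c, size_t n)
{
	long seen;

	demo_push(c, demo_load(c, n));	/* operand is read before %esp moves */
	demo_push(c, c->ebp);
	c->ebp = demo_load(c, 1);	/* 4(%esp): the copy pushed above */
	seen = c->ebp;
	c->ebp = demo_pop(c);
	c->esp += 1;			/* addl $4, %esp */
	return seen;
}
```

With _arg6 at 2(%esp), demo_buggy(&c, 2) hands back the neighbouring word
while demo_fixed(&c, 2) hands back _arg6, and both restore %ebp and %esp.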

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 12:13       ` Willy Tarreau
  2022-03-22 13:26         ` Ammar Faizi
@ 2022-03-22 13:37         ` David Laight
  2022-03-22 14:47           ` Alviro Iskandar Setiawan
  2022-03-23  6:29           ` Ammar Faizi
  1 sibling, 2 replies; 44+ messages in thread
From: David Laight @ 2022-03-22 13:37 UTC (permalink / raw)
  To: 'Willy Tarreau', Ammar Faizi
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

From: Willy Tarreau
> Sent: 22 March 2022 12:14
> 
> On Tue, Mar 22, 2022 at 07:02:53PM +0700, Ammar Faizi wrote:
> > I propose the
> > following macro (this is not much different from the other my_syscall macros),
> > except the 6th argument can be in reg or mem.
> >
> > The "rm" constraint here gives the opportunity for the compiler to use %ebp
> > instead of memory if -fomit-frame-pointer is turned on.
> >
> > #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) \
> > ({                                                         \
> >     long _ret;                                             \
> >     register long _num asm("eax") = (num);                 \
> >     register long _arg1 asm("ebx") = (long)(arg1);         \
> >     register long _arg2 asm("ecx") = (long)(arg2);         \
> >     register long _arg3 asm("edx") = (long)(arg3);         \
> >     register long _arg4 asm("esi") = (long)(arg4);         \
> >     register long _arg5 asm("edi") = (long)(arg5);         \
> >     long _arg6 = (long)(arg6); /* Might be in memory */    \
> >                                                            \
> >     asm volatile (                                         \
> >         "pushl  %[_arg6]\n\t"                              \
> >         "pushl  %%ebp\n\t"                                 \
> >         "movl   4(%%esp), %%ebp\n\t"                       \
> >         "int    $0x80\n\t"                                 \
> >         "popl   %%ebp\n\t"                                 \
> >         "addl   $4,%%esp\n\t"                              \
> >         : "=a"(_ret)                                       \
> >         : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
> >           "r"(_arg4),"r"(_arg5), [_arg6]"rm"(_arg6)        \
> >         : "memory", "cc"                                   \
> >     );                                                     \
> >     _ret;                                                  \
> > })
> >
> > What do you think?
> 
> Hmmm indeed that comes back to the existing constructs and is certainly
> more in line with the rest of the code (plus it will not be affected by
> -O0).

I'd add an 'always_inline' to the function.
That will force inline even with -O0.
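
A minimal sketch of the attribute, assuming GCC or Clang; `demo_add3` is
a made-up helper, not something from nolibc:

```c
/* always_inline asks GCC/Clang to inline the function even at -O0.
 * Plain "inline" alone is only a hint and is normally not honoured
 * when optimization is off. */
static inline __attribute__((always_inline))
long demo_add3(long a, long b, long c)
{
	return a + b + c;
}
```

A call such as demo_add3(1, 2, 3) is then expanded at the call site
regardless of the optimization level.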

> I seem to remember a register allocation issue which kept me away from
> implementing it this way on i386 back then, but given that my focus was
> not as much on i386 as it was on other platforms, it's likely that I have
> not insisted too much and not tried this one which looks like the way to
> go to me.

dunno, 'asm' register variables are rather more horrid and
should probably only be used (for asm statements) when there aren't
suitable register constraints.

(I'm sure there is a comment about that in the gcc docs.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:37             ` Ammar Faizi
@ 2022-03-22 13:39               ` David Laight
  2022-03-22 13:41                 ` Willy Tarreau
  0 siblings, 1 reply; 44+ messages in thread
From: David Laight @ 2022-03-22 13:39 UTC (permalink / raw)
  To: 'Ammar Faizi', Willy Tarreau
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

From: Ammar Faizi
> Sent: 22 March 2022 13:37
> 
> On 3/22/22 8:34 PM, Willy Tarreau wrote:
> >> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> >> (e.g. without optimization / -O0). So I will still use "m" here.
> >
> > OK that's fine. then you can probably simplify it like this:
> >
> >        long _arg6 = (long)(arg6); /* Might be in memory */    \
> >                                                               \
> >        asm volatile (                                         \
> >            "pushl  %%ebp\n\t"                                 \
> >            "movl   %[_arg6], %%ebp\n\t"                       \
> >            "int    $0x80\n\t"                                 \
> >            "popl   %%ebp\n\t"                                 \
> >            : "=a"(_ret)                                       \
> >            : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
> >              "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6)        \
> >            : "memory", "cc"                                   \
> >        );                                                     \
> >
> > See ? no more push, no more addl, direct load from memory.
> 
> Uggh... I crafted the same code like you suggested before, but then
> I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp).
> 
> When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
> 6-th argument.

Yep - that is why I wrote the 'push arg6'.

	David


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:39               ` David Laight
@ 2022-03-22 13:41                 ` Willy Tarreau
  2022-03-22 13:45                   ` Ammar Faizi
  0 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 13:41 UTC (permalink / raw)
  To: David Laight
  Cc: 'Ammar Faizi',
	Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On Tue, Mar 22, 2022 at 01:39:41PM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 22 March 2022 13:37
> > 
> > On 3/22/22 8:34 PM, Willy Tarreau wrote:
> > >> It turned out GCC refuses to use "rm" if we compile without -fomit-frame-pointer
> > >> (e.g. without optimization / -O0). So I will still use "m" here.
> > >
> > > OK that's fine. then you can probably simplify it like this:
> > >
> > >        long _arg6 = (long)(arg6); /* Might be in memory */    \
> > >                                                               \
> > >        asm volatile (                                         \
> > >            "pushl  %%ebp\n\t"                                 \
> > >            "movl   %[_arg6], %%ebp\n\t"                       \
> > >            "int    $0x80\n\t"                                 \
> > >            "popl   %%ebp\n\t"                                 \
> > >            : "=a"(_ret)                                       \
> > >            : "r"(_num), "r"(_arg1), "r"(_arg2), "r"(_arg3),   \
> > >              "r"(_arg4),"r"(_arg5), [_arg6]"m"(_arg6)        \
> > >            : "memory", "cc"                                   \
> > >        );                                                     \
> > >
> > > See ? no more push, no more addl, direct load from memory.
> > 
> > Uggh... I crafted the same code like you suggested before, but then
> > I realized it's buggy, it's buggy because %[_arg6] may live in N(%esp).
> > 
> > When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
> > 6-th argument.
> 
> Yep - that is why I wrote the 'push arg6'.

Got it and you're right indeed, sorry for the noise :-)

Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:41                 ` Willy Tarreau
@ 2022-03-22 13:45                   ` Ammar Faizi
  2022-03-22 13:54                     ` Ammar Faizi
  0 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:45 UTC (permalink / raw)
  To: Willy Tarreau, David Laight
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On 3/22/22 8:41 PM, Willy Tarreau wrote:
[...]
>>> When you pushl %ebp, the %esp changes, N(%esp) no longer points to the
>>> 6-th argument.
>>
>> Yep - that is why I wrote the 'push arg6'.
> 
> Got it and you're right indeed, sorry for the noise :-)

Uggh... it seems I hit a GCC bug when playing with -m32 (32-bit code).
I am on Linux x86-64. Compiling without optimization causes GCC to get
stuck in an endless loop with 100% CPU usage.

I will try to narrow it down and see if I can create a simple reproducer
for this issue.

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ gcc --version
gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
^C

real	0m46.696s
user	0m0.000s
sys	0m0.002s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O1 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.054s
user	0m0.046s
sys	0m0.008s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O2 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.079s
user	0m0.067s
sys	0m0.012s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ time taskset -c 0 gcc -O3 -m32 -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc

real	0m0.110s
user	0m0.097s
sys	0m0.013s
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$


-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:45                   ` Ammar Faizi
@ 2022-03-22 13:54                     ` Ammar Faizi
  2022-03-22 13:56                       ` Ammar Faizi
  0 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:54 UTC (permalink / raw)
  To: Willy Tarreau, David Laight
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm


Willy, something goes wrong here...

ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
/usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
/usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
test.c:(.text+0x1f76): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
/usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$


I suspect it's caused by commit:

commit c970abe796019b3d576fd154a54b94efb35c02b1
Author: Willy Tarreau <w@1wt.eu>
Date:   Mon Mar 21 18:33:08 2022 +0100

     tools/nolibc/stdlib: add a simple getenv() implementation
     
     This implementation relies on an extern definition of the environ
     variable, that the caller must declare and initialize from envp.
     
     Signed-off-by: Willy Tarreau <w@1wt.eu>
     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

I will take a deeper look at this...

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:54                     ` Ammar Faizi
@ 2022-03-22 13:56                       ` Ammar Faizi
  2022-03-22 14:02                         ` Willy Tarreau
  0 siblings, 1 reply; 44+ messages in thread
From: Ammar Faizi @ 2022-03-22 13:56 UTC (permalink / raw)
  To: Willy Tarreau, David Laight
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On 3/22/22 8:54 PM, Ammar Faizi wrote:
> 
> Willy, something goes wrong here...
> 
> ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
> /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
> /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
> test.c:(.text+0x1f76): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
> /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
> /usr/bin/ld: warning: creating DT_TEXTREL in a PIE
> collect2: error: ld returned 1 exit status
> ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
> 
> 
> I suspect it's caused by commit:
> 
> commit c970abe796019b3d576fd154a54b94efb35c02b1
> Author: Willy Tarreau <w@1wt.eu>
> Date:   Mon Mar 21 18:33:08 2022 +0100
> 
>      tools/nolibc/stdlib: add a simple getenv() implementation
>      This implementation relies on an extern definition of the environ
>      variable, that the caller must declare and initialize from envp.
>      Signed-off-by: Willy Tarreau <w@1wt.eu>
>      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> I will take a look deeper on this...

This bug only exists when compiling without optimization.

-- 
Ammar Faizi

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:56                       ` Ammar Faizi
@ 2022-03-22 14:02                         ` Willy Tarreau
  0 siblings, 0 replies; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 14:02 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On Tue, Mar 22, 2022 at 08:56:44PM +0700, Ammar Faizi wrote:
> On 3/22/22 8:54 PM, Ammar Faizi wrote:
> > 
> > Willy, something goes wrong here...
> > 
> > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$ taskset -c 0 gcc -ffreestanding -nostdlib -nostartfiles test.c -o test -lgcc
> > /usr/bin/ld: /tmp/ccHiYiks.o: warning: relocation against `environ' in read-only section `.text'
> > /usr/bin/ld: /tmp/ccHiYiks.o: in function `getenv':
> > test.c:(.text+0x1f76): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x1fc3): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x1ffc): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x2021): undefined reference to `environ'
> > /usr/bin/ld: test.c:(.text+0x2049): undefined reference to `environ'
> > /usr/bin/ld: warning: creating DT_TEXTREL in a PIE
> > collect2: error: ld returned 1 exit status
> > ammarfaizi2@integral2:~/work/linux.work/tools/include/nolibc$
> > 
> > 
> > I suspect it's caused by commit:
> > 
> > commit c970abe796019b3d576fd154a54b94efb35c02b1
> > Author: Willy Tarreau <w@1wt.eu>
> > Date:   Mon Mar 21 18:33:08 2022 +0100
> > 
> >      tools/nolibc/stdlib: add a simple getenv() implementation
> >      This implementation relies on an extern definition of the environ
> >      variable, that the caller must declare and initialize from envp.
> >      Signed-off-by: Willy Tarreau <w@1wt.eu>
> >      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > I will take a look deeper on this...
> 
> This bug only exists when compiling without optimization.

Indeed, reproduced. I can bypass it by adding __attribute__((weak)) on
the environ declaration in getenv(). Will send a patch later.
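
A rough sketch of the idea, not the actual patch: `demo_environ`,
`demo_env_lookup()` and `demo_getenv()` are hypothetical names, and the
address check is an extra precaution assumed here so that the sketch is
safe to call when nothing defines the symbol.

```c
#include <stddef.h>
#include <string.h>

/* Weak *reference*: if no object file defines demo_environ, the link
 * still succeeds and the symbol's address resolves to NULL. */
extern char **demo_environ __attribute__((weak));

/* Search a NULL-terminated "NAME=value" array for name. */
static char *demo_env_lookup(char **envp, const char *name)
{
	size_t len = strlen(name);

	for (; envp && *envp; envp++) {
		if (!strncmp(*envp, name, len) && (*envp)[len] == '=')
			return *envp + len + 1;
	}
	return NULL;
}

static char *demo_getenv(const char *name)
{
	/* Undefined weak symbol: there is nothing to search. */
	if (!&demo_environ)
		return NULL;
	return demo_env_lookup(demo_environ, name);
}
```

A program that does define and initialize `demo_environ` from envp gets
normal lookups; one that never touches it still links, even at -O0 where
the unused function is not discarded.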

Thanks,
Willy

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:37         ` David Laight
@ 2022-03-22 14:47           ` Alviro Iskandar Setiawan
  2022-03-22 15:11             ` David Laight
  2022-03-23  6:29           ` Ammar Faizi
  1 sibling, 1 reply; 44+ messages in thread
From: Alviro Iskandar Setiawan @ 2022-03-22 14:47 UTC (permalink / raw)
  To: David Laight
  Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
>
> (I'm sure there is a comment about that in the gcc docs.)

I don't find the comment that says so here:
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html

The current code looks valid to me, but I would still prefer explicit
register constraints over always using "r"(var) where they are
available. No strong objection, though. Still looks good.

-- Viro

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 14:47           ` Alviro Iskandar Setiawan
@ 2022-03-22 15:11             ` David Laight
  0 siblings, 0 replies; 44+ messages in thread
From: David Laight @ 2022-03-22 15:11 UTC (permalink / raw)
  To: 'Alviro Iskandar Setiawan'
  Cc: Willy Tarreau, Ammar Faizi, Paul E. McKenney, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

From: Alviro Iskandar Setiawan
> Sent: 22 March 2022 14:48
> 
> On Tue, Mar 22, 2022 at 8:37 PM David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> >
> > (I'm sure there is a comment about that in the gcc docs.)
> 
> I don't find the comment that says so here:
> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html

I've probably inferred it from:
    "The only supported use for this feature is to specify registers for
    input and output operands when calling Extended asm (see Extended Asm).
    This may be necessary if the constraints for a particular machine don’t
    provide sufficient control to select the desired register."

Here it isn't necessary because the required constraints exist.
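
For illustration, a 3-argument helper written with the x86 constraint
letters instead of register variables. `demo_syscall3` is a made-up name,
not nolibc's my_syscall3(), and the libc fallback is only there so the
sketch builds on other architectures.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: the constraint letters pick the registers, so no
 * 'register ... asm("reg")' variables are needed. On x86 a raw syscall
 * returns -errno on failure; the libc fallback returns -1 and sets
 * errno instead. */
static long demo_syscall3(long num, long a1, long a2, long a3)
{
	long ret;

#if defined(__i386__)
	/* "a" = eax, "b" = ebx, "c" = ecx, "d" = edx */
	__asm__ volatile ("int $0x80"
			  : "=a"(ret)
			  : "0"(num), "b"(a1), "c"(a2), "d"(a3)
			  : "memory", "cc");
#elif defined(__x86_64__)
	/* "D" = rdi, "S" = rsi, "d" = rdx; syscall clobbers rcx and r11 */
	__asm__ volatile ("syscall"
			  : "=a"(ret)
			  : "0"(num), "D"(a1), "S"(a2), "d"(a3)
			  : "rcx", "r11", "memory", "cc");
#else
	ret = syscall(num, a1, a2, a3);
#endif
	return ret;
}
```

For example, demo_syscall3(SYS_getpid, 0, 0, 0) should match getpid().
There is no constraint letter for %ebp, so on i386 the 6th argument still
needs the push/pop handling discussed in this thread.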

	David


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
@ 2022-03-22 17:09   ` Nick Desaulniers
  2022-03-22 17:25     ` Willy Tarreau
  0 siblings, 1 reply; 44+ messages in thread
From: Nick Desaulniers @ 2022-03-22 17:09 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Willy Tarreau, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
>
> Building with clang yields the following error:
> ```
>   <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
>   .global _start
>   ^
>   1 error generated.
> ```
> Make sure to specify only one of `.global _start` and `.weak _start`.
> Remove `.global _start`.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Yes, symbols should either be `.weak` or `.global`. The warning from
Clang's integrated assembler is meant to flush out funny business.

I assume there's a good reason _why_ _start is weak and not strong?
Then again, I'm not familiar with nolibc.

>
> Cc: llvm@lists.linux.dev
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
> ---
>
> @@ Changelog:
>
>    Link RFC v1: https://lore.kernel.org/llvm/20220320093750.159991-3-ammarfaizi2@gnuweeb.org
>    RFC v1 -> RFC v2:
>     - Remove all `.global _start` for all build (GCC and Clang) instead of
>       removing all `.weak _start` for clang build (Comment from Willy).
> ---
>  tools/include/nolibc/arch-aarch64.h | 1 -
>  tools/include/nolibc/arch-arm.h     | 1 -
>  tools/include/nolibc/arch-i386.h    | 1 -
>  tools/include/nolibc/arch-mips.h    | 1 -
>  tools/include/nolibc/arch-riscv.h   | 1 -
>  tools/include/nolibc/arch-x86_64.h  | 1 -
>  6 files changed, 6 deletions(-)
>
> diff --git a/tools/include/nolibc/arch-aarch64.h b/tools/include/nolibc/arch-aarch64.h
> index 87d9e434820c..2dbd80d633cb 100644
> --- a/tools/include/nolibc/arch-aarch64.h
> +++ b/tools/include/nolibc/arch-aarch64.h
> @@ -184,7 +184,6 @@ struct sys_stat_struct {
>  /* startup code */
>  asm(".section .text\n"
>      ".weak _start\n"
> -    ".global _start\n"
>      "_start:\n"
>      "ldr x0, [sp]\n"              // argc (x0) was in the stack
>      "add x1, sp, 8\n"             // argv (x1) = sp
> diff --git a/tools/include/nolibc/arch-arm.h b/tools/include/nolibc/arch-arm.h
> index 001a3c8c9ad5..1191395b5acd 100644
> --- a/tools/include/nolibc/arch-arm.h
> +++ b/tools/include/nolibc/arch-arm.h
> @@ -177,7 +177,6 @@ struct sys_stat_struct {
>  /* startup code */
>  asm(".section .text\n"
>      ".weak _start\n"
> -    ".global _start\n"
>      "_start:\n"
>  #if defined(__THUMBEB__) || defined(__THUMBEL__)
>      /* We enter here in 32-bit mode but if some previous functions were in
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index d7e4d53325a3..125a691fc631 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -176,7 +176,6 @@ struct sys_stat_struct {
>   */
>  asm(".section .text\n"
>      ".weak _start\n"
> -    ".global _start\n"
>      "_start:\n"
>      "pop %eax\n"                // argc   (first arg, %eax)
>      "mov %esp, %ebx\n"          // argv[] (second arg, %ebx)
> diff --git a/tools/include/nolibc/arch-mips.h b/tools/include/nolibc/arch-mips.h
> index c9a6aac87c6d..1a124790c99f 100644
> --- a/tools/include/nolibc/arch-mips.h
> +++ b/tools/include/nolibc/arch-mips.h
> @@ -192,7 +192,6 @@ struct sys_stat_struct {
>  asm(".section .text\n"
>      ".weak __start\n"
>      ".set nomips16\n"
> -    ".global __start\n"
>      ".set    noreorder\n"
>      ".option pic0\n"
>      ".ent __start\n"
> diff --git a/tools/include/nolibc/arch-riscv.h b/tools/include/nolibc/arch-riscv.h
> index bc10b7b5706d..511d67fc534e 100644
> --- a/tools/include/nolibc/arch-riscv.h
> +++ b/tools/include/nolibc/arch-riscv.h
> @@ -185,7 +185,6 @@ struct sys_stat_struct {
>  /* startup code */
>  asm(".section .text\n"
>      ".weak _start\n"
> -    ".global _start\n"
>      "_start:\n"
>      ".option push\n"
>      ".option norelax\n"
> diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
> index a7b70ea51b68..84c174181425 100644
> --- a/tools/include/nolibc/arch-x86_64.h
> +++ b/tools/include/nolibc/arch-x86_64.h
> @@ -199,7 +199,6 @@ struct sys_stat_struct {
>   */
>  asm(".section .text\n"
>      ".weak _start\n"
> -    ".global _start\n"
>      "_start:\n"
>      "pop %rdi\n"                // argc   (first arg, %rdi)
>      "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
> --
> Ammar Faizi
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 17:09   ` Nick Desaulniers
@ 2022-03-22 17:25     ` Willy Tarreau
  2022-03-22 17:30       ` Nick Desaulniers
  0 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 17:25 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Ammar Faizi, Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

Hi Nick,

On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> On Tue, Mar 22, 2022 at 3:21 AM Ammar Faizi <ammarfaizi2@gnuweeb.org> wrote:
> >
> > Building with clang yields the following error:
> > ```
> >   <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
> >   .global _start
> >   ^
> >   1 error generated.
> > ```
> > Make sure to specify only one of `.global _start` and `.weak _start`.
> > Remove `.global _start`.
> 
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> 
> Yes, symbols should either be `.weak` or `.global`. The warning from
> Clang's integrated assembler is meant to flush out funny business.
> 
> I assume there's a good reason _why_ _start is weak and not strong?

Yes, the issue appears when you start to build programs made of more than
one C file. That's why we have a few weak symbols here and there (others
like errno are static and the lack of inter-unit portability is assumed).
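
The same effect can be sketched at the C level. `demo_banner` is a
made-up function standing in for a symbol that a header defines and that
several .c files end up including:

```c
/* Imagine this definition living in a header pulled in by both a.c and
 * b.c. As a normal (strong) definition, the link would fail with a
 * duplicate symbol error. Marked weak, every object file carries its
 * own copy and the linker silently keeps one of them. */
__attribute__((weak)) const char *demo_banner(void)
{
	return "nolibc-demo";
}
```

A single-file program behaves the same with or without the attribute;
the difference only shows up once a second unit includes the header.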

> Then again, I'm not familiar with nolibc.

No problem. The purpose is clearly *not* to implement a libc, but to have
something very lightweight that makes it possible to compile trivial
programs. A good example of this is
tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm personally using a
tiny pre-init shell that I always package with my kernels and that builds
with them [1]. It will never do big things but the balance between ease of
use and coding effort is pretty good in my experience. And I'm also careful
not to make it complicated to use nor to maintain: pragmatism is important
and the effort should remain on the program developer if some arbitration
is needed.

Regards,
Willy

[1] https://github.com/formilux/flxutils/tree/master/init


* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 17:25     ` Willy Tarreau
@ 2022-03-22 17:30       ` Nick Desaulniers
  2022-03-22 17:58         ` Willy Tarreau
  0 siblings, 1 reply; 44+ messages in thread
From: Nick Desaulniers @ 2022-03-22 17:30 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

(Moving folks to bcc; check the lists if you're interested)

On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Nick,
>
> On Tue, Mar 22, 2022 at 10:09:18AM -0700, Nick Desaulniers wrote:
> > Then again, I'm not familiar with nolibc.
>
> No problem. The purpose is clearly *not* to implement a libc, but to have
> something very lightweight that allows to compile trivial programs. A good
> example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> personally using a tiny pre-init shell that I always package with my
> kernels and that builds with them [1]. It will never do big things but
> the balance between ease of use and coding effort is pretty good in my
> experience. And I'm also careful not to make it complicated to use nor
> to maintain, pragmatism is important and the effort should remain on the
> program developer if some arbitration is needed.

Neat, I bet that helps generate very small initrd! Got any quick size
measurements?
-- 
Thanks,
~Nick Desaulniers


* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 17:30       ` Nick Desaulniers
@ 2022-03-22 17:58         ` Willy Tarreau
  2022-03-22 18:07           ` Nick Desaulniers
  0 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 17:58 UTC (permalink / raw)
  To: Nick Desaulniers; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> (Moving folks to bcc; check the lists if you're interested)

Yes, agreed :-)

> On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > The purpose is clearly *not* to implement a libc, but to have
> > something very lightweight that allows to compile trivial programs. A good
> > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > personally using a tiny pre-init shell that I always package with my
> > kernels and that builds with them [1]. It will never do big things but
> > the balance between ease of use and coding effort is pretty good in my
> > experience. And I'm also careful not to make it complicated to use nor
> > to maintain, pragmatism is important and the effort should remain on the
> > program developer if some arbitration is needed.
> 
> Neat, I bet that helps generate very small initrd! Got any quick size
> measurements?

Yep:

First, the usual static printf("hello world!\n"):

  $ ll hello-*libc
  -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
  -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*

  $ objdump -h hello-nolibc 
  hello-nolibc:     file format elf64-x86-64

  Sections:
  Idx Name          Size      VMA               LMA               File off  Algn
    0 .text         00000300  00000000004000b0  00000000004000b0  000000b0  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
    1 .rodata       00000015  00000000004003b0  00000000004003b0  000003b0  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, DATA

Then the preinit stuff:

  $ ll initramfs/init 
  -rwxr-xr-x 1 willy users 13936 Mar 22 18:40 initramfs/init*

  $ xz -c9 < initramfs/init | wc -c
  8392

  $ size initramfs/init 
     text    data     bss     dec     hex filename
    13348       0   23016   36364    8e0c init

  $ objdump -h initramfs/init
  initramfs/init:     file format elf64-x86-64
  Sections:
  Idx Name          Size      VMA               LMA               File off  Algn
    0 .text         00002b74  00000000004000e8  00000000004000e8  000000e8  2**0
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
    1 .rodata       000008b0  0000000000402c60  0000000000402c60  00002c60  2**5
                    CONTENTS, ALLOC, LOAD, READONLY, DATA
    2 .bss          000059e8  0000000000404520  0000000000404520  00003520  2**5
                    ALLOC

This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
a tar extractor, multi-level braces, and boolean expression evaluation,
variable expansion, and a config file parser to script all this. The code
is 20 years old and is really ugly (even uglier than you think). But that
gives an idea. 20 years ago the init was much simpler and 800 bytes (my
constraint was for single floppies containing kernel+rootfs) and strings
were manually merged by tails and put in .text to drop .rodata.
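
To make "merged by tails" concrete, here is a minimal sketch (not the actual
preinit code; the demo_* names are made up): when one command name is a
suffix of another, only the longer string is stored and the shorter one is
just a pointer into it.

```c
#include <assert.h>
#include <string.h>

/* Only "unmount" is stored (8 bytes including the terminating NUL). */
static const char demo_unmount[] = "unmount";

/* "mount" is the tail of "unmount": skip the first two characters. */
static const char *const demo_mount = demo_unmount + 2;

/* Returns 1 when cmd is one of the two names sharing the storage. */
static int demo_is_mount_cmd(const char *cmd)
{
	assert(cmd != NULL);
	return strcmp(cmd, demo_mount) == 0 ||
	       strcmp(cmd, demo_unmount) == 0;
}
```

Both names remain usable as ordinary NUL-terminated strings, and the six
bytes a separate "mount" would need are saved.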

You'll also note that there's 0 data segment above. That used to be
convenient to further shrink programs, but these days given how linkers
arrange segments by permissions that doesn't save as much as it used to,
and it's likely that at some point I'll assume that there must be some
variables by default (errno, environ, etc) and that we'll accept investing
a few extra tens of bytes by default for more convenience.

Cheers,
Willy


* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 17:58         ` Willy Tarreau
@ 2022-03-22 18:07           ` Nick Desaulniers
  2022-03-22 18:24             ` Willy Tarreau
  0 siblings, 1 reply; 44+ messages in thread
From: Nick Desaulniers @ 2022-03-22 18:07 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 10:58 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> > On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@1wt.eu> wrote:
> > > The purpose is clearly *not* to implement a libc, but to have
> > > something very lightweight that allows to compile trivial programs. A good
> > > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > > personally using a tiny pre-init shell that I always package with my
> > > kernels and that builds with them [1]. It will never do big things but
> > > the balance between ease of use and coding effort is pretty good in my
> > > experience. And I'm also careful not to make it complicated to use nor
> > > to maintain, pragmatism is important and the effort should remain on the
> > > program developer if some arbitration is needed.
> >
> > Neat, I bet that helps generate very small initrd! Got any quick size
> > measurements?
>
> Yep:
>
> First, the usual static printf("hello world!\n"):
>
>   $ ll hello-*libc
>   -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
>   -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*

! What! Are those both statically linked?

> This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> a tar extractor, multi-level braces, and boolean expression evaluation,
> variable expansion, and a config file parser to script all this. The code
> is 20 years old and is really ugly (even uglier than you think). But that
> gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> constraint was for single floppies containing kernel+rootfs) and strings
> were manually merged by tails and put in .text to drop .rodata.

Oh, so nolibc has been around for a while then?

ld.lld will do string merging in that fashion at -O2 (the linker can
accept an optimization level).  I did have a kernel patch for that
somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...

I guess the tradeoff with strings in .text is that now the strings
themselves are r+x and not just r?

>
> You'll also note that there's 0 data segment above. That used to be
> convenient to further shrink programs, but these days given how linkers
> arrange segments by permissions that doesn't save as much as it used to,
> and it's likely that at some points I'll assume that there must be some
> variables by default (errno, environ, etc) and that we'll accept to invest
> a few extra tens of bytes by default for more convenience.

Thanks for the measurements.
-- 
Thanks,
~Nick Desaulniers


* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 18:07           ` Nick Desaulniers
@ 2022-03-22 18:24             ` Willy Tarreau
  2022-03-22 18:38               ` Nick Desaulniers
  0 siblings, 1 reply; 44+ messages in thread
From: Willy Tarreau @ 2022-03-22 18:24 UTC (permalink / raw)
  To: Nick Desaulniers; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 11:07:17AM -0700, Nick Desaulniers wrote:
> > First, the usual static printf("hello world!\n"):
> >
> >   $ ll hello-*libc
> >   -rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
> >   -rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*
> 
> ! What! Are those both statically linked?

Yes:

  $ file hello-nolibc 
  hello-nolibc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

  (rebuilding without stripping)
  $ nm --size hello-nolibc 
  000000000000000f T main
  0000000000000053 t u64toa_r
  0000000000000280 t printf.constprop.0
  $ nm hello-nolibc 
  00000000004013c5 R __bss_start
  00000000004013c5 R _edata
  00000000004013c8 R _end
  00000000004000bf W _start
  00000000004000b0 T main
  0000000000400130 t printf.constprop.0
  00000000004000dd t u64toa_r

> > This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
> > a tar extractor, multi-level braces, and boolean expression evaluation,
> > variable expansion, and a config file parser to script all this. The code
> > is 20 years old and is really ugly (even uglier than you think). But that
> > gives an idea. 20 years ago the init was much simpler and 800 bytes (my
> > constraint was for single floppies containing kernel+rootfs) and strings
> > were manually merged by tails and put in .text to drop .rodata.
> 
> Oh, so nolibc has been around for a while then?

Not exactly. Over time I collected some of my stuff out of preinit to
make more reusable code for other tools, and eventually created a separate
project for it 5 years ago [1]. I then changed my mind a few times on how
to arrange all this and over time it became a bit easier to use. One day
Paul asked how to make less invasive static binaries for rcutorture and I
found that it was the perfect match so we agreed to integrate it there. It
was still a single file by then. And as usual when some code starts to get
more exposure it receives more contribs and feature requests ;-)

> ld.lld will do string merging in that fashion at -O2 (the linker can
> accept an optimization level).  I did have a kernel patch for that
> somewhere, need to update it for CC_OPTIMIZE_FOR_SIZE...

Ah I didn't know, that's good to know!

> I guess the tradeoff with strings in .text is that now the strings
> themselves are r+x and not just r?

Yes but when you're writing a small shell to allow you to manually
mount your rootfs from the kernel, you don't really care if someone
might try to use some of your strings as code gadgets for ROP exploits :-)

I would really not want to see this used for general programs, but it
does fit well with hacking stuff for initramfs, and what lies in the
selftests directory in general I guess.

What I particularly like is that I don't need a full toolchain, so if
I can build a kernel with the bare-metal compilers from kernel.org then
I know I can also build my initramfs that's packaged in it using the
exact same compiler. This significantly simplifies the build process.

Willy

[1] https://github.com/wtarreau/nolibc


* Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code
  2022-03-22 18:24             ` Willy Tarreau
@ 2022-03-22 18:38               ` Nick Desaulniers
  0 siblings, 0 replies; 44+ messages in thread
From: Nick Desaulniers @ 2022-03-22 18:38 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Linux Kernel Mailing List, GNU/Weeb Mailing List, llvm

On Tue, Mar 22, 2022 at 11:24 AM Willy Tarreau <w@1wt.eu> wrote:
>
> What I particularly like is that I don't need a full toolchain, so if
> I can build a kernel with the bare-metal compilers from kernel.org then
> I know I can also build my initramfs that's packaged in it using the
> exact same compiler. This significantly simplifies the build process.

Neat; yeah that coincides a bit with my interest in having builds of
llvm on kernel.org; having/needing a libc is a PITA and building a
full cross toolchain is also more difficult than I think it needs to
be.  The libc will depend on kernel headers, for each target.  LLVM
currently has a WIP libc in its tree; I'm looking for something I can
statically link into the toolchain images (even LTO them into the
image).  Will probably pursue musl (if I ever get time for this,
though maybe a project for my summer intern...).

One thing I've been looking at is a utility called llvm-ifs [1]; it
can generate .so stubs from a textual description that can be more
easily read, diff'ed, and committed. These are much faster to build
and reduce the chain of build dependencies (when dynamically linking).
Last I checked it had issues with versioned symbols, and I'm not sure
if/what it does for headers, which are still needed.  Within Android,
libabigail is being used to dump+diff xml descriptions of parts of an
ABI; it looks like llvm-ifs might be useful for that as well.  Not
sure if it's interesting but thought I'd share.

[1] https://www.youtube.com/watch?v=_pIorUFavc8
-- 
Thanks,
~Nick Desaulniers


* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-22 13:37         ` David Laight
  2022-03-22 14:47           ` Alviro Iskandar Setiawan
@ 2022-03-23  6:29           ` Ammar Faizi
  2022-03-23  6:32             ` Ammar Faizi
  2022-03-23  7:10             ` Willy Tarreau
  1 sibling, 2 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-23  6:29 UTC (permalink / raw)
  To: David Laight, 'Willy Tarreau'
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

On 3/22/22 8:37 PM, David Laight wrote:
> dunno, 'asm' register variables are rather more horrid and
> should probably only be used (for asm statements) when there aren't
> suitable register constraints.
> 
> (I'm sure there is a comment about that in the gcc docs.)

^ Hey David, yes you're right, that is very interesting...

I hit a GCC bug when playing with syscall6() implementation here.

Using register variables for all inputs of syscall6() causes GCC 11.2 to
get stuck in an endless loop with 100% CPU usage. This is reproducible with
several versions of GCC.

In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
Compiler Error):
```
   <source>: In function '__sys_mmap':
   <source>:35:1: error: unable to find a register to spill
    }
    ^
   <source>:35:1: error: this is the insn:
   (insn 14 13 30 2 (set (reg:SI 95 [92])
           (mem/c:SI (plus:SI (reg/f:SI 16 argp)
                   (const_int 28 [0x1c])) [1 offset+0 S4 A32])) <source>:33 86 {*movsi_internal}
        (expr_list:REG_DEAD (reg:SI 16 argp)
           (nil)))
   <source>:35: confused by earlier errors, bailing out
   Compiler returned: 1
```
See the full show here: https://godbolt.org/z/dYeKaYWY3

With the appropriate constraints it compiles nicely. It now looks
like this:
```
   #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
   ({								\
   	long _eax  = (long)(num);				\
   	long _arg6 = (long)(arg6); /* Always be in memory */	\
   	asm volatile (						\
   		"pushl	%[_arg6]\n\t"				\
   		"pushl	%%ebp\n\t"				\
   		"movl	4(%%esp), %%ebp\n\t"			\
   		"int	$0x80\n\t"				\
   		"popl	%%ebp\n\t"				\
   		"addl	$4,%%esp\n\t"				\
   		: "+a"(_eax)		/* %eax */		\
   		: "b"(arg1),		/* %ebx */		\
   		  "c"(arg2),		/* %ecx */		\
   		  "d"(arg3),		/* %edx */		\
   		  "S"(arg4),		/* %esi */		\
   		  "D"(arg5),		/* %edi */		\
   		  [_arg6]"m"(_arg6)	/* memory */		\
   		: "memory", "cc"				\
   	);							\
   	_eax;							\
   })
```
Link: https://godbolt.org/z/ozGbYWbPY
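
For illustration, this is roughly how the macro could be used to map
anonymous memory. It is a sketch under assumptions: demo_mmap_anon() and
DEMO_NR_MMAP are made-up names, the headers come from a regular libc so the
constants are available (nolibc itself would not include them), x86 Linux
is assumed, and outside i386 the macro falls back to libc's syscall() so the
example still builds there. On i386 the call goes through mmap2, whose
offset is counted in 4096-byte units.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#if defined(__i386__)
/* The macro from the message above, spelled __asm__ for strict ISO modes. */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
({								\
	long _eax  = (long)(num);				\
	long _arg6 = (long)(arg6); /* Always be in memory */	\
	__asm__ volatile (					\
		"pushl	%[_arg6]\n\t"				\
		"pushl	%%ebp\n\t"				\
		"movl	4(%%esp), %%ebp\n\t"			\
		"int	$0x80\n\t"				\
		"popl	%%ebp\n\t"				\
		"addl	$4,%%esp\n\t"				\
		: "+a"(_eax)		/* %eax */		\
		: "b"(arg1),		/* %ebx */		\
		  "c"(arg2),		/* %ecx */		\
		  "d"(arg3),		/* %edx */		\
		  "S"(arg4),		/* %esi */		\
		  "D"(arg5),		/* %edi */		\
		  [_arg6]"m"(_arg6)	/* memory */		\
		: "memory", "cc"				\
	);							\
	_eax;							\
})
#define DEMO_NR_MMAP SYS_mmap2	/* offset in 4096-byte units */
#else
/* Assumption: anywhere else, let libc's syscall() issue the call. */
#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)		\
	syscall((long)(num), (long)(arg1), (long)(arg2), (long)(arg3),	\
		(long)(arg4), (long)(arg5), (long)(arg6))
#define DEMO_NR_MMAP SYS_mmap
#endif

/* Hypothetical helper: map len bytes of private anonymous memory. */
static void *demo_mmap_anon(size_t len)
{
	long ret;

	assert(len > 0);
	ret = my_syscall6(DEMO_NR_MMAP, 0, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/*
	 * int $0x80 hands back -errno in %eax while syscall() returns -1;
	 * both land in the last 4095 values of the address range.
	 */
	if ((unsigned long)ret >= (unsigned long)-4095L)
		return MAP_FAILED;
	return (void *)ret;
}
```

demo_mmap_anon(4096) then returns either a pointer to one zeroed page or
MAP_FAILED, and the mapping can be released with munmap() once it is no
longer needed.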

Will use that in the next patchset version.

-- 
Ammar Faizi



* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-23  6:29           ` Ammar Faizi
@ 2022-03-23  6:32             ` Ammar Faizi
  2022-03-23  7:10             ` Willy Tarreau
  1 sibling, 0 replies; 44+ messages in thread
From: Ammar Faizi @ 2022-03-23  6:32 UTC (permalink / raw)
  To: David Laight, 'Willy Tarreau'
  Cc: Paul E. McKenney, Alviro Iskandar Setiawan, Nugraha,
	Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, llvm

I have reported this bug to GNU people.
   
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

-- 
Ammar Faizi


* Re: [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments
  2022-03-23  6:29           ` Ammar Faizi
  2022-03-23  6:32             ` Ammar Faizi
@ 2022-03-23  7:10             ` Willy Tarreau
  1 sibling, 0 replies; 44+ messages in thread
From: Willy Tarreau @ 2022-03-23  7:10 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: David Laight, Paul E. McKenney, Alviro Iskandar Setiawan,
	Nugraha, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86,
	llvm

On Wed, Mar 23, 2022 at 01:29:39PM +0700, Ammar Faizi wrote:
> On 3/22/22 8:37 PM, David Laight wrote:
> > dunno, 'asm' register variables are rather more horrid and
> > should probably only be used (for asm statements) when there aren't
> > suitable register constraints.
> > 
> > (I'm sure there is a comment about that in the gcc docs.)
> 
> ^ Hey David, yes you're right, that is very interesting...
> 
> I hit a GCC bug when playing with syscall6() implementation here.
> 
> Using register variables for all inputs of syscall6() causes GCC 11.2 to
> get stuck in an endless loop with 100% CPU usage. This is reproducible with
> several versions of GCC.
> 
> In GCC 6.3, the syscall6() implementation above yields an ICE (Internal
> Compiler Error):
> ```
>   <source>: In function '__sys_mmap':
>   <source>:35:1: error: unable to find a register to spill

Now I'm pretty sure that it was the issue I faced when trying long ago,
I remember this error message before I found it wiser to give up.

Willy


Thread overview: 44+ messages
2022-03-22 10:21 [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Ammar Faizi
2022-03-22 10:21 ` [RFC PATCH v2 1/8] tools/nolibc: x86-64: Update System V ABI document link Ammar Faizi
2022-03-22 10:21 ` [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code Ammar Faizi
2022-03-22 17:09   ` Nick Desaulniers
2022-03-22 17:25     ` Willy Tarreau
2022-03-22 17:30       ` Nick Desaulniers
2022-03-22 17:58         ` Willy Tarreau
2022-03-22 18:07           ` Nick Desaulniers
2022-03-22 18:24             ` Willy Tarreau
2022-03-22 18:38               ` Nick Desaulniers
2022-03-22 10:21 ` [RFC PATCH v2 3/8] tools/nolibc: i386: Implement syscall with 6 arguments Ammar Faizi
2022-03-22 10:57   ` David Laight
2022-03-22 11:23     ` Willy Tarreau
2022-03-22 11:39   ` David Laight
2022-03-22 12:02     ` Ammar Faizi
2022-03-22 12:07       ` Ammar Faizi
2022-03-22 12:13       ` Willy Tarreau
2022-03-22 13:26         ` Ammar Faizi
2022-03-22 13:34           ` Willy Tarreau
2022-03-22 13:37             ` Ammar Faizi
2022-03-22 13:39               ` David Laight
2022-03-22 13:41                 ` Willy Tarreau
2022-03-22 13:45                   ` Ammar Faizi
2022-03-22 13:54                     ` Ammar Faizi
2022-03-22 13:56                       ` Ammar Faizi
2022-03-22 14:02                         ` Willy Tarreau
2022-03-22 13:37         ` David Laight
2022-03-22 14:47           ` Alviro Iskandar Setiawan
2022-03-22 15:11             ` David Laight
2022-03-23  6:29           ` Ammar Faizi
2022-03-23  6:32             ` Ammar Faizi
2022-03-23  7:10             ` Willy Tarreau
2022-03-22 10:21 ` [RFC PATCH v2 4/8] tools/nolibc/sys: Implement `mmap()` and `munmap()` Ammar Faizi
2022-03-22 10:21 ` [RFC PATCH v2 5/8] tools/nolibc/types: Implement `offsetof()` and `container_of()` macro Ammar Faizi
2022-03-22 10:21 ` [RFC PATCH v2 6/8] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()` Ammar Faizi
2022-03-22 11:52   ` David Laight
2022-03-22 12:18     ` Ammar Faizi
2022-03-22 12:36       ` Alviro Iskandar Setiawan
2022-03-22 12:42         ` Ammar Faizi
2022-03-22 12:21     ` Willy Tarreau
2022-03-22 10:21 ` [RFC PATCH v2 7/8] tools/nolibc/string: Implement `strnlen()` Ammar Faizi
2022-03-22 10:21 ` [RFC PATCH v2 8/8] tools/include/string: Implement `strdup()` and `strndup()` Ammar Faizi
2022-03-22 11:27 ` [RFC PATCH v2 0/8] Add dynamic memory allocator support for nolibc Willy Tarreau
2022-03-22 12:43   ` Ammar Faizi
