All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support
@ 2023-07-18 21:09 Zhangjin Wu
  2023-07-18 21:10 ` [PATCH v1 1/8] tools/nolibc: add support for powerpc Zhangjin Wu
                   ` (8 more replies)
  0 siblings, 9 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:09 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

Hi, Willy

Here is the powerpc support, includes 32-bit big-endian powerpc, 64-bit
little endian and big endian powerpc.

All of them passes run-user with the default powerpc toolchain from
Ubuntu 20.04:

    $ make run-user DEFCONFIG=tinyconfig XARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- | grep status
    165 test(s): 157 passed,   8 skipped,   0 failed => status: warning
    $ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu- | grep status
    165 test(s): 157 passed,   8 skipped,   0 failed => status: warning
    $ make run-user DEFCONFIG=tinyconfig XARCH=powerpc64le CROSS_COMPILE=powerpc64le-linux-gnu- | grep status
    165 test(s): 157 passed,   8 skipped,   0 failed => status: warning

For the slow 'run' target, I have run with defconfig before, and just
verified them via the fast tinyconfig + run with a new patch from next
patchset, all of them passes:

    165 test(s): 156 passed,   9 skipped,   0 failed => status: warning

Note, the big endian crosstool powerpc64-linux-gcc from
https://mirrors.edge.kernel.org/pub/tools/crosstool/ has been used to
test both little endian and big endian powerpc64 too, both passed.

Here simply explain what they are:

* tools/nolibc: add support for powerpc
  tools/nolibc: add support for powerpc64

    32-bit & 64-bit powerpc support of nolibc.

* selftests/nolibc: select_null: fix up for big endian powerpc64

    fix up a test case for big endian powerpc64.

* selftests/nolibc: add extra config file customize support

    add extconfig target to allow enable extra config options via
    configs/<ARCH>.config

    applied suggestion from Thomas to use config files instead of config
    lines.

* selftests/nolibc: add XARCH and ARCH mapping support

    applied suggestions from Willy, use XARCH as the input of our nolibc
    test, use ARCH as the pure kernel input, at last build the mapping
    between XARCH and ARCH.

    Customize the variables via the input XARCH.

* selftests/nolibc: add test support for powerpc

    Require to use extconfig to enable the console options specified in
    configs/powerpc.config

    currently, we should manually run extconfig after defconfig, in next
    patchset, we will do this automatically.

* selftests/nolibc: add test support for powerpc64le
  selftests/nolibc: add test support for powerpc64

    Very simple, but customize CFLAGS carefully to let them work with
    powerpc64le-linux-gnu-gcc (from Linux distributions) and
    powerpc64-linux-gcc (from mirrors.edge.kernel.org)

The next patchset will not be tinyconfig, but some prepare patches, will
be sent out soon.

Best regards,
Zhangjin
---

Zhangjin Wu (8):
  tools/nolibc: add support for powerpc
  tools/nolibc: add support for powerpc64
  selftests/nolibc: select_null: fix up for big endian powerpc64
  selftests/nolibc: add extra config file customize support
  selftests/nolibc: add XARCH and ARCH mapping support
  selftests/nolibc: add test support for powerpc
  selftests/nolibc: add test support for powerpc64le
  selftests/nolibc: add test support for powerpc64

 tools/include/nolibc/arch-powerpc.h           | 170 ++++++++++++++++++
 tools/testing/selftests/nolibc/Makefile       |  55 ++++--
 .../selftests/nolibc/configs/powerpc.config   |   3 +
 tools/testing/selftests/nolibc/nolibc-test.c  |   2 +-
 4 files changed, 217 insertions(+), 13 deletions(-)
 create mode 100644 tools/include/nolibc/arch-powerpc.h
 create mode 100644 tools/testing/selftests/nolibc/configs/powerpc.config

-- 
2.25.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
@ 2023-07-18 21:10 ` Zhangjin Wu
  2023-07-23  7:32   ` Thomas Weißschuh
  2023-07-18 21:11 ` [PATCH v1 2/8] tools/nolibc: add support for powerpc64 Zhangjin Wu
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:10 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

Both syscall declarations and _start code definition are added for
powerpc to nolibc.

Like mips, powerpc uses a register (exactly, the summary overflow bit)
to record the error occurred, and uses another register to return the
value [1]. So, the return value of every syscall declaration must be
normalized to easier the __sysret helper, return -value when there is an
error, otheriwse, return value directly.

Glibc and musl use different methods to check the summary overflow bit,
glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register
to r0 at first, and then check the summary overflow bit in cr0:

    mfcr r0
    r0 & (1 << 28) ? -r3 : r3

    -->

    10003c14:       7c 00 00 26     mfcr    r0
    10003c18:       74 09 10 00     andis.  r9,r0,4096
    10003c1c:       41 82 00 08     beq     0x10003c24
    10003c20:       7c 63 00 d0     neg     r3,r3

Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow
bit with the 'bns' instruction:

    /* no summary overflow bit means no error, return value directly */
    bns+ 1f
    /* otherwise, return negated value */
    neg r3, r3
    1:

    -->

    10000418:       40 a3 00 08     bns     0x10000420
    1000041c:       7c 63 00 d0     neg     r3,r3

The later one is smaller, here applies it.

arch/powerpc/include/asm/vdso/gettimeofday.h file uses the smaller
method for do_syscall_2() too.

[1]: https://man7.org/linux/man-pages/man2/syscall.2.html

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)
 create mode 100644 tools/include/nolibc/arch-powerpc.h

diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
new file mode 100644
index 000000000000..100ec0f412dc
--- /dev/null
+++ b/tools/include/nolibc/arch-powerpc.h
@@ -0,0 +1,156 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * PowerPC specific definitions for NOLIBC
+ * Copyright (C) 2023 Zhangjin Wu <falcon@tinylab.org>
+ */
+
+#ifndef _NOLIBC_ARCH_POWERPC_H
+#define _NOLIBC_ARCH_POWERPC_H
+
+#include "compiler.h"
+#include "crt.h"
+
+/* Syscalls for PowerPC :
+ *   - stack is 16-byte aligned
+ *   - syscall number is passed in r0
+ *   - arguments are in r3, r4, r5, r6, r7, r8, r9
+ *   - the system call is performed by calling "sc"
+ *   - syscall return comes in r3, and the summary overflow bit is checked
+ *     to know if an error occurred, in which case errno is in r3.
+ *   - the arguments are cast to long and assigned into the target
+ *     registers which are then simply passed as registers to the asm code,
+ *     so that we don't have to experience issues with register constraints.
+ */
+
+#define _NOLIBC_SYSCALL_CLOBBERLIST \
+	"memory", "cr0", "r9", "r10", "r11", "r12"
+
+#define my_syscall0(num)                                                     \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3");                                     \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "=r"(r3)                                         \
+		:: "r4", "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+#define my_syscall1(num, arg1)                                               \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3)                                         \
+		:: "r4", "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+
+#define my_syscall2(num, arg1, arg2)                                         \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+	register long r4 __asm__ ("r4") = (long)(arg2);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3),                                        \
+		  "+r"(r4)                                                   \
+		:: "r5", "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST       \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+
+#define my_syscall3(num, arg1, arg2, arg3)                                   \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+	register long r4 __asm__ ("r4") = (long)(arg2);                      \
+	register long r5 __asm__ ("r5") = (long)(arg3);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3),                                        \
+		  "+r"(r4), "+r"(r5)                                         \
+		:: "r6", "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST             \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+
+#define my_syscall4(num, arg1, arg2, arg3, arg4)                             \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+	register long r4 __asm__ ("r4") = (long)(arg2);                      \
+	register long r5 __asm__ ("r5") = (long)(arg3);                      \
+	register long r6 __asm__ ("r6") = (long)(arg4);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3),                                        \
+		  "+r"(r4), "+r"(r5), "+r"(r6)                               \
+		:: "r7", "r8", _NOLIBC_SYSCALL_CLOBBERLIST                   \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+
+#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)                       \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+	register long r4 __asm__ ("r4") = (long)(arg2);                      \
+	register long r5 __asm__ ("r5") = (long)(arg3);                      \
+	register long r6 __asm__ ("r6") = (long)(arg4);                      \
+	register long r7 __asm__ ("r7") = (long)(arg5);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3),                                        \
+		  "+r"(r4), "+r"(r5), "+r"(r6), "+r"(r7)                     \
+		:: "r8", _NOLIBC_SYSCALL_CLOBBERLIST                         \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
+({                                                                           \
+	register long r0 __asm__ ("r0") = (num);                             \
+	register long r3 __asm__ ("r3") = (long)(arg1);                      \
+	register long r4 __asm__ ("r4") = (long)(arg2);                      \
+	register long r5 __asm__ ("r5") = (long)(arg3);                      \
+	register long r6 __asm__ ("r6") = (long)(arg4);                      \
+	register long r7 __asm__ ("r7") = (long)(arg5);                      \
+	register long r8 __asm__ ("r8") = (long)(arg6);                      \
+									     \
+	__asm__ volatile (                                                   \
+		"sc; bns+ 1f; neg %1, %1; 1:\n"                              \
+		: "+r"(r0), "+r"(r3),                                        \
+		  "+r"(r4), "+r"(r5), "+r"(r6), "+r"(r7), "+r"(r8)           \
+		:: _NOLIBC_SYSCALL_CLOBBERLIST                               \
+	);                                                                   \
+	r3;                                                                  \
+})
+
+/* startup code */
+void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
+{
+	__asm__ volatile (
+		"mr     3, 1\n"         /* save stack pointer to r3, as arg1 of _start_c */
+		"clrrwi 1, 1, 4\n"      /* align the stack to 16 bytes                   */
+		"li     0, 0\n"         /* zero the frame pointer                        */
+		"stwu   1, -16(1)\n"    /* the initial stack frame                       */
+		"bl     _start_c\n"     /* transfer to c runtime                         */
+	);
+	__builtin_unreachable();
+}
+
+#endif /* _NOLIBC_ARCH_POWERPC_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 2/8] tools/nolibc: add support for powerpc64
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
  2023-07-18 21:10 ` [PATCH v1 1/8] tools/nolibc: add support for powerpc Zhangjin Wu
@ 2023-07-18 21:11 ` Zhangjin Wu
  2023-07-18 21:13 ` [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64 Zhangjin Wu
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:11 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

This follows the 64-bit PowerPC ABI [1], refers to the slides: "A new
ABI for little-endian PowerPC64 Design & Implementation" [2] and the
musl code in arch/powerpc64/crt_arch.h.

Firstly, stdu and clrrdi are used instead of stwu and clrrwi for
powerpc64.

Second, the stack frame size is increased to 32 bytes for powerpc64, 32
bytes is the minimal stack frame size supported described in [2].

Besides, the TOC pointer (GOT pointer) must be saved to r2.

This works on both little endian and big endian 64-bit PowerPC.

[1]: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
[2]: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/include/nolibc/arch-powerpc.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
index 100ec0f412dc..7b28ebcfcc23 100644
--- a/tools/include/nolibc/arch-powerpc.h
+++ b/tools/include/nolibc/arch-powerpc.h
@@ -143,6 +143,19 @@
 /* startup code */
 void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
 {
+#ifdef __powerpc64__
+	/* On 64-bit PowerPC, save TOC/GOT pointer to r2 */
+	extern char TOC __asm__ (".TOC.");
+	register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
+
+	__asm__ volatile (
+		"mr     3, 1\n"         /* save stack pointer to r3, as arg1 of _start_c */
+		"clrrdi 1, 1, 4\n"      /* align the stack to 16 bytes                   */
+		"li     0, 0\n"         /* zero the frame pointer                        */
+		"stdu   1, -32(1)\n"    /* the initial stack frame                       */
+		"bl     _start_c\n"     /* transfer to c runtime                         */
+	);
+#else
 	__asm__ volatile (
 		"mr     3, 1\n"         /* save stack pointer to r3, as arg1 of _start_c */
 		"clrrwi 1, 1, 4\n"      /* align the stack to 16 bytes                   */
@@ -150,6 +163,7 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
 		"stwu   1, -16(1)\n"    /* the initial stack frame                       */
 		"bl     _start_c\n"     /* transfer to c runtime                         */
 	);
+#endif
 	__builtin_unreachable();
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
  2023-07-18 21:10 ` [PATCH v1 1/8] tools/nolibc: add support for powerpc Zhangjin Wu
  2023-07-18 21:11 ` [PATCH v1 2/8] tools/nolibc: add support for powerpc64 Zhangjin Wu
@ 2023-07-18 21:13 ` Zhangjin Wu
  2023-07-18 22:17   ` Thomas Weißschuh
  2023-07-18 21:14 ` [PATCH v1 4/8] selftests/nolibc: add extra config file customize support Zhangjin Wu
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:13 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

The following error reported while running nolibc-test on the big endian
64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu
20.04.

    56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000]
    init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe
    init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000

Let's explicitly initialize all of the timeval members to zero.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 03b1d30f5507..ec2c7774522e 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -858,7 +858,7 @@ int run_syscall(int min, int max)
 		CASE_TEST(read_badf);         EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break;
 		CASE_TEST(rmdir_blah);        EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break;
 		CASE_TEST(sched_yield);       EXPECT_SYSZR(1, sched_yield()); break;
-		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
+		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
 		CASE_TEST(select_stdout);     EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break;
 		CASE_TEST(select_fault);      EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break;
 		CASE_TEST(stat_blah);         EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 4/8] selftests/nolibc: add extra config file customize support
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (2 preceding siblings ...)
  2023-07-18 21:13 ` [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64 Zhangjin Wu
@ 2023-07-18 21:14 ` Zhangjin Wu
  2023-07-22 12:00   ` Willy Tarreau
  2023-07-18 21:15 ` [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support Zhangjin Wu
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:14 UTC (permalink / raw)
  To: w
  Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest,
	Thomas Weißschuh

The default DEFCONFIG_<ARCH> may not always work for all architectures,
some architectures require to add extra kernel config options, this adds
a new 'extconfig' target for this requirement.

It allows to customize extra kernel config options via the common
common.config and the architecture specific <ARCH>.config, at last
trigger 'allnoconfig' to let them take effect with missing config
options as disabled.

The scripts/kconfig/merge_config.sh tool is used to merge the extra
config files.

Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/Makefile | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index f42adef87e12..08a5ca5f418b 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -39,6 +39,9 @@ DEFCONFIG_s390       = defconfig
 DEFCONFIG_loongarch  = defconfig
 DEFCONFIG            = $(DEFCONFIG_$(ARCH))
 
+# extra kernel config files under configs/, include common + architecture specific
+EXTCONFIG            = common.config $(ARCH).config
+
 # optional tests to run (default = all)
 TEST =
 
@@ -162,6 +165,10 @@ initramfs: nolibc-test
 defconfig:
 	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
 
+extconfig:
+	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
+	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
+
 kernel: initramfs
 	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) $(IMAGE_NAME) CONFIG_INITRAMFS_SOURCE=$(CURDIR)/initramfs
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (3 preceding siblings ...)
  2023-07-18 21:14 ` [PATCH v1 4/8] selftests/nolibc: add extra config file customize support Zhangjin Wu
@ 2023-07-18 21:15 ` Zhangjin Wu
  2023-07-22 12:03   ` Willy Tarreau
  2023-07-18 21:16 ` [PATCH v1 6/8] selftests/nolibc: add test support for powerpc Zhangjin Wu
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:15 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

To test the architectures not directly supported by kernel, let's add a
new XARCH as our own test input and map between XARCH and ARCH to make
sure pass a right ARCH to kernel for a XARCH input and also configure a
default XARCH for ARCH.

ARCH is a subset of XARCH, to test more architectures than the ARCH
variable directly supported by kernel, the old architecture specific
variables used by our test are converted to use XARCH instead of ARCH.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702171715.GD16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/Makefile | 34 +++++++++++++++----------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index 08a5ca5f418b..b17a82efe6de 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -14,6 +14,13 @@ include $(srctree)/scripts/subarch.include
 ARCH = $(SUBARCH)
 endif
 
+# XARCH and ARCH mapping
+# XARCH is specified by user
+XARCH           ?= $(or $(XARCH_$(ARCH)),$(ARCH))
+
+# ARCH is supported by kernel
+ARCH            := $(or $(ARCH_$(XARCH)),$(XARCH))
+
 # kernel image names by architecture
 IMAGE_i386       = arch/x86/boot/bzImage
 IMAGE_x86_64     = arch/x86/boot/bzImage
@@ -24,7 +31,7 @@ IMAGE_mips       = vmlinuz
 IMAGE_riscv      = arch/riscv/boot/Image
 IMAGE_s390       = arch/s390/boot/bzImage
 IMAGE_loongarch  = arch/loongarch/boot/vmlinuz.efi
-IMAGE            = $(IMAGE_$(ARCH))
+IMAGE            = $(IMAGE_$(XARCH))
 IMAGE_NAME       = $(notdir $(IMAGE))
 
 # default kernel configurations that appear to be usable
@@ -37,10 +44,10 @@ DEFCONFIG_mips       = malta_defconfig
 DEFCONFIG_riscv      = defconfig
 DEFCONFIG_s390       = defconfig
 DEFCONFIG_loongarch  = defconfig
-DEFCONFIG            = $(DEFCONFIG_$(ARCH))
+DEFCONFIG            = $(DEFCONFIG_$(XARCH))
 
 # extra kernel config files under configs/, include common + architecture specific
-EXTCONFIG            = common.config $(ARCH).config
+EXTCONFIG            = common.config $(XARCH).config
 
 # optional tests to run (default = all)
 TEST =
@@ -55,7 +62,7 @@ QEMU_ARCH_mips       = mipsel  # works with malta_defconfig
 QEMU_ARCH_riscv      = riscv64
 QEMU_ARCH_s390       = s390x
 QEMU_ARCH_loongarch  = loongarch64
-QEMU_ARCH            = $(QEMU_ARCH_$(ARCH))
+QEMU_ARCH            = $(QEMU_ARCH_$(XARCH))
 
 # QEMU_ARGS : some arch-specific args to pass to qemu
 QEMU_ARGS_i386       = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)"
@@ -67,7 +74,7 @@ QEMU_ARGS_mips       = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_riscv      = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_s390       = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_loongarch  = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
-QEMU_ARGS            = $(QEMU_ARGS_$(ARCH)) $(QEMU_ARGS_EXTRA)
+QEMU_ARGS            = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_EXTRA)
 
 # OUTPUT is only set when run from the main makefile, otherwise
 # it defaults to this nolibc directory.
@@ -84,7 +91,7 @@ CFLAGS_mips = -EL
 CFLAGS_STACKPROTECTOR ?= $(call cc-option,-mstack-protector-guard=global $(call cc-option,-fstack-protector-all))
 CFLAGS  ?= -Os -fno-ident -fno-asynchronous-unwind-tables -std=c89 \
 		$(call cc-option,-fno-stack-protector) \
-		$(CFLAGS_$(ARCH)) $(CFLAGS_STACKPROTECTOR)
+		$(CFLAGS_$(XARCH)) $(CFLAGS_STACKPROTECTOR)
 LDFLAGS := -s
 
 REPORT  ?= awk '/\[OK\][\r]*$$/{p++} /\[FAIL\][\r]*$$/{if (!f) printf("\n"); f++; print;} /\[SKIPPED\][\r]*$$/{s++} \
@@ -99,24 +106,25 @@ help:
 	@echo "  sysroot      create the nolibc sysroot here (uses \$$ARCH)"
 	@echo "  nolibc-test  build the executable (uses \$$CC and \$$CROSS_COMPILE)"
 	@echo "  libc-test    build an executable using the compiler's default libc instead"
-	@echo "  run-user     runs the executable under QEMU (uses \$$ARCH, \$$TEST)"
+	@echo "  run-user     runs the executable under QEMU (uses \$$XARCH, \$$TEST)"
 	@echo "  initramfs    prepare the initramfs with nolibc-test"
-	@echo "  defconfig    create a fresh new default config (uses \$$ARCH)"
-	@echo "  kernel       (re)build the kernel with the initramfs (uses \$$ARCH)"
-	@echo "  run          runs the kernel in QEMU after building it (uses \$$ARCH, \$$TEST)"
-	@echo "  rerun        runs a previously prebuilt kernel in QEMU (uses \$$ARCH, \$$TEST)"
+	@echo "  defconfig    create a fresh new default config (uses \$$XARCH)"
+	@echo "  kernel       (re)build the kernel with the initramfs (uses \$$XARCH)"
+	@echo "  run          runs the kernel in QEMU after building it (uses \$$XARCH, \$$TEST)"
+	@echo "  rerun        runs a previously prebuilt kernel in QEMU (uses \$$XARCH, \$$TEST)"
 	@echo "  clean        clean the sysroot, initramfs, build and output files"
 	@echo ""
 	@echo "The output file is \"run.out\". Test ranges may be passed using \$$TEST."
 	@echo ""
 	@echo "Currently using the following variables:"
 	@echo "  ARCH          = $(ARCH)"
+	@echo "  XARCH         = $(XARCH)"
 	@echo "  CROSS_COMPILE = $(CROSS_COMPILE)"
 	@echo "  CC            = $(CC)"
 	@echo "  OUTPUT        = $(OUTPUT)"
 	@echo "  TEST          = $(TEST)"
-	@echo "  QEMU_ARCH     = $(if $(QEMU_ARCH),$(QEMU_ARCH),UNKNOWN_ARCH) [determined from \$$ARCH]"
-	@echo "  IMAGE_NAME    = $(if $(IMAGE_NAME),$(IMAGE_NAME),UNKNOWN_ARCH) [determined from \$$ARCH]"
+	@echo "  QEMU_ARCH     = $(if $(QEMU_ARCH),$(QEMU_ARCH),UNKNOWN_ARCH) [determined from \$$XARCH]"
+	@echo "  IMAGE_NAME    = $(if $(IMAGE_NAME),$(IMAGE_NAME),UNKNOWN_ARCH) [determined from \$$XARCH]"
 	@echo ""
 
 all: run
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 6/8] selftests/nolibc: add test support for powerpc
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (4 preceding siblings ...)
  2023-07-18 21:15 ` [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support Zhangjin Wu
@ 2023-07-18 21:16 ` Zhangjin Wu
  2023-07-18 21:17 ` [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le Zhangjin Wu
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:16 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

The default qemu-system-ppc g3beige machine [1] is used to run 32-bit
powerpc kernel.

The pmac32_defconfig is used with extra PMACZILOG console options to
enable normal print.

Note, zImage doesn't boot due to "qemu-system-ppc: Some ROM regions are
overlapping" error, so, vmlinux is used instead.

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powermac.html

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/Makefile               | 4 ++++
 tools/testing/selftests/nolibc/configs/powerpc.config | 3 +++
 2 files changed, 7 insertions(+)
 create mode 100644 tools/testing/selftests/nolibc/configs/powerpc.config

diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index b17a82efe6de..9c375fab84e5 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -28,6 +28,7 @@ IMAGE_x86        = arch/x86/boot/bzImage
 IMAGE_arm64      = arch/arm64/boot/Image
 IMAGE_arm        = arch/arm/boot/zImage
 IMAGE_mips       = vmlinuz
+IMAGE_powerpc    = vmlinux
 IMAGE_riscv      = arch/riscv/boot/Image
 IMAGE_s390       = arch/s390/boot/bzImage
 IMAGE_loongarch  = arch/loongarch/boot/vmlinuz.efi
@@ -41,6 +42,7 @@ DEFCONFIG_x86        = defconfig
 DEFCONFIG_arm64      = defconfig
 DEFCONFIG_arm        = multi_v7_defconfig
 DEFCONFIG_mips       = malta_defconfig
+DEFCONFIG_powerpc    = pmac32_defconfig
 DEFCONFIG_riscv      = defconfig
 DEFCONFIG_s390       = defconfig
 DEFCONFIG_loongarch  = defconfig
@@ -59,6 +61,7 @@ QEMU_ARCH_x86        = x86_64
 QEMU_ARCH_arm64      = aarch64
 QEMU_ARCH_arm        = arm
 QEMU_ARCH_mips       = mipsel  # works with malta_defconfig
+QEMU_ARCH_powerpc    = ppc
 QEMU_ARCH_riscv      = riscv64
 QEMU_ARCH_s390       = s390x
 QEMU_ARCH_loongarch  = loongarch64
@@ -71,6 +74,7 @@ QEMU_ARGS_x86        = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(
 QEMU_ARGS_arm64      = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_arm        = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_mips       = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
+QEMU_ARGS_powerpc    = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_riscv      = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_s390       = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_loongarch  = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
diff --git a/tools/testing/selftests/nolibc/configs/powerpc.config b/tools/testing/selftests/nolibc/configs/powerpc.config
new file mode 100644
index 000000000000..b1975f8253f7
--- /dev/null
+++ b/tools/testing/selftests/nolibc/configs/powerpc.config
@@ -0,0 +1,3 @@
+CONFIG_SERIAL_PMACZILOG=y
+CONFIG_SERIAL_PMACZILOG_TTYS=y
+CONFIG_SERIAL_PMACZILOG_CONSOLE=y
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (5 preceding siblings ...)
  2023-07-18 21:16 ` [PATCH v1 6/8] selftests/nolibc: add test support for powerpc Zhangjin Wu
@ 2023-07-18 21:17 ` Zhangjin Wu
  2023-07-22 12:07   ` Willy Tarreau
  2023-07-18 21:18 ` [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64 Zhangjin Wu
  2023-07-23  7:47 ` [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Thomas Weißschuh
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:17 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

Here adds test support for little endian 64-bit PowerPC.

The powernv machine of qemu-system-ppc64le is used for there is just a
working powernv_defconfig.

As the document [1] shows:

  PowerNV (as Non-Virtualized) is the “bare metal” platform using the
  OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
  used as an hypervisor OS, running KVM guests, or simply as a host OS.

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/Makefile | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index 9c375fab84e5..fbdf7fd9bf96 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -19,6 +19,7 @@ endif
 XARCH           ?= $(or $(XARCH_$(ARCH)),$(ARCH))
 
 # ARCH is supported by kernel
+ARCH_powerpc64le = powerpc
 ARCH            := $(or $(ARCH_$(XARCH)),$(XARCH))
 
 # kernel image names by architecture
@@ -29,6 +30,7 @@ IMAGE_arm64      = arch/arm64/boot/Image
 IMAGE_arm        = arch/arm/boot/zImage
 IMAGE_mips       = vmlinuz
 IMAGE_powerpc    = vmlinux
+IMAGE_powerpc64le= arch/powerpc/boot/zImage
 IMAGE_riscv      = arch/riscv/boot/Image
 IMAGE_s390       = arch/s390/boot/bzImage
 IMAGE_loongarch  = arch/loongarch/boot/vmlinuz.efi
@@ -43,6 +45,7 @@ DEFCONFIG_arm64      = defconfig
 DEFCONFIG_arm        = multi_v7_defconfig
 DEFCONFIG_mips       = malta_defconfig
 DEFCONFIG_powerpc    = pmac32_defconfig
+DEFCONFIG_powerpc64le= powernv_defconfig
 DEFCONFIG_riscv      = defconfig
 DEFCONFIG_s390       = defconfig
 DEFCONFIG_loongarch  = defconfig
@@ -62,6 +65,7 @@ QEMU_ARCH_arm64      = aarch64
 QEMU_ARCH_arm        = arm
 QEMU_ARCH_mips       = mipsel  # works with malta_defconfig
 QEMU_ARCH_powerpc    = ppc
+QEMU_ARCH_powerpc64le= ppc64le
 QEMU_ARCH_riscv      = riscv64
 QEMU_ARCH_s390       = s390x
 QEMU_ARCH_loongarch  = loongarch64
@@ -75,6 +79,7 @@ QEMU_ARGS_arm64      = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC
 QEMU_ARGS_arm        = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_mips       = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_powerpc    = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
+QEMU_ARGS_powerpc64le= -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_riscv      = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_s390       = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_loongarch  = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
@@ -90,6 +95,7 @@ else
 Q=@
 endif
 
+CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc
 CFLAGS_s390 = -m64
 CFLAGS_mips = -EL
 CFLAGS_STACKPROTECTOR ?= $(call cc-option,-mstack-protector-guard=global $(call cc-option,-fstack-protector-all))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (6 preceding siblings ...)
  2023-07-18 21:17 ` [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le Zhangjin Wu
@ 2023-07-18 21:18 ` Zhangjin Wu
  2023-07-22 12:10   ` Willy Tarreau
  2023-07-23  7:47 ` [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Thomas Weißschuh
  8 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 21:18 UTC (permalink / raw)
  To: w; +Cc: thomas, arnd, falcon, linux-kernel, linux-kselftest

Here adds test support for big endian 64-bit PowerPC.

The powernv machine of qemu-system-ppc64 is used with
powernv_be_defconfig.

As the document [1] shows:

  PowerNV (as Non-Virtualized) is the “bare metal” platform using the
  OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
  used as an hypervisor OS, running KVM guests, or simply as a host OS.

Note, differs from little endian 64-bit PowerPC, vmlinux is used instead
of zImage, because big endian zImage [2] only boot on qemu with x-vof=on
(added from qemu v7.0) and a fixup patch [3] for qemu v7.0.51:

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
[2]: https://github.com/linuxppc/issues/issues/402
[3]: https://lore.kernel.org/qemu-devel/20220504065536.3534488-1-aik@ozlabs.ru/

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/Makefile | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index fbdf7fd9bf96..cced1d60ecf9 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -19,6 +19,7 @@ endif
 XARCH           ?= $(or $(XARCH_$(ARCH)),$(ARCH))
 
 # ARCH is supported by kernel
+ARCH_powerpc64   = powerpc
 ARCH_powerpc64le = powerpc
 ARCH            := $(or $(ARCH_$(XARCH)),$(XARCH))
 
@@ -30,6 +31,7 @@ IMAGE_arm64      = arch/arm64/boot/Image
 IMAGE_arm        = arch/arm/boot/zImage
 IMAGE_mips       = vmlinuz
 IMAGE_powerpc    = vmlinux
+IMAGE_powerpc64  = vmlinux
 IMAGE_powerpc64le= arch/powerpc/boot/zImage
 IMAGE_riscv      = arch/riscv/boot/Image
 IMAGE_s390       = arch/s390/boot/bzImage
@@ -45,6 +47,7 @@ DEFCONFIG_arm64      = defconfig
 DEFCONFIG_arm        = multi_v7_defconfig
 DEFCONFIG_mips       = malta_defconfig
 DEFCONFIG_powerpc    = pmac32_defconfig
+DEFCONFIG_powerpc64  = powernv_be_defconfig
 DEFCONFIG_powerpc64le= powernv_defconfig
 DEFCONFIG_riscv      = defconfig
 DEFCONFIG_s390       = defconfig
@@ -65,6 +68,7 @@ QEMU_ARCH_arm64      = aarch64
 QEMU_ARCH_arm        = arm
 QEMU_ARCH_mips       = mipsel  # works with malta_defconfig
 QEMU_ARCH_powerpc    = ppc
+QEMU_ARCH_powerpc64  = ppc64
 QEMU_ARCH_powerpc64le= ppc64le
 QEMU_ARCH_riscv      = riscv64
 QEMU_ARCH_s390       = s390x
@@ -79,6 +83,7 @@ QEMU_ARGS_arm64      = -M virt -cpu cortex-a53 -append "panic=-1 $(TEST:%=NOLIBC
 QEMU_ARGS_arm        = -M virt -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_mips       = -M malta -append "panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_powerpc    = -M g3beige -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
+QEMU_ARGS_powerpc64  = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_powerpc64le= -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_riscv      = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
 QEMU_ARGS_s390       = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
@@ -95,6 +100,7 @@ else
 Q=@
 endif
 
+CFLAGS_powerpc64 = -m64 -mbig-endian -mmultiple -Wl,-EB,-melf64ppc
 CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc
 CFLAGS_s390 = -m64
 CFLAGS_mips = -EL
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-18 21:13 ` [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64 Zhangjin Wu
@ 2023-07-18 22:17   ` Thomas Weißschuh
  2023-07-18 23:56     ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Thomas Weißschuh @ 2023-07-18 22:17 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: w, arnd, linux-kernel, linux-kselftest

As this would be a generic bugfix it should be at the front of the
series, but...

On 2023-07-19 05:13:01+0800, Zhangjin Wu wrote:
> The following error reported while running nolibc-test on the big endian
> 64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu
> 20.04.
> 
>     56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000]
>     init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe
>     init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000
> 
> Let's explicitly initialize all of the timeval members to zero.
> 
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> ---
>  tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> index 03b1d30f5507..ec2c7774522e 100644
> --- a/tools/testing/selftests/nolibc/nolibc-test.c
> +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> @@ -858,7 +858,7 @@ int run_syscall(int min, int max)
>  		CASE_TEST(read_badf);         EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break;
>  		CASE_TEST(rmdir_blah);        EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break;
>  		CASE_TEST(sched_yield);       EXPECT_SYSZR(1, sched_yield()); break;
> -		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
> +		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break;

This doesn't really make sense.
Firstly, "{ 0 }" zeroes the whole structure.

Also the warning talks about "illegal instruction" while this structure
is data and should never be executed as code.

Is this failure reproducible?
Maybe the error is actually in the syscall wrapper?
I'll also take a look tomorrow.

>  		CASE_TEST(select_stdout);     EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break;
>  		CASE_TEST(select_fault);      EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break;
>  		CASE_TEST(stat_blah);         EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-18 22:17   ` Thomas Weißschuh
@ 2023-07-18 23:56     ` Zhangjin Wu
  2023-07-19  4:33       ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-18 23:56 UTC (permalink / raw)
  To: thomas; +Cc: arnd, falcon, linux-kernel, linux-kselftest, w

Hi, Thomas

> As this would be a generic bugfix it should be at the front of the
> series, but...
>

Yes, moved it but not as the first.

> On 2023-07-19 05:13:01+0800, Zhangjin Wu wrote:
> > The following error reported while running nolibc-test on the big endian
> > 64-bit PowerPC kernel compiled with powerpc64le-linux-gnu-gcc in Ubuntu
> > 20.04.
> > 
> >     56 select_nullinit[1]: illegal instruction (4) at 100042a8 nip 100042a8 lr 100042a8 code 1 in init[10000000+10000]
> >     init[1]: code: 7c6307b4 7c234840 4081f580 7c6300d0 907d0000 3860ffff 4bfff570 3ca2fffe
> >     init[1]: code: 38800038 38a5d547 7fc3f378 4bffcd65 <1000038c> 38c10060 38a00000 38800000
> > 
> > Let's explicitly initialize all of the timeval members to zero.
> > 
> > Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> > ---
> >  tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> > index 03b1d30f5507..ec2c7774522e 100644
> > --- a/tools/testing/selftests/nolibc/nolibc-test.c
> > +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> > @@ -858,7 +858,7 @@ int run_syscall(int min, int max)
> >  		CASE_TEST(read_badf);         EXPECT_SYSER(1, read(-1, &tmp, 1), -1, EBADF); break;
> >  		CASE_TEST(rmdir_blah);        EXPECT_SYSER(1, rmdir("/blah"), -1, ENOENT); break;
> >  		CASE_TEST(sched_yield);       EXPECT_SYSZR(1, sched_yield()); break;
> > -		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
> > +		CASE_TEST(select_null);       EXPECT_SYSZR(1, ({ struct timeval tv = { 0, 0 }; select(0, NULL, NULL, NULL, &tv); })); break;
> 
> This doesn't really make sense.
> Firstly, "{ 0 }" zeroes the whole structure.
>

Will compare the difference carefully ...

> Also the warning talks about "illegal instruction" while this structure
> is data and should never be executed as code.
>

Yeah, I'm a little lazy, test shows no issue happen, so, not looked into it, this may really hide some issues.

> Is this failure reproducible?

It could be reproduced with powerpc64le-linux-gnu-gcc (9.3.0) + run:

    $ make run XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu-

but not happen with powerpc64le-linux-gnu-gcc (9.3.0) + run-user:

    $ make run-user XARCH=powerpc64 CROSS_COMPILE=powerpc64le-linux-gnu-

And neither reproduce if switch to the big endian powerpc64-linux-gcc 13.1.0
toolchain from https://mirrors.edge.kernel.org/pub/tools/crosstool/

> Maybe the error is actually in the syscall wrapper?
> I'll also take a look tomorrow.
>

ok, just rechecked this, found the nolibc side is:

    int sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds, struct timeval *timeout)
        --> return my_syscall5(__NR__newselect, nfds, rfds, wfds, efds, timeout);

And the bad code is (even with -O0):

    1000e3ac:   10 00 03 8c     vspltisw v0,0
    1000e3b0:   39 3f 00 e0     addi    r9,r31,224
    1000e3b4:   7c 00 4f 99     stxvd2x vs32,0,r9
    1000e3b8:   39 3f 00 e0     addi    r9,r31,224
    1000e3bc:   7d 27 4b 78     mr      r7,r9

The error log:

    56 select_nullinit[1]: illegal instruction (4) at 1000e3ac nip 1000e3ac lr 1000e398 code 1 in init[10000000+14000]
    init[1]: code: e93f0076 3ca2fffe 38a5aad0 7d244b78 3c62fffe 3863a630 4bffaedd 7c691b78 
    init[1]: code: 7d2a4b78 813f008c 7d295214 913f008c <1000038c> 393f00e0 7c004f99 393f00e0 
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004

So, the critical info "illegal instruction" means the vspltisw instruction is
not supported by this compiled kernel.

Let's look at the good one (only plus one instruction):

    1000e3ac:   39 20 00 00     li      r9,0
    1000e3b0:   f9 3f 00 e0     std     r9,224(r31)
    1000e3b4:   39 20 00 00     li      r9,0
    1000e3b8:   f9 3f 00 e8     std     r9,232(r31)
    1000e3bc:   39 3f 00 e0     addi    r9,r31,224
    1000e3c0:   7d 27 4b 78     mr      r7,r9

It means, adding one more 0 will let the compiler generate different code, it
will not use the vspltisw instruction any more, so, different result.

It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:

    CONFIG_ALTIVEC
    CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions

Or we can disable the vsx instructions explicitly:

    -mno-vsx

Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?

    +CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx
    +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx

So, this patch itself is wrong, let's drop it from the next revision.

Thanks very much,
Zhangjin

> >  		CASE_TEST(select_stdout);     EXPECT_SYSNE(1, ({ fd_set fds; FD_ZERO(&fds); FD_SET(1, &fds); select(2, NULL, &fds, NULL, NULL); }), -1); break;
> >  		CASE_TEST(select_fault);      EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break;
> >  		CASE_TEST(stat_blah);         EXPECT_SYSER(1, stat("/proc/self/blah", &stat_buf), -1, ENOENT); break;
> > -- 
> > 2.25.1
> > 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-18 23:56     ` Zhangjin Wu
@ 2023-07-19  4:33       ` Willy Tarreau
  2023-07-19  6:49         ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-19  4:33 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: thomas, arnd, linux-kernel, linux-kselftest

Hi Zhangjin,

On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
> It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
> 
>     CONFIG_ALTIVEC
>     CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
> 
> Or we can disable the vsx instructions explicitly:
> 
>     -mno-vsx
> 
> Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
> 
>     +CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx
>     +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
> 
> So, this patch itself is wrong, let's drop it from the next revision.

Better explicitly disable it in the CFLAGS (2nd option) if we want to
make sure we don't want to rely on this, at least for portability
purposes.

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-19  4:33       ` Willy Tarreau
@ 2023-07-19  6:49         ` Zhangjin Wu
  2023-07-19 20:25           ` Willy Tarreau
  2023-07-20  6:11           ` Thomas Weißschuh
  0 siblings, 2 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-19  6:49 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, thomas

Hi, Willy

> Hi Zhangjin,
> 
> On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
> > It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
> > 
> >     CONFIG_ALTIVEC
> >     CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
> > 
> > Or we can disable the vsx instructions explicitly:
> > 
> >     -mno-vsx
> > 
> > Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
> > 
> >     +CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx
> >     +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
> > 
> > So, this patch itself is wrong, let's drop it from the next revision.
> 
> Better explicitly disable it in the CFLAGS (2nd option) if we want to
> make sure we don't want to rely on this, at least for portability
> purposes.

Ok, thanks, have updated CFLAGS in these two patches locally:

    [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le
    [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64

what about the other ones? I'm ready to send v2 ;-)

Best regards,
Zhangjin

> 
> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-19  6:49         ` Zhangjin Wu
@ 2023-07-19 20:25           ` Willy Tarreau
  2023-07-20  6:11           ` Thomas Weißschuh
  1 sibling, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2023-07-19 20:25 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: arnd, linux-kernel, linux-kselftest, thomas

Hi Zhangjin,

On Wed, Jul 19, 2023 at 02:49:12PM +0800, Zhangjin Wu wrote:
> Hi, Willy
> 
> > Hi Zhangjin,
> > 
> > On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
> > > It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
> > > 
> > >     CONFIG_ALTIVEC
> > >     CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
> > > 
> > > Or we can disable the vsx instructions explicitly:
> > > 
> > >     -mno-vsx
> > > 
> > > Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
> > > 
> > >     +CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx
> > >     +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
> > > 
> > > So, this patch itself is wrong, let's drop it from the next revision.
> > 
> > Better explicitly disable it in the CFLAGS (2nd option) if we want to
> > make sure we don't want to rely on this, at least for portability
> > purposes.
> 
> Ok, thanks, have updated CFLAGS in these two patches locally:
> 
>     [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le
>     [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
> 
> what about the other ones? I'm ready to send v2 ;-)

I have not had the time to review them yet. Please just don't send
another series yet, that just adds more noise and makes it hard to
distinguish all of them. I hope to be able to check these and hopefully
the tinyconfig series by the week-end.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64
  2023-07-19  6:49         ` Zhangjin Wu
  2023-07-19 20:25           ` Willy Tarreau
@ 2023-07-20  6:11           ` Thomas Weißschuh
  1 sibling, 0 replies; 37+ messages in thread
From: Thomas Weißschuh @ 2023-07-20  6:11 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: w, arnd, linux-kernel, linux-kselftest

Hi Zhangjin,

On 2023-07-19 14:49:12+0800, Zhangjin Wu wrote:
> > On Wed, Jul 19, 2023 at 07:56:37AM +0800, Zhangjin Wu wrote:
> > > It made me recalled I have at last disabled (not enabled for tinyconfig) the following options:
> > > 
> > >     CONFIG_ALTIVEC
> > >     CONFIG_VSX --> This option enables kernel support for the Vector Scaler extensions
> > > 
> > > Or we can disable the vsx instructions explicitly:
> > > 
> > >     -mno-vsx
> > > 
> > > Both of them work well, but I prefer -mno-vsx for to get a faster build, what about you?
> > > 
> > >     +CFLAGS_powerpc64 = -m64 -mbig-endian -Wl,-EB,-melf64ppc -mno-vsx
> > >     +CFLAGS_powerpc64le = -m64 -mlittle-endian -Wl,-EL,-melf64ppc -mno-vsx
> > > 
> > > So, this patch itself is wrong, let's drop it from the next revision.
> > 
> > Better explicitly disable it in the CFLAGS (2nd option) if we want to
> > make sure we don't want to rely on this, at least for portability
> > purposes.
> 
> Ok, thanks, have updated CFLAGS in these two patches locally:
> 
>     [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le
>     [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
> 
> what about the other ones? I'm ready to send v2 ;-)

Unfortunately I won't have the time for a proper review this week.

Thomas

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 4/8] selftests/nolibc: add extra config file customize support
  2023-07-18 21:14 ` [PATCH v1 4/8] selftests/nolibc: add extra config file customize support Zhangjin Wu
@ 2023-07-22 12:00   ` Willy Tarreau
  2023-07-25 14:30     ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-22 12:00 UTC (permalink / raw)
  To: Zhangjin Wu
  Cc: thomas, arnd, linux-kernel, linux-kselftest, Thomas Weißschuh

On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
> The default DEFCONFIG_<ARCH> may not always work for all architectures,
> some architectures require to add extra kernel config options, this adds
> a new 'extconfig' target for this requirement.
> 
> It allows to customize extra kernel config options via the common
> common.config and the architecture specific <ARCH>.config, at last
> trigger 'allnoconfig' to let them take effect with missing config
> options as disabled.
> 
> The scripts/kconfig/merge_config.sh tool is used to merge the extra
> config files.
> 
> Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
> Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> ---
>  tools/testing/selftests/nolibc/Makefile | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
> index f42adef87e12..08a5ca5f418b 100644
> --- a/tools/testing/selftests/nolibc/Makefile
> +++ b/tools/testing/selftests/nolibc/Makefile
> @@ -39,6 +39,9 @@ DEFCONFIG_s390       = defconfig
>  DEFCONFIG_loongarch  = defconfig
>  DEFCONFIG            = $(DEFCONFIG_$(ARCH))
>  
> +# extra kernel config files under configs/, include common + architecture specific
> +EXTCONFIG            = common.config $(ARCH).config
> +
>  # optional tests to run (default = all)
>  TEST =
>  
> @@ -162,6 +165,10 @@ initramfs: nolibc-test
>  defconfig:
>  	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
>  
> +extconfig:
> +	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
> +	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> +

Please also mention this entry in the "help" message.
Other than that, OK.

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support
  2023-07-18 21:15 ` [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support Zhangjin Wu
@ 2023-07-22 12:03   ` Willy Tarreau
  0 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2023-07-22 12:03 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: thomas, arnd, linux-kernel, linux-kselftest

On Wed, Jul 19, 2023 at 05:15:13AM +0800, Zhangjin Wu wrote:
> -	@echo "  run-user     runs the executable under QEMU (uses \$$ARCH, \$$TEST)"
> +	@echo "  run-user     runs the executable under QEMU (uses \$$XARCH, \$$TEST)"

Most users will neither care about nor need XARCH and will wonder what to
put there. Since there's a fallback on ARCH, better indicate something
like:

    (uses $XARCH or $ARCH, and $TEST)

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le
  2023-07-18 21:17 ` [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le Zhangjin Wu
@ 2023-07-22 12:07   ` Willy Tarreau
  0 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2023-07-22 12:07 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: thomas, arnd, linux-kernel, linux-kselftest

On Wed, Jul 19, 2023 at 05:17:26AM +0800, Zhangjin Wu wrote:
> Here adds test support for little endian 64-bit PowerPC.
> 
> The powernv machine of qemu-system-ppc64le is used for there is just a
> working powernv_defconfig.
> 
> As the document [1] shows:
> 
>   PowerNV (as Non-Virtualized) is the "bare metal" platform using the
>   OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
>   used as an hypervisor OS, running KVM guests, or simply as a host OS.
> 
> [1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
> 
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> ---
>  tools/testing/selftests/nolibc/Makefile | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
> index 9c375fab84e5..fbdf7fd9bf96 100644
> --- a/tools/testing/selftests/nolibc/Makefile
> +++ b/tools/testing/selftests/nolibc/Makefile
> @@ -19,6 +19,7 @@ endif
>  XARCH           ?= $(or $(XARCH_$(ARCH)),$(ARCH))
>  
>  # ARCH is supported by kernel
> +ARCH_powerpc64le = powerpc

Given that this one will only be used as an alias, I really think you
should call it "ppc64le" and not with that long a name. Everyone knows
that arch under the name ppc64 anyway so it's not like it would cause
any confusion.

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
  2023-07-18 21:18 ` [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64 Zhangjin Wu
@ 2023-07-22 12:10   ` Willy Tarreau
  2023-07-25  5:50     ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-22 12:10 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: thomas, arnd, linux-kernel, linux-kselftest

On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
>  # ARCH is supported by kernel
> +ARCH_powerpc64   = powerpc
>  ARCH_powerpc64le = powerpc

And similarly let's simply call this one ppc64.

Aside these few comments, the series looks nice, thanks!
Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-18 21:10 ` [PATCH v1 1/8] tools/nolibc: add support for powerpc Zhangjin Wu
@ 2023-07-23  7:32   ` Thomas Weißschuh
  2023-07-23  8:15     ` Willy Tarreau
  2023-07-25  6:14     ` Zhangjin Wu
  0 siblings, 2 replies; 37+ messages in thread
From: Thomas Weißschuh @ 2023-07-23  7:32 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: w, arnd, linux-kernel, linux-kselftest

On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
> Both syscall declarations and _start code definition are added for
> powerpc to nolibc.
> 
> Like mips, powerpc uses a register (exactly, the summary overflow bit)
> to record the error occurred, and uses another register to return the
> value [1]. So, the return value of every syscall declaration must be
> normalized to easier the __sysret helper, return -value when there is an
> error, otheriwse, return value directly.
> 
> Glibc and musl use different methods to check the summary overflow bit,
> glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register
> to r0 at first, and then check the summary overflow bit in cr0:
> 
>     mfcr r0
>     r0 & (1 << 28) ? -r3 : r3
> 
>     -->
> 
>     10003c14:       7c 00 00 26     mfcr    r0
>     10003c18:       74 09 10 00     andis.  r9,r0,4096
>     10003c1c:       41 82 00 08     beq     0x10003c24
>     10003c20:       7c 63 00 d0     neg     r3,r3
> 
> Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow
> bit with the 'bns' instruction:
> 
>     /* no summary overflow bit means no error, return value directly */
>     bns+ 1f
>     /* otherwise, return negated value */
>     neg r3, r3
>     1:
> 
>     -->
> 
>     10000418:       40 a3 00 08     bns     0x10000420
>     1000041c:       7c 63 00 d0     neg     r3,r3
> 
> The later one is smaller, here applies it.
> 
> arch/powerpc/include/asm/vdso/gettimeofday.h file uses the smaller
> method for do_syscall_2() too.
> 
> [1]: https://man7.org/linux/man-pages/man2/syscall.2.html
> 
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> ---
>  tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++

This also should be added to nolibc/arch.h.

>  1 file changed, 156 insertions(+)
>  create mode 100644 tools/include/nolibc/arch-powerpc.h
> 
> diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
> new file mode 100644
> index 000000000000..100ec0f412dc
> --- /dev/null
> +++ b/tools/include/nolibc/arch-powerpc.h
> @@ -0,0 +1,156 @@
> +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
> +/*
> + * PowerPC specific definitions for NOLIBC
> + * Copyright (C) 2023 Zhangjin Wu <falcon@tinylab.org>

If it is taken from musl, shouldn't there also be a musl copyright?

> [..]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support
  2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
                   ` (7 preceding siblings ...)
  2023-07-18 21:18 ` [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64 Zhangjin Wu
@ 2023-07-23  7:47 ` Thomas Weißschuh
  8 siblings, 0 replies; 37+ messages in thread
From: Thomas Weißschuh @ 2023-07-23  7:47 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: w, arnd, linux-kernel, linux-kselftest

On 2023-07-19 05:09:42+0800, Zhangjin Wu wrote:
> Here is the powerpc support, includes 32-bit big-endian powerpc, 64-bit
> little endian and big endian powerpc.
> 
> [..]
> 
> Zhangjin Wu (8):
>   tools/nolibc: add support for powerpc
>   tools/nolibc: add support for powerpc64
>   selftests/nolibc: select_null: fix up for big endian powerpc64
>   selftests/nolibc: add extra config file customize support
>   selftests/nolibc: add XARCH and ARCH mapping support
>   selftests/nolibc: add test support for powerpc
>   selftests/nolibc: add test support for powerpc64le
>   selftests/nolibc: add test support for powerpc64

For the full series, after comment for patch 1 is addressed:

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-23  7:32   ` Thomas Weißschuh
@ 2023-07-23  8:15     ` Willy Tarreau
  2023-07-25  5:44       ` Zhangjin Wu
  2023-07-25  6:14     ` Zhangjin Wu
  1 sibling, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-23  8:15 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Zhangjin Wu, arnd, linux-kernel, linux-kselftest

On Sun, Jul 23, 2023 at 09:32:37AM +0200, Thomas Weißschuh wrote:
> On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
> > diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
> > new file mode 100644
> > index 000000000000..100ec0f412dc
> > --- /dev/null
> > +++ b/tools/include/nolibc/arch-powerpc.h
> > @@ -0,0 +1,156 @@
> > +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
> > +/*
> > + * PowerPC specific definitions for NOLIBC
> > + * Copyright (C) 2023 Zhangjin Wu <falcon@tinylab.org>
> 
> If it is taken from musl, shouldn't there also be a musl copyright?

In fact it depends. If code was taken there, not only the copyright is
needed, but the license' compatibility must be verified. If, however,
the code was only disassembled to be understood and reimplemented (as
it seems to me), then no code was taken there and it's not needed.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-23  8:15     ` Willy Tarreau
@ 2023-07-25  5:44       ` Zhangjin Wu
  2023-07-25  6:29         ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25  5:44 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, thomas

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 3881 bytes --]

Hi, Thomas, Willy

> On Sun, Jul 23, 2023 at 09:32:37AM +0200, Thomas Weißschuh wrote:
> > On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
> > > diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
> > > new file mode 100644
> > > index 000000000000..100ec0f412dc
> > > --- /dev/null
> > > +++ b/tools/include/nolibc/arch-powerpc.h
> > > @@ -0,0 +1,156 @@
> > > +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
> > > +/*
> > > + * PowerPC specific definitions for NOLIBC
> > > + * Copyright (C) 2023 Zhangjin Wu <falcon@tinylab.org>
> > 
> > If it is taken from musl, shouldn't there also be a musl copyright?
> 
> In fact it depends. If code was taken there, not only the copyright is
> needed, but the license' compatibility must be verified. If, however,
> the code was only disassembled to be understood and reimplemented (as
> it seems to me), then no code was taken there and it's not needed.
>

This discussion does inspire me a lot to shrink the whole architecture
specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h
is added to do so. I have finished most of them except the ones passing
arguments via stack, still trying to merge these ones.

With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
will only require to add ~10 lines to define their own syscall
instructions, registers and clobberlist, which looks like this (for
powerpc):

    #define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"

    /* PowerPC doesn't always restore r3-r12 for us */
    #define _NOLIBC_SYSCALL_CLOBBERLIST 
    	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"

    /* PowerPC write GPRS in kernel side but not restore them */
    #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
    
    #define _NOLIBC_REG_NUM  "r0"
    #define _NOLIBC_REG_RET  "r3"
    #define _NOLIBC_REG_arg1 "r3"
    #define _NOLIBC_REG_arg2 "r4"
    #define _NOLIBC_REG_arg3 "r5"
    #define _NOLIBC_REG_arg4 "r6"
    #define _NOLIBC_REG_arg5 "r7"
    #define _NOLIBC_REG_arg6 "r8"

Before:

    $ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
    157 tools/include/nolibc/arch-aarch64.h
    199 tools/include/nolibc/arch-arm.h
    178 tools/include/nolibc/arch-i386.h
    164 tools/include/nolibc/arch-loongarch.h
    195 tools/include/nolibc/arch-mips.h
    0 tools/include/nolibc/arch-powerpc.h
    160 tools/include/nolibc/arch-riscv.h
    186 tools/include/nolibc/arch-s390.h
    176 tools/include/nolibc/arch-x86_64.h

After:

    $ wc -l tools/include/nolibc/arch-*.h
       54 tools/include/nolibc/arch-aarch64.h
       84 tools/include/nolibc/arch-arm.h
       90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
       59 tools/include/nolibc/arch-loongarch.h
      120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
       73 tools/include/nolibc/arch-powerpc.h
       58 tools/include/nolibc/arch-riscv.h
       87 tools/include/nolibc/arch-s390.h
       67 tools/include/nolibc/arch-x86_64.h

syscall.h itself:

    $ wc -l tools/include/nolibc/syscall.h
    112 tools/include/nolibc/syscall.h 

Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit
these macros nolibc internally? I plan to rename all of the new adding
macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).

Thomas, do we need to merge the syscall macros from unistd.h to this new
syscall.h? we do reuse the macros between them, like the _syscall_narg* ones.

Since this new syscall.h does help a lot to shrink the arch-powerpc.h, I plan
to send this syscall.h at first and then renew our powerpc patchset, what's
your idea? 

Thanks,
Zhangjin

> Thanks,
> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
  2023-07-22 12:10   ` Willy Tarreau
@ 2023-07-25  5:50     ` Zhangjin Wu
  2023-07-25  6:02       ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25  5:50 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, thomas

Hi, Willy

> 
> On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
> >  # ARCH is supported by kernel
> > +ARCH_powerpc64   = powerpc
> >  ARCH_powerpc64le = powerpc
> 
> And similarly let's simply call this one ppc64.
>

Well, I like these short ones too, what about also a ppc alias for
powerpc?

    ARCH_ppc     = powerpc
    ARCH_ppc64   = powerpc
    ARCH_ppc64le = powerpc
  

> Aside these few comments, the series looks nice, thanks!

Have applied all of your suggestions, thanks very much.

Best regards,
Zhangjin

> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64
  2023-07-25  5:50     ` Zhangjin Wu
@ 2023-07-25  6:02       ` Willy Tarreau
  0 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2023-07-25  6:02 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: arnd, linux-kernel, linux-kselftest, thomas

On Tue, Jul 25, 2023 at 01:50:31PM +0800, Zhangjin Wu wrote:
> Hi, Willy
> 
> > 
> > On Wed, Jul 19, 2023 at 05:18:32AM +0800, Zhangjin Wu wrote:
> > >  # ARCH is supported by kernel
> > > +ARCH_powerpc64   = powerpc
> > >  ARCH_powerpc64le = powerpc
> > 
> > And similarly let's simply call this one ppc64.
> >
> 
> Well, I like these short ones too, what about also a ppc alias for
> powerpc?
> 
>     ARCH_ppc     = powerpc
>     ARCH_ppc64   = powerpc
>     ARCH_ppc64le = powerpc

I thought about it as well. It could avoid the confusion between the
arch name (powerpc) that's used for both 32/64 and the user-requested
one.

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-23  7:32   ` Thomas Weißschuh
  2023-07-23  8:15     ` Willy Tarreau
@ 2023-07-25  6:14     ` Zhangjin Wu
  1 sibling, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25  6:14 UTC (permalink / raw)
  To: thomas; +Cc: arnd, falcon, linux-kernel, linux-kselftest, w

Hi, Thomas

> 
> On 2023-07-19 05:10:48+0800, Zhangjin Wu wrote:
> > Both syscall declarations and _start code definition are added for
> > powerpc to nolibc.
> > 
[...]
> > 
> > Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> > ---
> >  tools/include/nolibc/arch-powerpc.h | 156 ++++++++++++++++++++++++++++
> 
> This also should be added to nolibc/arch.h.
>

Thanks, it should be.

> >  1 file changed, 156 insertions(+)
> >  create mode 100644 tools/include/nolibc/arch-powerpc.h
> > 
> > diff --git a/tools/include/nolibc/arch-powerpc.h b/tools/include/nolibc/arch-powerpc.h
> > new file mode 100644
> > index 000000000000..100ec0f412dc
> > --- /dev/null
> > +++ b/tools/include/nolibc/arch-powerpc.h
> > @@ -0,0 +1,156 @@
> > +/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
> > +/*
> > + * PowerPC specific definitions for NOLIBC
> > + * Copyright (C) 2023 Zhangjin Wu <falcon@tinylab.org>
> 
> If it is taken from musl, shouldn't there also be a musl copyright?
>

For this copyright issue, I have prepared two new versions without a line from
musl. even in our old version, most of them are different except the 'sc; bns+
1f; neg %1, %1; 1:' line and the register variables.

Seems 'sc; bns+ 1f; neg %1, %1; 1:' is also used in linux kernel:
arch/powerpc/include/asm/vdso/gettimeofday.h, so, it should be ok enough to
apply it.

The register varibles have been changed and aligned with othe arch-<ARCH>.h
locally, they are completely different now, and even further with the new
syscall.h mentioned in this reply [1], the file will be completely different.

Thomas, Have added your Reviewed-by lines too, thanks a lot!

Best regards,
Zhangjin
----
[1]: https://lore.kernel.org/lkml/20230725054414.15055-1-falcon@tinylab.org/


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25  5:44       ` Zhangjin Wu
@ 2023-07-25  6:29         ` Willy Tarreau
  2023-07-25 11:02           ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-25  6:29 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: arnd, linux-kernel, linux-kselftest, thomas

On Tue, Jul 25, 2023 at 01:44:14PM +0800, Zhangjin Wu wrote:
> This discussion does inspire me a lot to shrink the whole architecture
> specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h
> is added to do so. I have finished most of them except the ones passing
> arguments via stack, still trying to merge these ones.
> 
> With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
> will only require to add ~10 lines to define their own syscall
> instructions, registers and clobberlist, which looks like this (for
> powerpc):
> 
>     #define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"
> 
>     /* PowerPC doesn't always restore r3-r12 for us */
>     #define _NOLIBC_SYSCALL_CLOBBERLIST 
>     	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"
> 
>     /* PowerPC write GPRS in kernel side but not restore them */
>     #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
>     
>     #define _NOLIBC_REG_NUM  "r0"
>     #define _NOLIBC_REG_RET  "r3"
>     #define _NOLIBC_REG_arg1 "r3"
>     #define _NOLIBC_REG_arg2 "r4"
>     #define _NOLIBC_REG_arg3 "r5"
>     #define _NOLIBC_REG_arg4 "r6"
>     #define _NOLIBC_REG_arg5 "r7"
>     #define _NOLIBC_REG_arg6 "r8"
> 
> Before:
> 
>     $ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
>     157 tools/include/nolibc/arch-aarch64.h
>     199 tools/include/nolibc/arch-arm.h
>     178 tools/include/nolibc/arch-i386.h
>     164 tools/include/nolibc/arch-loongarch.h
>     195 tools/include/nolibc/arch-mips.h
>     0 tools/include/nolibc/arch-powerpc.h
>     160 tools/include/nolibc/arch-riscv.h
>     186 tools/include/nolibc/arch-s390.h
>     176 tools/include/nolibc/arch-x86_64.h
> 
> After:
> 
>     $ wc -l tools/include/nolibc/arch-*.h
>        54 tools/include/nolibc/arch-aarch64.h
>        84 tools/include/nolibc/arch-arm.h
>        90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
>        59 tools/include/nolibc/arch-loongarch.h
>       120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
>        73 tools/include/nolibc/arch-powerpc.h
>        58 tools/include/nolibc/arch-riscv.h
>        87 tools/include/nolibc/arch-s390.h
>        67 tools/include/nolibc/arch-x86_64.h
> 
> syscall.h itself:
> 
>     $ wc -l tools/include/nolibc/syscall.h
>     112 tools/include/nolibc/syscall.h 

The important thing to consider is not the number of lines but the
*maintainability*. You factored the syscall code so much above with all
these macros that I don't even understand how they're going to interact
with each other, especially "%0". Also I don't know what the macro
_NOLIBC_GPRS_AS_OUTPUT_OPERANDS does. And when someone reports a bug
like we had in the past with programs randomly crashing depending on
stack alignment and such, it becomes particularly tricky to figure what
is expected and how it differs from reality.

Maybe it's possible to do something in between, like defining a few
more slightly larger blocks, I don't know. I still tend to think that
this significantly complicates the understanding of the whole thing.

Also, looking at different archs' syscall code, they're not all defined
the same way. Some like i386 use "=a" for the return value. Others use
"+r" as an input/output value, others "=r", others "+d" or "=d". And
reading the comments, there are some arch-specific choices for these
declarations, that are sometimes there to force the toolchain a little
bit. Others like MIPS pass some of their args in the stack, so a name
only is not sufficient. Thus, trying to factor all of this will not only
make it very hard to insert arch-specific adjusments, but it will also
make the code much less readable.

Also think about this for a moment: you had the opportunity to add two
new archs by simply restarting from existing ones. s390 and loongarch
were added the same way, leaving enough freedom to the implementers to
adjust the definitions to fit their environment's constraints. If you
make it very complicated, I suspect we won't get any such contributions
anymore just because nobody figures how to find their way through it,
or it would require to change again for everyone just to add one specific
case. I tend to think that the current situation is already fairly
minimal with quite readable and debuggable calls, and that it's what
*really* matters.

When you're trying to reorganize code, it's important to ask yourself
whether you would prefer to debug the old one or the new one at 3am.
Hint: at 3am, the more abstractions there are, the less understandable
it becomes.

> Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit
> these macros nolibc internally? I plan to rename all of the new adding
> macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).

Why not, I'm not opposed to that. Just keep in mind that every single
rename complicates backports (or may even silently break them), so it's
important to know where you want to go and to try to make changes converge
towards something stable and durable.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25  6:29         ` Willy Tarreau
@ 2023-07-25 11:02           ` Zhangjin Wu
  2023-07-25 14:45             ` Ammar Faizi
  2023-07-25 18:27             ` Willy Tarreau
  0 siblings, 2 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25 11:02 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, thomas, svens, ammarfaizi2

Hi, Willy

Thanks very much for your comments on the new syscall.h proposal, just
replied more, it is a very long email, hope it explains more clearly ;-) 

Also CCed i386 and s390 committers, welcome your dicussion. 

> On Tue, Jul 25, 2023 at 01:44:14PM +0800, Zhangjin Wu wrote:
> > This discussion does inspire me a lot to shrink the whole architecture
> > specific nolibc my_syscall<N>() macros, like crt.h, a common syscall.h
> > is added to do so. I have finished most of them except the ones passing
> > arguments via stack, still trying to merge these ones.
> > 
> > With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
> > will only require to add ~10 lines to define their own syscall
> > instructions, registers and clobberlist, which looks like this (for
> > powerpc):
> > 
> >     #define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"
> > 
> >     /* PowerPC doesn't always restore r3-r12 for us */
> >     #define _NOLIBC_SYSCALL_CLOBBERLIST 
> >     	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"
> > 
> >     /* PowerPC write GPRS in kernel side but not restore them */
> >     #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
> >     
> >     #define _NOLIBC_REG_NUM  "r0"
> >     #define _NOLIBC_REG_RET  "r3"
> >     #define _NOLIBC_REG_arg1 "r3"
> >     #define _NOLIBC_REG_arg2 "r4"
> >     #define _NOLIBC_REG_arg3 "r5"
> >     #define _NOLIBC_REG_arg4 "r6"
> >     #define _NOLIBC_REG_arg5 "r7"
> >     #define _NOLIBC_REG_arg6 "r8"
> > 
> > Before:
> > 
> >     $ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
> >     157 tools/include/nolibc/arch-aarch64.h
> >     199 tools/include/nolibc/arch-arm.h
> >     178 tools/include/nolibc/arch-i386.h
> >     164 tools/include/nolibc/arch-loongarch.h
> >     195 tools/include/nolibc/arch-mips.h
> >     0 tools/include/nolibc/arch-powerpc.h
> >     160 tools/include/nolibc/arch-riscv.h
> >     186 tools/include/nolibc/arch-s390.h
> >     176 tools/include/nolibc/arch-x86_64.h
> > 
> > After:
> > 
> >     $ wc -l tools/include/nolibc/arch-*.h
> >        54 tools/include/nolibc/arch-aarch64.h
> >        84 tools/include/nolibc/arch-arm.h
> >        90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
> >        59 tools/include/nolibc/arch-loongarch.h
> >       120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
> >        73 tools/include/nolibc/arch-powerpc.h
> >        58 tools/include/nolibc/arch-riscv.h
> >        87 tools/include/nolibc/arch-s390.h
> >        67 tools/include/nolibc/arch-x86_64.h
> > 
> > syscall.h itself:
> > 
> >     $ wc -l tools/include/nolibc/syscall.h
> >     112 tools/include/nolibc/syscall.h 
> 
> The important thing to consider is not the number of lines but the
> *maintainability*.

The original goal is not really the number of lines (only a
'side-effect'), but is exactly easier porting/maintainability with
clearer code architecture, I have wrotten another message to explain
more about this background, so, let's directly reply here and discuss
more. 

> You factored the syscall code so much above with all
> these macros that I don't even understand how they're going to interact
> with each other, especially "%0".

Yeah, it is my fault, this should be cleaned up with the return register
directly:

    #define _NOLIBC_SYSCALL_CALL \
    	"sc; bns+ 1f; neg 3, 3; 1:"

> Also I don't know what the macro
> _NOLIBC_GPRS_AS_OUTPUT_OPERANDS does.

This is the root cause to inspire me to add the new syscall.h, let's
explain the background step by step.

All of the other architectures (except PowerPC) restore GPRS for us when
return from syscall, so, their clobber list simply not include the GPRS
and only need to add the input registers in the 'INPUT Operands' list.

But PowerPC doesn't restore such GPRS for us, I'm not sure if it is a
feature (Maybe) or a bug. for PowerPC32, the following line will restore
the GPRS for us, but may be not a good idea to do so for it may decrease
the syscall performance although save some instructions in user-space
and also, the other libcs also follow the current rule, so, this may be
a design intention we must follow (welcome your suggestions).

    diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
    index fe27d41f9a3d..1ed535e9144c 100644
    --- a/arch/powerpc/kernel/entry_32.S
    +++ b/arch/powerpc/kernel/entry_32.S
    @@ -155,6 +155,7 @@ syscall_exit_finish:
            bne     3f
            mtcr    r5
    
    +       REST_GPRS(3, 12, r1)
     1:     REST_GPR(2, r1)
            REST_GPR(1, r1)
            rfi

For PowerPC, if with the previous method like the other architectures,
the clobber list differs for every my_syscall<N> and all of the input
registers must be put in the 'OUTPUT Operands' too to avoid compiler to
save and resue a variable in such GPRS across my_syscall<N> calls.

Originally in my local new version of arch-powerpc.h, we got such code
for every my_syscall<N>, use my_syscall6() as an example:

    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret),                                                \
		      "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),                     \
                      "+r"(_arg5), "+r"(_arg6)                                   \
                    : "0"(_arg1), "r"(_num)                                      \
                    : _NOLIBC_SYSCALL_CLOBBERLIST                                \
            );                                                                   \
            _ret;                                                                \
    })

It almost aligns with the other architectures, but the full clobber list
differs for every my_syscall<N>, the basic one is:

    /* PowerPC kernel doesn't always restore r4-r12 for us */
    #define _NOLIBC_SYSCALL_CLOBBERLIST \
        "memory", "cr0", "r12", "r11", "r10", "r9",

Use my_syscall0() as a further example, we need something like this:

    #define my_syscall0(num)                                                     \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret)                                                 \
                    : "r"(_num)                                                  \
                    : _NOLIBC_SYSCALL_CLOBBERLIST, "r8", "r7", "r6", "r5", "r4"  \
            );                                                                   \
            _ret;                                                                \
    })

The additional "r8"..."r4" must be appended to the clobber list for they
can not be put together for every my_syscall<N> due to conflicts between
they between the clobber list and the "OUTPUT/INPUT operands".

I found a solution to share the same _NOLIBC_SYSCALL_CLOBBERLIST, that
is split the "OUTPUT/INPUT Operands" list out of the core syscall
assembly:

    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
    ({                                                                           \
            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),         \
                                  "+r"(_arg5), "+r"(_arg6)::);                   \
                                                                                 \
            __asm__ volatile (                                                   \
                    "       sc\n"                                                \
                    "       bns+ 1f\n"                                           \
                    "       neg  %0, %0\n"                                       \
                    "1:\n"                                                       \
                    : "=r"(_ret)                                                 \
                    : "0"(_arg1), "r"(_num)                                      \
                    : _NOLIBC_SYSCALL_CLOBBERLIST                                \
            );                                                                   \
            _ret;                                                                \
    })

Note, a question here is, the above split still require more discussion
to make sure it does work for different toolchains (only test for gcc
currently) or even if this method is right from scratch, welcome your
suggestion.

As a result, all of the my_syscall<N> are able to share the core syscall
calling assembly block, so, here is what we at last have:

    #define _my_syscall_tail()                                              \
    	__asm__ volatile (                                                  \
    		_NOLIBC_SYSCALL_CALL                                        \
    		: "=r"(_ret)                                                \
    		: "r"(_num)                                                 \
    		: _NOLIBC_SYSCALL_CLOBBERLIST                               \
    	);                                                                  \
    	_ret

And further, we also found it was possible to share most of them among
these not ugly but completely duplicated lines:

            register long _ret  __asm__ ("r3");                                  \
            register long _num  __asm__ ("r0") = (num);                          \
            register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
            register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
            register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
            register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
            register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
            register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
                                                                                 \
            __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),         \
                                  "+r"(_arg5), "+r"(_arg6)::);
                                                            
That's why at last we have such blocks (of course, for PowerPC itself
it is a big change and not necessary):

    #define _my_syscall_head(num)                                               \
    	register long _ret __asm__ (_NOLIBC_REG_RET);                           \
    	register long _num __asm__ (_NOLIBC_REG_NUM) = (num)                    \
    
    #ifdef _NOLIBC_REG_ERR
    #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR
    #endif
    
    #ifdef _NOLIBC_REG_EXTRA
    #define _my_syscall_extra() \
    	register long reg_extra __asm__ (_NOLIBC_REG_EXTRA);                   \
    	__asm__ volatile ("": "=r"(reg_extra)::)
    #else
    #define _my_syscall_extra()
    #endif
    
    /* Architectures like PowerPC write GPRS in kernel side and not restore them */
    #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("":: "r"(_arg##n):)
    #else
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("": "+r"(_arg##n)::)
    #endif

And this further build args for us:

    #define _my_syscall_args0()
    
    #define _my_syscall_args1(...) \
    	_my_syscall_args0(); \
    	_my_syscall_argn(1, __VA_ARGS__)
    
    #define _my_syscall_args2(arg2, ...) \
    	_my_syscall_args1(__VA_ARGS__); \
    	_my_syscall_argn(2, arg2)
    
    #define _my_syscall_args3(arg3, ...) \
    	_my_syscall_args2(__VA_ARGS__); \
    	_my_syscall_argn(3, arg3)
    
    #define _my_syscall_args4(arg4, ...) \
    	_my_syscall_args3(__VA_ARGS__); \
    	_my_syscall_argn(4, arg4)
    
    #define _my_syscall_args5(arg5, ...) \
    	_my_syscall_args4(__VA_ARGS__); \
    	_my_syscall_argn(5, arg5)
    
    #define _my_syscall_args6(arg6, ...) \
    	_my_syscall_args5(__VA_ARGS__); \
    	_my_syscall_argn(6, arg6)

And at last:

    #define __my_syscall_args(N, ...) _my_syscall_args##N(__VA_ARGS__)
    #define _my_syscall_args(N, ...) __my_syscall_args(N, ##__VA_ARGS__)

    #define __my_syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
    #define _my_syscall_narg(...) __my_syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)

    #define __my_syscall_argsn(N, argn, ...) \
    	_my_syscall_args(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__); \
    	_my_syscall_argn(N, argn)
    
    #define _my_syscall_argsn(...) __my_syscall_argsn(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__)

    /* Note, my_syscall0() has no argument, can not use my_syscalln() */
    #define my_syscall0(num)                                                           \
    ({                                                                                 \
    	_my_syscall_head(num);                                                         \
    	_my_syscall_extra();                                                           \
    	_my_syscall_tail();                                                            \
    })
    
    #define my_syscalln(num, ...)                                                      \
    ({                                                                                 \
    	_my_syscall_head(num);                                                         \
    	_my_syscall_extra();                                                           \
    	_my_syscall_argsn(__VA_ARGS__);                                                \
    	_my_syscall_tail();                                                            \
    })
    
    #define my_syscall1(num, arg1)                               my_syscalln(num, arg1)
    #define my_syscall2(num, arg1, arg2)                         my_syscalln(num, arg2, arg1)
    #define my_syscall3(num, arg1, arg2, arg3)                   my_syscalln(num, arg3, arg2, arg1)
    #define my_syscall4(num, arg1, arg2, arg3, arg4)             my_syscalln(num, arg4, arg3, arg2, arg1)
    
    #ifndef my_syscall5
    #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)       my_syscalln(num, arg5, arg4, arg3, arg2, arg1)
    #endif
    #ifndef my_syscall6
    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1)
    #endif

At last, I found this worked on all of the supported architectures, so,
the new syscall.h is proposed.

BTW, another question here is, to utilize the feature of __VA_ARGS__ to
easier getting the last argument, the order of arguments are reversed
during the declarations of the my_syscall<N>, any suggestion on this
part? is it possible to not reverse the order? If possible, the
my_syscall<N> declarations may be simplified to:

    #define my_syscall1(...) my_syscalln(__VA_ARGS__)

And even be possible to define syscall() from unistd.h as a alias of
my_syscalln():

    #define syscall(...) my_syscalln(__VA_ARGS__)

And further, these three from unistd.h are sharable:

    #define __syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
    #define _syscall_narg(...) __syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
    #define _syscall_n(N, ...) _syscall(N, __VA_ARGS__)

> And when someone reports a bug
> like we had in the past with programs randomly crashing depending on
> stack alignment and such, it becomes particularly tricky to figure what
> is expected and how it differs from reality.
>

Macros are really hard to debug, the above code lines cost me two days ;-)

but after the common syscall.h, the left architecture specific parts are
very few and easier to debug and even less copy and paste.

> Maybe it's possible to do something in between, like defining a few
> more slightly larger blocks,

Yeah, here is the method we applied, new blocks added are:

    /* for ret and num */
    #define _my_syscall_head(num)                                           \

    /* for extra err output */
    #define _my_syscall_extra()                                             \

    /* for every argument, with additional OUTPUT/INPUT operands setting */
    #define _my_syscall_argn(n, arg)                                        \

    /* for the core syscall calling and return */
    #define _my_syscall_tail()                                              \

    /* for variable number of args */
    #define _my_syscall_args0()
    #define _my_syscall_args1(...) \
    #define _my_syscall_args2(arg2, ...) \
    #define _my_syscall_args3(arg3, ...) \
    #define _my_syscall_args4(arg4, ...) \
    #define _my_syscall_args5(arg5, ...) \
    #define _my_syscall_args6(arg6, ...) \

    /* unified variable number of args */
    #define _my_syscall_argsn(...)

    /* my_syscall0 */
    #define my_syscall0(num)                                                               \

    /* my_syscalln */
    #define my_syscalln(num, ...) \

Still require more discussion on their names or even more
simplification.

> I don't know. I still tend to think that
> this significantly complicates the understanding of the whole thing.
>

Willy, don't worry, I do think it make things easier, the worse case is
using the old code or only use part of our new blocks helpers ;-)

For this new syscall.h, it mainly clear the inline assembly codes, as we
know, inline assembly code is a very complicated thing, If we clear the
logic carefully (as we target but not yet) in our common code,
architecture developers only require to focus on the platform specific
definitions, it should be better for portability, review and
maintainability.

It is very hard to learn the meanning of the OUTPUT operands, INPUT
operands and even the clobber list and even the flags you mentioned
below, clearing up them is really required, this new syscall.h also
require more comments on the macro functions and variables. 

> Also, looking at different archs' syscall code, they're not all defined
> the same way. Some like i386 use "=a" for the return value. Others use
> "+r" as an input/output value, others "=r",

Agree, this is a very hard part of the inline assembly codes, clearing
up the generic versions in syscall.h with additional comments may really
help a lot.

For OUTPUT/INPUT operands, they may be enough for the generic versions:

     #define _my_syscall_tail()                                              \
    	__asm__ volatile (                                                  \
    		_NOLIBC_SYSCALL_CALL                                        \
    		: "=r"(_ret)                                                \
    		: "r"(_num)                                                 \
    		: _NOLIBC_SYSCALL_CLOBBERLIST                               \
    	);                                                                  \
    	_ret

    #ifdef _NOLIBC_REG_ERR
    #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR
    #endif
    
    #ifdef _NOLIBC_REG_EXTRA
    #define _my_syscall_extra() \
    	register long reg_extra __asm__ (_NOLIBC_REG_EXTRA);                   \
    	__asm__ volatile ("": "=r"(reg_extra)::)
    #else
    #define _my_syscall_extra()
    #endif


    /* Architectures like PowerPC write GPRS in kernel side and not restore them */
    #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("":: "r"(_arg##n):)
    #else
    #define _my_syscall_argn(n, arg)                                            \
    	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
    	__asm__ volatile ("": "+r"(_arg##n)::)
    #endif

- "=r" is only required for output registers, it is mainly for "RET"
  register and "ERR" register.

- "r" is for input, mainly for "NUM" register and the input arguments
  (sometimes, the input arguments are used as "RET" and "ERR" too).

- "+r" means both input and output, only used by powerpc currently.

  For the input registers used as "ERR" or "RET" output, "+r" is used
  before, but now, we split them to two parts, one is "=r"(_ret) in
  _my_syscall_tail(), another is in _my_syscall_argn(n, arg), they
  together work for "+r" = "=r" + "r"

Btw, have checked "=r" instead of "=a" works on i386 too for we already
bind the _ret variable with "RET" register, but still need to check if
"=a" is necessary?

> others "+d" or "=d". And
> reading the comments, there are some arch-specific choices for these
> declarations, that are sometimes there to force the toolchain a little
> bit.

Even for s390, test shows, "r" instead of "d" works too (boot and test
passed, not yet checked the size and codes generated), but I didn't find
any document about "d" for s390 from the gcc online doc. This part
(include all of the supported architectures) should be carefully checked
if really adding our new syscall.h. add s390 nolibc committer to CC: list.

If "r" really not works on a target, we can use the logic like this
(include syscall.h after architecture specific definitions):

    #ifndef my_syscall5
    #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)       my_syscalln(num, arg5, arg4, arg3, arg2, arg1)
    #endif
    #ifndef my_syscall6
    #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1)
    #endif

The blocks from syscalls.h are only provided as generic support, the
architectures can define their own versions. all of our new blocks can
be protected like my_syscall<N> above, for example:

    #ifndef _my_syscall_tail
    #define _my_syscall_tail ...
    #endif

But for s390 here, if it really requires "d" instead of "r", we are able
to allow architectures define their own modifier like this:

    #ifndef _NOLIBC_INLINE_ASM_MODIFIER
    #define _NOLIBC_INLINE_ASM_MODIFIER "r"
    #endif

Then, we can define _NOLIBC_INLINE_ASM_MODIFIER for s390:
    
    #define _NOLIBC_INLINE_ASM_MODIFIER "d"

But architectures like i386, If "=a", "=b", ... modifiers are necessary,
new versions of the blocks should be added for these architectures.

As a short summary for this part, the modifiers used by different
architectures should be carefully checked, welcome more suggestions from
the toolchains developers or the architecture maintainers, I will
compare the code generated too.

> Others like MIPS pass some of their args in the stack, so a name
> only is not sufficient. Thus, trying to factor all of this will not only
> make it very hard to insert arch-specific adjusments, but it will also
> make the code much less readable.
>

Currently, I didn't yet work on merging the 'stack' support, we used
'#ifdef's in syscall.h:

    #ifndef my_syscall5
    #define my_syscall5 ...
    #endif
    #ifndef my_syscall6
    #define my_syscall6 ...
    #endif

So, they are the same as before: 

    $ grep "#define my_syscall" -ur tools/include/nolibc/arch-*.h
    tools/include/nolibc/arch-i386.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
    tools/include/nolibc/arch-mips.h:#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)                        \
    tools/include/nolibc/arch-mips.h:#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)

Still require more work on how to support these ones, but it is not that
urgent ;-)

> Also think about this for a moment: you had the opportunity to add two
> new archs by simply restarting from existing ones. s390 and loongarch
> were added the same way, leaving enough freedom to the implementers to
> adjust the definitions to fit their environment's constraints.

I think this reserves as-is if the new arch-<ARCH>.h simply not include
our new syscalls.h, so, the new syscalls.h is optional, the worse case
is as-is, no regression, enough freedom as before ;-)

And further, some architectures may resue some helpers from our new syscalls.h
or at least learn something from what we have done for all of the supported
architectures.

And eventually, if our generic versions fits, just defining the
syscall instructions, registers and clobber list should be enough.

> If you
> make it very complicated, I suspect we won't get any such contributions
> anymore just because nobody figures how to find their way through it,
> or it would require to change again for everyone just to add one specific
> case. I tend to think that the current situation is already fairly
> minimal with quite readable and debuggable calls, and that it's what
> *really* matters.
>

This may really influence the upstream and review flow:

- If people may send a new architecture support, if just fits the new
  syscall.h, we may need to review carefully about the test results,
  especially for the input/output operands, error register.

  As tests for powerpc shows, the above issues should be covered by our
  nolibc-test.

- If people may send a new architecture support as before, If we find it
  fits our new syscalls.h or it can apply part of the blocks, we can
  give some suggestions.

- If people send something not just fit our new syscall.h, we may be
  able to think about if a new 'hook' is required, but it is not
  necessary, we can delay this change requirement to after a careful
  design (just like the argument passing via 'stack' case currently) .

> When you're trying to reorganize code, it's important to ask yourself
> whether you would prefer to debug the old one or the new one at 3am.
> Hint: at 3am, the more abstractions there are, the less understandable
> it becomes.
>

Interesting and agree, but this abstraction does clear something to be more
undersatndable too ;-)

I do hate hard-debuggable macros, but as we mentioned above, the inline
assembly code is another harder parts, the new carefully tuned blocks may really
help us to avoid introduce bugs with manually wrotten new codes and also it may
help us to avoid copy and paste multiple duplicated lines of the same codes. 

> > Willy, do we need to rename my_syscall<N> to _nolibc_syscall<N> to limit
> > these macros nolibc internally? I plan to rename all of the new adding
> > macros with _nolibc_ (for funcs) or _NOLIBC_ (for variables).
> 
> Why not, I'm not opposed to that. Just keep in mind that every single
> rename complicates backports (or may even silently break them), so it's
> important to know where you want to go and to try to make changes converge
> towards something stable and durable.

Yeah, let's keep the changes minimal, we could prefix the new macros
with _nolibc_ or _NOLIBC_, the old ones can be kept as-is.

Best regards,
Zhangjin

> 
> Thanks,
> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 4/8] selftests/nolibc: add extra config file customize support
  2023-07-22 12:00   ` Willy Tarreau
@ 2023-07-25 14:30     ` Zhangjin Wu
  2023-07-29  7:45       ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25 14:30 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, linux, thomas

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 2989 bytes --]

Hi, Willy

> On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
> > The default DEFCONFIG_<ARCH> may not always work for all architectures,
> > some architectures require to add extra kernel config options, this adds
> > a new 'extconfig' target for this requirement.
> > 
> > It allows to customize extra kernel config options via the common
> > common.config and the architecture specific <ARCH>.config, at last
> > trigger 'allnoconfig' to let them take effect with missing config
> > options as disabled.
> > 
> > The scripts/kconfig/merge_config.sh tool is used to merge the extra
> > config files.
> > 
> > Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
> > Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/
> > Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> > ---
> >  tools/testing/selftests/nolibc/Makefile | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
> > index f42adef87e12..08a5ca5f418b 100644
> > --- a/tools/testing/selftests/nolibc/Makefile
> > +++ b/tools/testing/selftests/nolibc/Makefile
> > @@ -39,6 +39,9 @@ DEFCONFIG_s390       = defconfig
> >  DEFCONFIG_loongarch  = defconfig
> >  DEFCONFIG            = $(DEFCONFIG_$(ARCH))
> >  
> > +# extra kernel config files under configs/, include common + architecture specific
> > +EXTCONFIG            = common.config $(ARCH).config
> > +
> >  # optional tests to run (default = all)
> >  TEST =
> >  
> > @@ -162,6 +165,10 @@ initramfs: nolibc-test
> >  defconfig:
> >  	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
> >  
> > +extconfig:
> > +	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
> > +	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> > +
> 
> Please also mention this entry in the "help" message.
> Other than that, OK.
>

Willy, as we discussed in another tinyconfig series, in order to avoid
add more complexity to users, I plan to drop this standalone 'extconfig'
target and move the extra config operations to defconfig target like
this:

    defconfig:
     	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
     	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
     	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig

This extra config options are really critical to default boot and test, so, it
should be a 'default' config action as the 'defconfig' target originally mean.

Will test carefully about this change, what's your idea?

Thanks,
Zhangjin

> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 11:02           ` Zhangjin Wu
@ 2023-07-25 14:45             ` Ammar Faizi
  2023-07-25 17:04               ` Zhangjin Wu
  2023-07-25 18:27             ` Willy Tarreau
  1 sibling, 1 reply; 37+ messages in thread
From: Ammar Faizi @ 2023-07-25 14:45 UTC (permalink / raw)
  To: Zhangjin Wu
  Cc: Willy Tarreau, Arnd Bergmann, Sven Schnelle,
	Thomas Weißschuh, Linux Kernel Mailing List,
	Linux Kselftest Mailing List

Hi Zhangjin, 

On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
> Btw, have checked "=r" instead of "=a" works on i386 too for we already
> bind the _ret variable with "RET" register, but still need to check if
> "=a" is necessary?

I need to tell you that syscall6() for i386 can't use "r" and "=r"
because there was a historical bug that made GCC stuck in a loop forever
when compiling the nolibc code. It's already fixed in the latest version
of GCC, but we should still support older compilers.

Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

I discovered that bug in 2022 in the latest version of GCC at that time,
so it's pretty new, and those buggy versions are very likely still in
the wild today.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 14:45             ` Ammar Faizi
@ 2023-07-25 17:04               ` Zhangjin Wu
  2023-07-25 18:23                 ` Ammar Faizi
  0 siblings, 1 reply; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25 17:04 UTC (permalink / raw)
  To: ammarfaizi2; +Cc: arnd, falcon, linux-kernel, linux-kselftest, svens, thomas, w

Hi,

> Hi Zhangjin, 
> 
> On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
> > Btw, have checked "=r" instead of "=a" works on i386 too for we already
> > bind the _ret variable with "RET" register, but still need to check if
> > "=a" is necessary?
> 
> I need to tell you that syscall6() for i386 can't use "r" and "=r"
> because there was a historical bug that made GCC stuck in a loop forever
> when compiling the nolibc code. It's already fixed in the latest version
> of GCC, but we should still support older compilers.
>

Thanks very much, this information is really important.

My old 'reply' is not rigorous, since the syscall6() uses stack to pass
the 6th argument, so, our new syscall.h didn't support it currently,
the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().

> Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032
> 
> I discovered that bug in 2022 in the latest version of GCC at that time,
> so it's pretty new, and those buggy versions are very likely still in
> the wild today.

Ok, so, with the new syscalls.h proposed, we'd better keep i386
syscall6() as-is.

For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?

Thanks,
Zhangjin

> 
> -- 
> Ammar Faizi

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 17:04               ` Zhangjin Wu
@ 2023-07-25 18:23                 ` Ammar Faizi
  2023-07-25 20:23                   ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Ammar Faizi @ 2023-07-25 18:23 UTC (permalink / raw)
  To: Zhangjin Wu
  Cc: Arnd Bergmann, Sven Schnelle, Thomas Weißschuh,
	Willy Tarreau, Linux Kernel Mailing List,
	Linux Kselftest Mailing List

On Wed, Jul 26, 2023 at 01:04:26AM +0800, Zhangjin Wu wrote:
> My old 'reply' is not rigorous, since the syscall6() uses stack to pass
> the 6th argument, so, our new syscall.h didn't support it currently,
> the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().

Yeah, it won't fit with the new design.

i386 runs out of GPRs very quickly. Given that, it had a hard time
implementing syscall6() properly in nolibc. The calling convention
itself actually doesn't require stack for executing 'int $0x80'.

The reason of why it uses stack is because the %ebp register cannot be
listed in the clobber list nor in the constraint if -fomit-frame-pointer
is not activated. Thus, we have to carefully preserve the value on the
stack before using %ebp as the 6-th argument to the syscall. It's a hack
to make it work on i386.

> Ok, so, with the new syscalls.h proposed, we'd better keep i386
> syscall6() as-is.
> 
> For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?

Using "=r" instead of "r" doesn't make sense.

Did you mean "=r" instead of "=a"?

If that's what you mean:

So far I don't see the risk of using "=r" instead of "=a" as long as the
variable is properly marked as 'register' + asm("eax").

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 11:02           ` Zhangjin Wu
  2023-07-25 14:45             ` Ammar Faizi
@ 2023-07-25 18:27             ` Willy Tarreau
  2023-07-25 20:52               ` Zhangjin Wu
  1 sibling, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-25 18:27 UTC (permalink / raw)
  To: Zhangjin Wu
  Cc: arnd, linux-kernel, linux-kselftest, thomas, svens, ammarfaizi2

Hi Zhangjin,

On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
> > > With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
> > > will only require to add ~10 lines to define their own syscall
> > > instructions, registers and clobberlist, which looks like this (for
> > > powerpc):
> > > 
> > >     #define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"
> > > 
> > >     /* PowerPC doesn't always restore r3-r12 for us */
> > >     #define _NOLIBC_SYSCALL_CLOBBERLIST 
> > >     	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"
> > > 
> > >     /* PowerPC write GPRS in kernel side but not restore them */
> > >     #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
> > >     
> > >     #define _NOLIBC_REG_NUM  "r0"
> > >     #define _NOLIBC_REG_RET  "r3"
> > >     #define _NOLIBC_REG_arg1 "r3"
> > >     #define _NOLIBC_REG_arg2 "r4"
> > >     #define _NOLIBC_REG_arg3 "r5"
> > >     #define _NOLIBC_REG_arg4 "r6"
> > >     #define _NOLIBC_REG_arg5 "r7"
> > >     #define _NOLIBC_REG_arg6 "r8"
> > > 
> > > Before:
> > > 
> > >     $ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
> > >     157 tools/include/nolibc/arch-aarch64.h
> > >     199 tools/include/nolibc/arch-arm.h
> > >     178 tools/include/nolibc/arch-i386.h
> > >     164 tools/include/nolibc/arch-loongarch.h
> > >     195 tools/include/nolibc/arch-mips.h
> > >     0 tools/include/nolibc/arch-powerpc.h
> > >     160 tools/include/nolibc/arch-riscv.h
> > >     186 tools/include/nolibc/arch-s390.h
> > >     176 tools/include/nolibc/arch-x86_64.h
> > > 
> > > After:
> > > 
> > >     $ wc -l tools/include/nolibc/arch-*.h
> > >        54 tools/include/nolibc/arch-aarch64.h
> > >        84 tools/include/nolibc/arch-arm.h
> > >        90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
> > >        59 tools/include/nolibc/arch-loongarch.h
> > >       120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
> > >        73 tools/include/nolibc/arch-powerpc.h
> > >        58 tools/include/nolibc/arch-riscv.h
> > >        87 tools/include/nolibc/arch-s390.h
> > >        67 tools/include/nolibc/arch-x86_64.h
> > > 
> > > syscall.h itself:
> > > 
> > >     $ wc -l tools/include/nolibc/syscall.h
> > >     112 tools/include/nolibc/syscall.h 
> > 
> > The important thing to consider is not the number of lines but the
> > *maintainability*.
> 
> The original goal is not really the number of lines (only a
> 'side-effect'), but is exactly easier porting/maintainability with
> clearer code architecture,

I do feel the exact opposite. One is totally straightforward with
self-explanatory function names and their equivalent machine-specific
asm() statements, the other one involves countless cryptic macros for
which it is particularly difficult to figure what depends on what.

> > You factored the syscall code so much above with all
> > these macros that I don't even understand how they're going to interact
> > with each other, especially "%0".
> 
> Yeah, it is my fault, this should be cleaned up with the return register
> directly:
> 
>     #define _NOLIBC_SYSCALL_CALL \
>     	"sc; bns+ 1f; neg 3, 3; 1:"

This doesn't change my point of view on it, really.

> > Also I don't know what the macro
> > _NOLIBC_GPRS_AS_OUTPUT_OPERANDS does.
> 
> This is the root cause to inspire me to add the new syscall.h, let's
> explain the background step by step.
> 
> All of the other architectures (except PowerPC) restore GPRS for us when
> return from syscall, so, their clobber list simply not include the GPRS
> and only need to add the input registers in the 'INPUT Operands' list.

I still have no idea what a GPRS is.

> But PowerPC doesn't restore such GPRS for us, I'm not sure if it is a
> feature (Maybe) or a bug.

We don't really care. The *exact* purpose of an asm() statement is to
write stuff that cannot be expressed at a higher level. Sure sometimes
you can abuse macros. But this should be extremely light. Here you seem
to be using a common asm statement for everyone and going to stuff the
combination of all these macros into it. asm() statements are already
quite cryptic for a lot of people, and the minimum required is that
they are easy to read so that the few who know what these are doing can
help debug them. When Ammar spotted the alignment bug in our _start
code, it didn't take long to figure the root cause of the issue nor to
fix it, precisely because that code was straightforward for someone with
a bit of asm skills. But how do you want anyone to figure what's happening
in something full of abstractions ? Look for example, in order to add
the stackprot support, Thomas just had to append a call at various
points. When you need to do that in factored code that's forcefully
arranged to try to suit all archs and toolchains at once, it may end up
being almost impossible without breaking the organization and starting
to create arch-specific definitions again.

> for PowerPC32, the following line will restore
> the GPRS for us, but may be not a good idea to do so for it may decrease
> the syscall performance although save some instructions in user-space
> and also, the other libcs also follow the current rule, so, this may be
> a design intention we must follow (welcome your suggestions).
> 
>     diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
>     index fe27d41f9a3d..1ed535e9144c 100644
>     --- a/arch/powerpc/kernel/entry_32.S
>     +++ b/arch/powerpc/kernel/entry_32.S
>     @@ -155,6 +155,7 @@ syscall_exit_finish:
>             bne     3f
>             mtcr    r5
>     
>     +       REST_GPRS(3, 12, r1)
>      1:     REST_GPR(2, r1)
>             REST_GPR(1, r1)
>             rfi

I don't know PPC and I have zero opinion there. For this we'll ask one
of the PPC maintainers for guidance, and since we have clean asm code,
they will easily be able to say "yes that's fine", "hmmm no that's not
the right way to do it", or "I suspect you forgot to save flags here",
or anything else, because the code will match a pattern they know well.
With all the macros hell, it will just be "hmmm good luck".

> For PowerPC, if with the previous method like the other architectures,
> the clobber list differs for every my_syscall<N> and all of the input
> registers must be put in the 'OUTPUT Operands' too to avoid compiler to
> save and resue a variable in such GPRS across my_syscall<N> calls.

But do you realize that you're proposing to write macros to factor things
between archs that are already different within a single arch ? How is
this supposed to help at doing anything ?

> Originally in my local new version of arch-powerpc.h, we got such code
> for every my_syscall<N>, use my_syscall6() as an example:
> 
>     #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
>     ({                                                                           \
>             register long _ret  __asm__ ("r3");                                  \
>             register long _num  __asm__ ("r0") = (num);                          \
>             register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
>             register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
>             register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
>             register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
>             register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
>             register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
>                                                                                  \
>             __asm__ volatile (                                                   \
>                     "       sc\n"                                                \
>                     "       bns+ 1f\n"                                           \
>                     "       neg  %0, %0\n"                                       \
>                     "1:\n"                                                       \
>                     : "=r"(_ret),                                                \
> 		      "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),                     \
>                       "+r"(_arg5), "+r"(_arg6)                                   \
>                     : "0"(_arg1), "r"(_num)                                      \
>                     : _NOLIBC_SYSCALL_CLOBBERLIST                                \
>             );                                                                   \
>             _ret;                                                                \
>     })
> 
> It almost aligns with the other architectures, but the full clobber list
> differs for every my_syscall<N>, the basic one is:
> 
>     /* PowerPC kernel doesn't always restore r4-r12 for us */
>     #define _NOLIBC_SYSCALL_CLOBBERLIST \
>         "memory", "cr0", "r12", "r11", "r10", "r9",
> 
> Use my_syscall0() as a further example, we need something like this:
> 
>     #define my_syscall0(num)                                                     \
>     ({                                                                           \
>             register long _ret  __asm__ ("r3");                                  \
>             register long _num  __asm__ ("r0") = (num);                          \
>                                                                                  \
>             __asm__ volatile (                                                   \
>                     "       sc\n"                                                \
>                     "       bns+ 1f\n"                                           \
>                     "       neg  %0, %0\n"                                       \
>                     "1:\n"                                                       \
>                     : "=r"(_ret)                                                 \
>                     : "r"(_num)                                                  \
>                     : _NOLIBC_SYSCALL_CLOBBERLIST, "r8", "r7", "r6", "r5", "r4"  \
>             );                                                                   \
>             _ret;                                                                \
>     })
> 
> The additional "r8"..."r4" must be appended to the clobber list for they
> can not be put together for every my_syscall<N> due to conflicts between
> they between the clobber list and the "OUTPUT/INPUT operands".

Perfect, yet another example of the real purpose of asm() statements.
They're not generic and are made to finely tune what you're inserting.

> I found a solution to share the same _NOLIBC_SYSCALL_CLOBBERLIST, that
> is split the "OUTPUT/INPUT Operands" list out of the core syscall
> assembly:
> 
>     #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
>     ({                                                                           \
>             register long _ret  __asm__ ("r3");                                  \
>             register long _num  __asm__ ("r0") = (num);                          \
>             register long _arg1 __asm__ ("r3") = (long)(arg1);                   \
>             register long _arg2 __asm__ ("r4") = (long)(arg2);                   \
>             register long _arg3 __asm__ ("r5") = (long)(arg3);                   \
>             register long _arg4 __asm__ ("r6") = (long)(arg4);                   \
>             register long _arg5 __asm__ ("r7") = (long)(arg5);                   \
>             register long _arg6 __asm__ ("r8") = (long)(arg6);                   \
>                                                                                  \
>             __asm__ volatile ("": "+r"(_arg2), "+r"(_arg3), "+r"(_arg4),         \
>                                   "+r"(_arg5), "+r"(_arg6)::);                   \
>                                                                                  \
>             __asm__ volatile (                                                   \
>                     "       sc\n"                                                \
>                     "       bns+ 1f\n"                                           \
>                     "       neg  %0, %0\n"                                       \
>                     "1:\n"                                                       \
>                     : "=r"(_ret)                                                 \
>                     : "0"(_arg1), "r"(_num)                                      \
>                     : _NOLIBC_SYSCALL_CLOBBERLIST                                \
>             );                                                                   \
>             _ret;                                                                \
>     })

So basically "it happens to work but we don't know why, but this is still
much more maintainable" ? Please no, really no, no, no. That's ugly,
tricky and you don't even know what the compiler will do between these
two statements.

> Note, a question here is, the above split still require more discussion
> to make sure it does work for different toolchains (only test for gcc
> currently) or even if this method is right from scratch, welcome your
> suggestion.

asm() statements are used to work around toolchain limitations/differences
and bugs. I've seen Ammar's response with the link to the gcc bug, that's
a good example as well of the reasons why we MUST NOT do these hacks.

> As a result, all of the my_syscall<N> are able to share the core syscall
> calling assembly block, so, here is what we at last have:
> 
>     #define _my_syscall_tail()                                              \
>     	__asm__ volatile (                                                  \
>     		_NOLIBC_SYSCALL_CALL                                        \
>     		: "=r"(_ret)                                                \
>     		: "r"(_num)                                                 \
>     		: _NOLIBC_SYSCALL_CLOBBERLIST                               \
>     	);                                                                  \
>     	_ret
> 
> And further, we also found it was possible to share most of them among
> these not ugly but completely duplicated lines:

But please, from the beginning, all I understand is "it is possible to",
but I still fail to understand the ultimate goal. Making the code uglier
and unmaintainable because it is possible is not a valid argument. For
doing stuff like above, there must be a serious limitation to work around
that has no other solution, and even then a huge #if/#endif could possibly
do it.

> That's why at last we have such blocks (of course, for PowerPC itself
> it is a big change and not necessary):
> 
>     #define _my_syscall_head(num)                                               \
>     	register long _ret __asm__ (_NOLIBC_REG_RET);                           \
>     	register long _num __asm__ (_NOLIBC_REG_NUM) = (num)                    \
>     
>     #ifdef _NOLIBC_REG_ERR
>     #define _NOLIBC_REG_EXTRA _NOLIBC_REG_ERR
>     #endif
>     
>     #ifdef _NOLIBC_REG_EXTRA
>     #define _my_syscall_extra() \
>     	register long reg_extra __asm__ (_NOLIBC_REG_EXTRA);                   \
>     	__asm__ volatile ("": "=r"(reg_extra)::)
>     #else
>     #define _my_syscall_extra()
>     #endif
>     
>     /* Architectures like PowerPC write GPRS in kernel side and not restore them */
>     #ifndef _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
>     #define _my_syscall_argn(n, arg)                                            \
>     	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
>     	__asm__ volatile ("":: "r"(_arg##n):)
>     #else
>     #define _my_syscall_argn(n, arg)                                            \
>     	register long _arg##n __asm__ (_NOLIBC_REG_arg##n) = (long)(arg);       \
>     	__asm__ volatile ("": "+r"(_arg##n)::)
>     #endif

And someone is able to help us work around a compiler or assembler bug
in this ? I can't even spend enough concentration on the whole block to
understand what it's trying to do or what interacts with what. I'm sorry,
that's not a way to deal with asm, nor code shared with multiple developers
in general.

(...)
> And at last:
> 
>     #define __my_syscall_args(N, ...) _my_syscall_args##N(__VA_ARGS__)
>     #define _my_syscall_args(N, ...) __my_syscall_args(N, ##__VA_ARGS__)
> 
>     #define __my_syscall_narg(_0, _1, _2, _3, _4, _5, _6, N, ...) N
>     #define _my_syscall_narg(...) __my_syscall_narg(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)
> 
>     #define __my_syscall_argsn(N, argn, ...) \
>     	_my_syscall_args(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__); \
>     	_my_syscall_argn(N, argn)
>     
>     #define _my_syscall_argsn(...) __my_syscall_argsn(_my_syscall_narg(NULL, ##__VA_ARGS__), ##__VA_ARGS__)
> 
>     /* Note, my_syscall0() has no argument, can not use my_syscalln() */
>     #define my_syscall0(num)                                                           \
>     ({                                                                                 \
>     	_my_syscall_head(num);                                                         \
>     	_my_syscall_extra();                                                           \
>     	_my_syscall_tail();                                                            \
>     })
>     
>     #define my_syscalln(num, ...)                                                      \
>     ({                                                                                 \
>     	_my_syscall_head(num);                                                         \
>     	_my_syscall_extra();                                                           \
>     	_my_syscall_argsn(__VA_ARGS__);                                                \
>     	_my_syscall_tail();                                                            \
>     })
>     
>     #define my_syscall1(num, arg1)                               my_syscalln(num, arg1)
>     #define my_syscall2(num, arg1, arg2)                         my_syscalln(num, arg2, arg1)
>     #define my_syscall3(num, arg1, arg2, arg3)                   my_syscalln(num, arg3, arg2, arg1)
>     #define my_syscall4(num, arg1, arg2, arg3, arg4)             my_syscalln(num, arg4, arg3, arg2, arg1)
>     
>     #ifndef my_syscall5
>     #define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)       my_syscalln(num, arg5, arg4, arg3, arg2, arg1)
>     #endif
>     #ifndef my_syscall6
>     #define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6) my_syscalln(num, arg6, arg5, arg4, arg3, arg2, arg1)
>     #endif
> 
> At last, I found this worked on all of the supported architectures, so,
> the new syscall.h is proposed.

If the ultimate goal is *just* to provide my_syscalln(), it's not needed
to rework all archs like this. Just doing this does the job as well, it
will allow my_syscalln(syscall_num, ...) to call the respective
my_syscall0/1/2/3/4/5/6 depending on the number of arguments:

  /* my_syscalln() will automatically map to my_syscall<n>() depending
   * on the number of arguments after the syscall, from 0 to 6. It uses
   * positional arguments after a VA_ARGS to resolve as an argument
   * count that's then used to build the underlying macro's name.
   */
  #define _my_syscall0(num, a, b, c, d, e, f) my_syscall0(num)
  #define _my_syscall1(num, a, b, c, d, e, f) my_syscall1(num, a)
  #define _my_syscall2(num, a, b, c, d, e, f) my_syscall2(num, a, b)
  #define _my_syscall3(num, a, b, c, d, e, f) my_syscall3(num, a, b, c)
  #define _my_syscall4(num, a, b, c, d, e, f) my_syscall4(num, a, b, c, d)
  #define _my_syscall5(num, a, b, c, d, e, f) my_syscall5(num, a, b, c, d, e)
  #define _my_syscall6(num, a, b, c, d, e, f) my_syscall6(num, a, b, c, d, e, f)
  #define _my_syscalln(num, a, b, c, d, e, f, g, ...) _my_syscall##g(num, a, b, c, d, e, f)
  #define my_syscalln(num, ...) _my_syscalln(num, ##__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0)

This way there's no need to modify the arch-specific syscall definitions,
this will simply rely on them and preserve their maintainability.

> BTW, another question here is, to utilize the feature of __VA_ARGS__ to
> easier getting the last argument, the order of arguments are reversed
> during the declarations of the my_syscall<N>, any suggestion on this
> part? is it possible to not reverse the order?

There's no reverse order, it's a well-known method consisting in making
a number appear at a fixed position depending on the number of preceeding
arguments:

   my_syscalln(n, a, b, c, d, e, f) becomes
  _my_syscalln(n, a, b, c, d, e, f, 6, 5, 4, 3, 2, 1, 0)
                                    ^
        This macro extracts this number above to build the next macro name:
  _my_syscall6(n, a, b, c, d, e, f)
   my_syscall6(n, a, b, c, d, e, f)

If you use less arguments, say 3, you get this:

   my_syscalln(n, a, b, c)
  _my_syscalln(n, a, b, c, 6, 5, 4, 3, 2, 1, 0)
  _my_syscall3(n, a, b, c, d, e, f)
   my_syscall3(n, a, b, c)  // d, e, f are lost

The last level of macro is used to silently drop the extra args. When
target is already a macro, it's not even necessary as the macro
definition could already end with ", ...".

I've been using a similar one above in other projects for quite a while
and I know that it worked at least in gcc-3.4, so it's definitely safe.

> > And when someone reports a bug
> > like we had in the past with programs randomly crashing depending on
> > stack alignment and such, it becomes particularly tricky to figure what
> > is expected and how it differs from reality.
> >
> 
> Macros are really hard to debug, the above code lines cost me two days ;-)

Someone once said that it requires a must stronger brain to solve a problem
than the one that created it. If it took you two days to arrange this,
imagine how long it will take to someone having not designed this to debug
it! The time it took you is definitely not an argument for adopting this,
quite the opposite. Instead it should have convinced you that this was
going to become unmaintainable.

> but after the common syscall.h, the left architecture specific parts are
> very few and easier to debug and even less copy and paste.

Copy-paste is a problem when bugs need to be fixed. Here there's almost no
copy-paste, copy-paste is done initially to create a new arch but in fact
we're reusing a skeletton to write completely different code. Because code
for PPC and MIPS are different, there's no point imagining that once a bug
affects MIPS we need to automatically apply the same fix to PPC, because
it will be different.

> > I don't know. I still tend to think that
> > this significantly complicates the understanding of the whole thing.
> >
> 
> Willy, don't worry, I do think it make things easier, the worse case is
> using the old code or only use part of our new blocks helpers ;-)

Sorry and don't take it bad, I don't want you to feel it as being rude,
but for me the worst case would be to use the new method precisely
because for now only you probably know how that's supposed to work and
nobody can help us with side effects affecting it.

> For this new syscall.h, it mainly clear the inline assembly codes, as we
> know, inline assembly code is a very complicated thing,

That's why it must remain crystal clear.

> If we clear the
> logic carefully (as we target but not yet) in our common code,
> architecture developers only require to focus on the platform specific
> definitions, it should be better for portability, review and
> maintainability.

In order to work on assembly you first need to be able to locate it
and read it as a sequence of instructions. Here really, you need a
pen and paper to start to resolve it.

> It is very hard to learn the meanning of the OUTPUT operands, INPUT
> operands and even the clobber list and even the flags you mentioned
> below,

One more reason for not passing through this!

> > Also, looking at different archs' syscall code, they're not all defined
> > the same way. Some like i386 use "=a" for the return value. Others use
> > "+r" as an input/output value, others "=r",
> 
> Agree, this is a very hard part of the inline assembly codes, clearing
> up the generic versions in syscall.h with additional comments may really
> help a lot.

No precisely not. It's a hard part for people who don't deal with *that*
arch. But you will not find a developer at ease with all archs. Each one
has its own specifics. However you will find one or a few developers that
are experts on each architecture and who will instantly be able to correct
some of our mistakes, or warn us against toolchain bugs they're aware of
that we need to take care of. They will also know which constraints to
use. The constraint definitions are per-architecture, and for example "a"
on x86 is the accumulator (eax/rax). On other archs it can be something
else. There are archs which support register pairs, some must be aligned
on an even number and depending on how the calls are declared, the compiler
may improperly assign them and emit code that is impossible to assemble
(I've met this several times with the initial ARM code that we managed
to stabilize).

>   For the input registers used as "ERR" or "RET" output, "+r" is used
>   before, but now, we split them to two parts, one is "=r"(_ret) in
>   _my_syscall_tail(), another is in _my_syscall_argn(n, arg), they
>   together work for "+r" = "=r" + "r"

Please do not generalize. The example I gave above indicate stuff that
was initially hard to adjust precisely to help the compiler emit correct
code with a wide enough range of tools. The example Ammar talked about
is a perfect such example.

> Even for s390, test shows, "r" instead of "d" works too

With your toolchain and the code you tested with. There might be a
particular reason, I don't know. Maybe the maintainer is used to
using this because it also works this way on another compiler and
he will be more fluent with this one. That's something important as
well when dealing with asm statements.

> (boot and test
> passed, not yet checked the size and codes generated), but I didn't find
> any document about "d" for s390 from the gcc online doc. This part
> (include all of the supported architectures) should be carefully checked
> if really adding our new syscall.h. add s390 nolibc committer to CC: list.

I do trust the s390 maintainers who contributed this code to know better
than either you and me, and it's certainly not up to us to ask them to
justify their choice. Actually it would be the other way around, you
would need a solid argument for changing code that works.

> But architectures like i386, If "=a", "=b", ... modifiers are necessary,
> new versions of the blocks should be added for these architectures.

You'll just end up with as many blocks at the end, but dealing only with
a union of exception. That's exactly the worst that can be imagined for
maintenance.

> And further, some architectures may resue some helpers from our new syscalls.h
> or at least learn something from what we have done for all of the supported
> architectures.

The arch-specific code is already minimal. We have 7 asm statements for
7 syscall conventions. That's ridiculously low and they contain what
any such arch maintainer would need to find to extend or fix them.

> > If you
> > make it very complicated, I suspect we won't get any such contributions
> > anymore just because nobody figures how to find their way through it,
> > or it would require to change again for everyone just to add one specific
> > case. I tend to think that the current situation is already fairly
> > minimal with quite readable and debuggable calls, and that it's what
> > *really* matters.
> >
> 
> This may really influence the upstream and review flow:
> 
> - If people may send a new architecture support, if just fits the new
>   syscall.h, we may need to review carefully about the test results,
>   especially for the input/output operands, error register.

First they'd need to be able to figure what to put in what. Look, they
know what are the 3 instructions they need to put in an asm statement and
the list of registers, and suddenly they'd need to figure how to spread
them into cryptic macros, some of which are sometimes used, others always
etc. That turns a 20-minute test into half a day, without big assurance
at the end of the day that everything is right.

>   As tests for powerpc shows, the above issues should be covered by our
>   nolibc-test.
> 
> - If people may send a new architecture support as before, If we find it
>   fits our new syscalls.h or it can apply part of the blocks, we can
>   give some suggestions.
>
> - If people send something not just fit our new syscall.h, we may be
>   able to think about if a new 'hook' is required, but it is not
>   necessary, we can delay this change requirement to after a careful
>   design (just like the argument passing via 'stack' case currently) .

That's already the case with i386, s390 and so on. Except that it adds
significant burden for that person.

> > When you're trying to reorganize code, it's important to ask yourself
> > whether you would prefer to debug the old one or the new one at 3am.
> > Hint: at 3am, the more abstractions there are, the less understandable
> > it becomes.
> >
> 
> Interesting and agree, but this abstraction does clear something to be more
> undersatndable too ;-)

It's really the first time I hear that abstractions makes one-liner ASM
code clearer and more understandable. I'm sorry but not, really, that's
exactly the opposite.

> I do hate hard-debuggable macros, but as we mentioned above, the inline
> assembly code is another harder parts, the new carefully tuned blocks may really
> help us to avoid introduce bugs with manually wrotten new codes and also it may
> help us to avoid copy and paste multiple duplicated lines of the same codes. 

No, the asm blocks are trivial for those who speak this language and are
hard for other ones. The macros are significantly harder and for everyone.
I prefer to ask an s390 or PPC maintainer when I need help with their code
rather than tweak the generic code adding a "+r" for every arch then read
about reports saying that this arch breaks with that version of the
compiler on that program with that version of the assembler.

Please again, don't take any of this personally, I'm just feeling that
you tried to address a difficulty to dig into some arch-specific code,
that you wanted to hide and that you feel like it is more maintainable,
but it's not. Maintainability in a shared project doesn't mean that you
are suddenly skilled on everything, but that you are able to find someone
skilled on your problem. It's not necessarily your task to debug an
architecture you don't know (though it's often very instructive), there
are other people for this and that's perfectly fine. We need to make the
task easy for them so that they don't have to learn all the nolibc tricks
to share their knowledge. In the current form with the asm statements
it's perfectly feasible and that's what matters.

Hoping this clarifies my position on this.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 18:23                 ` Ammar Faizi
@ 2023-07-25 20:23                   ` Zhangjin Wu
  0 siblings, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25 20:23 UTC (permalink / raw)
  To: ammarfaizi2; +Cc: arnd, falcon, linux-kernel, linux-kselftest, svens, thomas, w

> On Wed, Jul 26, 2023 at 01:04:26AM +0800, Zhangjin Wu wrote:
> > My old 'reply' is not rigorous, since the syscall6() uses stack to pass
> > the 6th argument, so, our new syscall.h didn't support it currently,
> > the syscalls I have tested about "=r" instead of "=a" were only syscall1-5().
> 
> Yeah, it won't fit with the new design.
> 
> i386 runs out of GPRs very quickly. Given that, it had a hard time
> implementing syscall6() properly in nolibc. The calling convention
> itself actually doesn't require stack for executing 'int $0x80'.
> 
> The reason of why it uses stack is because the %ebp register cannot be
> listed in the clobber list nor in the constraint if -fomit-frame-pointer
> is not activated. Thus, we have to carefully preserve the value on the
> stack before using %ebp as the 6-th argument to the syscall. It's a hack
> to make it work on i386.
> 
> > Ok, so, with the new syscalls.h proposed, we'd better keep i386
> > syscall6() as-is.
> > 
> > For the left syscall1-5(), is there any risk when use '=r' instead of 'r'?
> 
> Using "=r" instead of "r" doesn't make sense.
> 
> Did you mean "=r" instead of "=a"?
>

Yeah, sorry.

> If that's what you mean:
> 
> So far I don't see the risk of using "=r" instead of "=a" as long as the
> variable is properly marked as 'register' + asm("eax").
>

Thanks.
Zhangjin

> -- 
> Ammar Faizi

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/8] tools/nolibc: add support for powerpc
  2023-07-25 18:27             ` Willy Tarreau
@ 2023-07-25 20:52               ` Zhangjin Wu
  0 siblings, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-25 20:52 UTC (permalink / raw)
  To: w; +Cc: ammarfaizi2, arnd, falcon, linux-kernel, linux-kselftest, svens, thomas

Hi Willy,

> On Tue, Jul 25, 2023 at 07:02:55PM +0800, Zhangjin Wu wrote:
> > > > With this new syscall.h, to support my_syscall<N>, the arch-<ARCH>.h
> > > > will only require to add ~10 lines to define their own syscall
> > > > instructions, registers and clobberlist, which looks like this (for
> > > > powerpc):
> > > > 
> > > >     #define _NOLIBC_SYSCALL_CALL "sc; bns+ 1f; neg  %0, %0; 1:"
> > > > 
> > > >     /* PowerPC doesn't always restore r3-r12 for us */
> > > >     #define _NOLIBC_SYSCALL_CLOBBERLIST 
> > > >     	"memory", "cr0", "r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4"
> > > > 
> > > >     /* PowerPC write GPRS in kernel side but not restore them */
> > > >     #define _NOLIBC_GPRS_AS_OUTPUT_OPERANDS
> > > >     
> > > >     #define _NOLIBC_REG_NUM  "r0"
> > > >     #define _NOLIBC_REG_RET  "r3"
> > > >     #define _NOLIBC_REG_arg1 "r3"
> > > >     #define _NOLIBC_REG_arg2 "r4"
> > > >     #define _NOLIBC_REG_arg3 "r5"
> > > >     #define _NOLIBC_REG_arg4 "r6"
> > > >     #define _NOLIBC_REG_arg5 "r7"
> > > >     #define _NOLIBC_REG_arg6 "r8"
> > > > 
> > > > Before:
> > > > 
> > > >     $ ls tools/include/nolibc/arch-*.h | while read f; do git show dfef4fc45d5713eb23d87f0863aff9c33bd4bfaf:$f 2>/dev/null | wc -l | tr -d '\n'; echo " $f"; done
> > > >     157 tools/include/nolibc/arch-aarch64.h
> > > >     199 tools/include/nolibc/arch-arm.h
> > > >     178 tools/include/nolibc/arch-i386.h
> > > >     164 tools/include/nolibc/arch-loongarch.h
> > > >     195 tools/include/nolibc/arch-mips.h
> > > >     0 tools/include/nolibc/arch-powerpc.h
> > > >     160 tools/include/nolibc/arch-riscv.h
> > > >     186 tools/include/nolibc/arch-s390.h
> > > >     176 tools/include/nolibc/arch-x86_64.h
> > > > 
> > > > After:
> > > > 
> > > >     $ wc -l tools/include/nolibc/arch-*.h
> > > >        54 tools/include/nolibc/arch-aarch64.h
> > > >        84 tools/include/nolibc/arch-arm.h
> > > >        90 tools/include/nolibc/arch-i386.h                        /* the last one use stack to pass arguments, reserve as-is */
> > > >        59 tools/include/nolibc/arch-loongarch.h
> > > >       120 tools/include/nolibc/arch-mips.h                        /* the last two use stack to pass arguments, reserve as-is */
> > > >        73 tools/include/nolibc/arch-powerpc.h
> > > >        58 tools/include/nolibc/arch-riscv.h
> > > >        87 tools/include/nolibc/arch-s390.h
> > > >        67 tools/include/nolibc/arch-x86_64.h
> > > > 
> > > > syscall.h itself:
> > > > 
> > > >     $ wc -l tools/include/nolibc/syscall.h
> > > >     112 tools/include/nolibc/syscall.h 
> > > 
>
> [...]
>
> Hoping this clarifies my position on this.
>

Willy, Thanks very much for your detailed reply, based on your reply, I
plan to renew the powerpc patchset itself at first since both you and
Thomas have already reviewed it carefully.

After that, I will come back to read your reply again and discuss more
about our new syscall.h, I still think it is something valuable to take
a look at, although something about it still need more attention,
perhaps a RFC patchset is better for more discuss, it may show us the
profile easily.

Best regards,
Zhangjin

> Thanks,
> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 4/8] selftests/nolibc: add extra config file customize support
  2023-07-25 14:30     ` Zhangjin Wu
@ 2023-07-29  7:45       ` Willy Tarreau
  2023-07-29  9:43         ` Zhangjin Wu
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2023-07-29  7:45 UTC (permalink / raw)
  To: Zhangjin Wu; +Cc: arnd, linux-kernel, linux-kselftest, linux, thomas

Hi Zhangjin,

On Tue, Jul 25, 2023 at 10:30:06PM +0800, Zhangjin Wu wrote:
> Hi, Willy
> 
> > On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
> > > The default DEFCONFIG_<ARCH> may not always work for all architectures,
> > > some architectures require to add extra kernel config options, this adds
> > > a new 'extconfig' target for this requirement.
> > > 
> > > It allows to customize extra kernel config options via the common
> > > common.config and the architecture specific <ARCH>.config, at last
> > > trigger 'allnoconfig' to let them take effect with missing config
> > > options as disabled.
> > > 
> > > The scripts/kconfig/merge_config.sh tool is used to merge the extra
> > > config files.
> > > 
> > > Suggested-by: Thomas Wei?schuh <linux@weissschuh.net>
> > > Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/
> > > Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> > > ---
> > >  tools/testing/selftests/nolibc/Makefile | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
> > > index f42adef87e12..08a5ca5f418b 100644
> > > --- a/tools/testing/selftests/nolibc/Makefile
> > > +++ b/tools/testing/selftests/nolibc/Makefile
> > > @@ -39,6 +39,9 @@ DEFCONFIG_s390       = defconfig
> > >  DEFCONFIG_loongarch  = defconfig
> > >  DEFCONFIG            = $(DEFCONFIG_$(ARCH))
> > >  
> > > +# extra kernel config files under configs/, include common + architecture specific
> > > +EXTCONFIG            = common.config $(ARCH).config
> > > +
> > >  # optional tests to run (default = all)
> > >  TEST =
> > >  
> > > @@ -162,6 +165,10 @@ initramfs: nolibc-test
> > >  defconfig:
> > >  	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
> > >  
> > > +extconfig:
> > > +	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
> > > +	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> > > +
> > 
> > Please also mention this entry in the "help" message.
> > Other than that, OK.
> >
> 
> Willy, as we discussed in another tinyconfig series, in order to avoid
> add more complexity to users, I plan to drop this standalone 'extconfig'
> target and move the extra config operations to defconfig target like
> this:
> 
>     defconfig:
>      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
>      	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
>      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> 
> This extra config options are really critical to default boot and test, so, it
> should be a 'default' config action as the 'defconfig' target originally mean.
> 
> Will test carefully about this change, what's your idea?

Well, *iff* we need to have per-arch config settings, probably to better
support Qemu, the console output, etc, then indeed we'd rather add them
to all configs indeed. However the example above is bogus. First you
create a default config, then prepare the rest of the kernel, then
merge that config with the extensions (based on the arch and not the
sub-arch BTW) then erase everything to restart from an allnoconfig.

Also if you're using merge_config you'll need -Q to disable warning
about overridden options since you're starting from a defconfig which
contains plenty of them. Usually I just do defconfig, append the few
changes, then oldconfig and that's done. But merge_config can probably
be fine as well. Also make prepare should be done last, to make sure
that if it depends on anything in the config (e.g. 32 vs 64 bit etc)
it's up to date.

Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 4/8] selftests/nolibc: add extra config file customize support
  2023-07-29  7:45       ` Willy Tarreau
@ 2023-07-29  9:43         ` Zhangjin Wu
  0 siblings, 0 replies; 37+ messages in thread
From: Zhangjin Wu @ 2023-07-29  9:43 UTC (permalink / raw)
  To: w; +Cc: arnd, falcon, linux-kernel, linux-kselftest, linux, thomas

> Hi Zhangjin,
> 
> On Tue, Jul 25, 2023 at 10:30:06PM +0800, Zhangjin Wu wrote:
> > Hi, Willy
> > 
> > > On Wed, Jul 19, 2023 at 05:14:07AM +0800, Zhangjin Wu wrote:
> > > > The default DEFCONFIG_<ARCH> may not always work for all architectures,
> > > > some architectures require to add extra kernel config options, this adds
> > > > a new 'extconfig' target for this requirement.
> > > > 
> > > > It allows to customize extra kernel config options via the common
> > > > common.config and the architecture specific <ARCH>.config, at last
> > > > trigger 'allnoconfig' to let them take effect with missing config
> > > > options as disabled.
> > > > 
> > > > The scripts/kconfig/merge_config.sh tool is used to merge the extra
> > > > config files.
> > > > 
> > > > Suggested-by: Thomas Wei?schuh <linux@weissschuh.net>
> > > > Link: https://lore.kernel.org/lkml/67eb70d4-c9ff-4afc-bac7-7f36cc2c81bc@t-8ch.de/
> > > > Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> > > > ---
> > > >  tools/testing/selftests/nolibc/Makefile | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
> > > > index f42adef87e12..08a5ca5f418b 100644
> > > > --- a/tools/testing/selftests/nolibc/Makefile
> > > > +++ b/tools/testing/selftests/nolibc/Makefile
> > > > @@ -39,6 +39,9 @@ DEFCONFIG_s390       = defconfig
> > > >  DEFCONFIG_loongarch  = defconfig
> > > >  DEFCONFIG            = $(DEFCONFIG_$(ARCH))
> > > >  
> > > > +# extra kernel config files under configs/, include common + architecture specific
> > > > +EXTCONFIG            = common.config $(ARCH).config
> > > > +
> > > >  # optional tests to run (default = all)
> > > >  TEST =
> > > >  
> > > > @@ -162,6 +165,10 @@ initramfs: nolibc-test
> > > >  defconfig:
> > > >  	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
> > > >  
> > > > +extconfig:
> > > > +	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
> > > > +	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> > > > +
> > > 
> > > Please also mention this entry in the "help" message.
> > > Other than that, OK.
> > >
> > 
> > Willy, as we discussed in another tinyconfig series, in order to avoid
> > add more complexity to users, I plan to drop this standalone 'extconfig'
> > target and move the extra config operations to defconfig target like
> > this:
> > 
> >     defconfig:
> >      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG) prepare
> >      	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
> >      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
> > 
> > This extra config options are really critical to default boot and test, so, it
> > should be a 'default' config action as the 'defconfig' target originally mean.
> > 
> > Will test carefully about this change, what's your idea?
> 
> Well, *iff* we need to have per-arch config settings, probably to better
> support Qemu, the console output, etc, then indeed we'd rather add them
> to all configs indeed. However the example above is bogus. First you
> create a default config, then prepare the rest of the kernel , then
> merge that config with the extensions (based on the arch and not the
> sub-arch BTW) then erase everything to restart from an allnoconfig.
>

allnoconfig is based on the "$(srctree)/.config" generated by defconfig
plus the extra ARCH specific config options, so, it should be additional?

Btw, what do you mean? "based on the arch and not the sub-arch BTW", in
reality, the ARCH will be changed to XARCH, so, this is sub-arch/variant
specific.

> Also if you're using merge_config you'll need -Q to disable warning
> about overridden options since you're starting from a defconfig which
> contains plenty of them.

Yeah, we do need this to silence some warnings.

> Usually I just do defconfig, append the few
> changes, then oldconfig and that's done.

I have used olddefconfig (similar to oldconfig but without prompts)
before, but allnoconfig is used to mainly to only compile the ones we
explicitly enable, with the new additional options as No, which may be
more deterministic, it should not touch the old configured ones in
.config? I have learned its usage from the merge_config.sh

> But merge_config can probably
> be fine as well. Also make prepare should be done last, to make sure
> that if it depends on anything in the config (e.g. 32 vs 64 bit etc)
> it's up to date.
>

Ok, let's move 'prepare' at the end of all above.

And also add the -Q option to silence the potential warnings:

    defconfig:
      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) mrproper $(DEFCONFIG)
      	$(Q)$(srctree)/scripts/kconfig/merge_config.sh -Q -O "$(srctree)" -m "$(srctree)/.config" $(foreach c,$(EXTCONFIG),$(wildcard $(CURDIR)/configs/$c))
      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) KCONFIG_ALLCONFIG="$(srctree)/.config" allnoconfig
      	$(Q)$(MAKE) -C $(srctree) ARCH=$(ARCH) CC=$(CC) CROSS_COMPILE=$(CROSS_COMPILE) prepare

We may still need to clear the usage of allnoconfig, In my test, it does
work as I explained above. If I really misused it, let's change it back
to olddefconfig.

Thanks,
Zhangjin

> Willy

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-07-29  9:44 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-18 21:09 [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Zhangjin Wu
2023-07-18 21:10 ` [PATCH v1 1/8] tools/nolibc: add support for powerpc Zhangjin Wu
2023-07-23  7:32   ` Thomas Weißschuh
2023-07-23  8:15     ` Willy Tarreau
2023-07-25  5:44       ` Zhangjin Wu
2023-07-25  6:29         ` Willy Tarreau
2023-07-25 11:02           ` Zhangjin Wu
2023-07-25 14:45             ` Ammar Faizi
2023-07-25 17:04               ` Zhangjin Wu
2023-07-25 18:23                 ` Ammar Faizi
2023-07-25 20:23                   ` Zhangjin Wu
2023-07-25 18:27             ` Willy Tarreau
2023-07-25 20:52               ` Zhangjin Wu
2023-07-25  6:14     ` Zhangjin Wu
2023-07-18 21:11 ` [PATCH v1 2/8] tools/nolibc: add support for powerpc64 Zhangjin Wu
2023-07-18 21:13 ` [PATCH v1 3/8] selftests/nolibc: select_null: fix up for big endian powerpc64 Zhangjin Wu
2023-07-18 22:17   ` Thomas Weißschuh
2023-07-18 23:56     ` Zhangjin Wu
2023-07-19  4:33       ` Willy Tarreau
2023-07-19  6:49         ` Zhangjin Wu
2023-07-19 20:25           ` Willy Tarreau
2023-07-20  6:11           ` Thomas Weißschuh
2023-07-18 21:14 ` [PATCH v1 4/8] selftests/nolibc: add extra config file customize support Zhangjin Wu
2023-07-22 12:00   ` Willy Tarreau
2023-07-25 14:30     ` Zhangjin Wu
2023-07-29  7:45       ` Willy Tarreau
2023-07-29  9:43         ` Zhangjin Wu
2023-07-18 21:15 ` [PATCH v1 5/8] selftests/nolibc: add XARCH and ARCH mapping support Zhangjin Wu
2023-07-22 12:03   ` Willy Tarreau
2023-07-18 21:16 ` [PATCH v1 6/8] selftests/nolibc: add test support for powerpc Zhangjin Wu
2023-07-18 21:17 ` [PATCH v1 7/8] selftests/nolibc: add test support for powerpc64le Zhangjin Wu
2023-07-22 12:07   ` Willy Tarreau
2023-07-18 21:18 ` [PATCH v1 8/8] selftests/nolibc: add test support for powerpc64 Zhangjin Wu
2023-07-22 12:10   ` Willy Tarreau
2023-07-25  5:50     ` Zhangjin Wu
2023-07-25  6:02       ` Willy Tarreau
2023-07-23  7:47 ` [PATCH v1 0/8] tools/nolibc: add 32/64-bit powerpc support Thomas Weißschuh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.