linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls
@ 2021-05-18 19:12 H. Peter Anvin
  2021-05-18 19:12 ` [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64 H. Peter Anvin
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:12 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

This patchset addresses several inconsistencies in the handling of
system call numbers in x86-64 (and x32).

Right now, *some* code will treat e.g. 0x00000001_00000001 as a system
call and some will not. Some of the code, notably in ptrace and
seccomp, will treat 0x00000001_ffffffff as a system call and some will
not.
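
To make the ambiguity concrete, here is a minimal user-space sketch (not
kernel code; the helper names nr_as_u64() and nr_as_int() are hypothetical)
of the two views of the same register value. The int view matches what
syscall_get_nr() is described as doing further down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: treat the raw register as a full 64-bit number. */
static inline uint64_t nr_as_u64(uint64_t rax)
{
	return rax;
}

/*
 * Hypothetical helper: keep only the low 32 bits and read them as a
 * signed int. The unsigned-to-int conversion relies on the usual
 * two's-complement behavior of gcc and clang.
 */
static inline int nr_as_int(uint64_t rax)
{
	return (int)(uint32_t)rax;
}
```

With these helpers, 0x00000001_00000001 is system call 1 in the int view
but a huge out-of-range number in the 64-bit view, and
0x00000001_ffffffff is -1 in the int view.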

Furthermore, right now, e.g. 335 for x86-64 will force the exit code
to be set to -ENOSYS even if poked by ptrace, but 548 will not,
because there is an observable difference between a system call
number that corresponds to a hole in the table and one that falls
outside the range of the tables.

Both of these issues are visible to the user; for example the
syscall_numbering_64 kernel selftest fails if run under ptrace for
this reason (system calls succeed with the high bits set, whereas they
fail when not being traced.)

The architecture-independent code in Linux expects "int" for the
system call number, per the API documented, but not implemented, in
<asm-generic/syscall.h>: system call numbers are expected to be
"int", with -1 as the only non-system-call sentinel.

Treating the same data in multiple ways in different contexts is at the
very best confusing, but it also has the potential to cause security
problems (no such security problems are known at this time, however.)

This is an ABI change, but it is in fact a return to the original
x86-64 ABI: the original assembly entry code would zero-extend the
system call number passed and only the bottom 32 bits were examined.

1. Consistently treat the system call number as a signed int. This is
   what syscall_get_nr() already does, and therefore what all
   architecture-independent code (e.g. seccomp) already expects.

2. As per the defined semantics of syscall_get_nr(), only the value -1
   is defined as a non-system call, so comparing >= 0 is
   incorrect. Change to != -1.

3. Call sys_ni_syscall() for system calls which are out of range
   (except for -1, which is used by ptrace and seccomp as a "skip
   system call" marker), just as for system call numbers that
   correspond to holes in the table.

4. Update and extend the syscall_numbering_64 selftest, including
   testing the system call numbering when running under ptrace.
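
Items 1-3 above can be modelled in user space roughly as follows. This is
only a sketch under stated assumptions, not the code in
arch/x86/entry/common.c: the table, its size, and every model_* name are
hypothetical, and regs_ax stands in for the preloaded return value that a
tracer may have modified.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_NR_MAX 8			/* hypothetical table size */

typedef long (*model_call_t)(void);

static long model_getpid(void)
{
	return 1234;			/* arbitrary stand-in result */
}

/* Hypothetical table; unset entries are holes (NULL). */
static const model_call_t model_table[MODEL_NR_MAX] = {
	[3] = model_getpid,
};

/* Stand-in for sys_ni_syscall(). */
static long model_ni_syscall(void)
{
	return -ENOSYS;
}

static long model_dispatch(long long raw_nr, long regs_ax)
{
	/* Item 1: only the low 32 bits count, read as a signed int. */
	int nr = (int)(unsigned int)raw_nr;

	/* Item 2: -1 is the only "not a system call" value. */
	if (nr == -1)
		return regs_ax;

	/* Item 3: out-of-range numbers behave exactly like table holes. */
	if (nr < 0 || nr >= MODEL_NR_MAX || !model_table[nr])
		return model_ni_syscall();

	return model_table[nr]();
}
```

In this model, model_dispatch(0x100000003LL, -ENOSYS) reaches the same
entry as model_dispatch(3, -ENOSYS), and both a hole (5) and an
out-of-range number (100 or -2) yield -ENOSYS regardless of regs_ax.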
   
Changes from v3:

* Reorganize the patchset to have the selftest change first.
* Add tests running under ptrace to selftest.

Changes from v2:

* Factor out and split what was a single patch in the v2 patchset; the
  rest of the patches have already been applied.
* Fix the syscall_numbering_64 selftest to match the definition
  changes, make its output more informative, and extend it to more
  tests. Avoid using the glibc syscall() wrapper to make sure we test
  what we think we are testing.
* Better documentation of the changes.

Changes from v1:

* Only -1 should be a non-system call per the cross-architectural
  definition of sys_ni_syscall().
* Fix/improve patch descriptions.

--- 
 arch/x86/entry/common.c                         |  93 +++--
 arch/x86/entry/entry_64.S                       |   2 +-
 arch/x86/include/asm/syscall.h                  |   2 +-
 tools/testing/selftests/x86/syscall_numbering.c | 488 +++++++++++++++++++++---
 4 files changed, 508 insertions(+), 77 deletions(-)


* [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
@ 2021-05-18 19:12 ` H. Peter Anvin
  2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Update and extend syscall_numbering_64 tip-bot2 for H. Peter Anvin (Intel)
  2021-05-18 19:12 ` [PATCH v4 2/6] x86/syscall: simplify message reporting in syscall_numbering.c H. Peter Anvin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:12 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Update the syscall_numbering_64 selftest to reflect that a system call
number is to be extended from 32 bits. Add a mix of tests for valid and
invalid system calls in 64-bit and x32 space.

Use an explicit system call instruction, because we cannot know if the
glibc syscall() wrapper intercepts instructions, extends the system
call number independently, or anything similar.

Use long long instead of long to make it possible to compile this test
on x32 as well as 64 bits.
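
For reference, the test builds each 64-bit value placed in %rax from a
32-bit "msb" half and a 32-bit "lsb" half. A minimal sketch of that
composition follows; compose_nr() is a hypothetical name, and this version
does the shift on unsigned values so that it does not depend on how the
compiler treats left shifts of negative signed integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: combine the upper and lower 32-bit halves into
 * the 64-bit value that would be loaded into %rax.
 */
static inline int64_t compose_nr(int msb, int lsb)
{
	uint64_t hi = (uint64_t)(uint32_t)msb << 32;
	uint64_t lo = (uint32_t)lsb;

	return (int64_t)(hi | lo);
}
```

For example, compose_nr(1, 1) gives 0x00000001_00000001 and
compose_nr(1, -1) gives 0x00000001_ffffffff, the two values discussed in
the cover letter.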

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 .../testing/selftests/x86/syscall_numbering.c | 274 ++++++++++++++----
 1 file changed, 222 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index d6b09cb1aa2c..7dd86bcbee25 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * syscall_arg_fault.c - tests faults 32-bit fast syscall stack args
+ * syscall_numbering.c - test calling the x86-64 kernel with various
+ * valid and invalid system call numbers.
+ *
  * Copyright (c) 2018 Andrew Lutomirski
  */
 
@@ -11,79 +13,247 @@
 #include <stdbool.h>
 #include <errno.h>
 #include <unistd.h>
-#include <syscall.h>
+#include <string.h>
+#include <fcntl.h>
+#include <limits.h>
 
-static int nerrs;
+/* Common system call numbers */
+#define SYS_READ	  0
+#define SYS_WRITE	  1
+#define SYS_GETPID	 39
+/* x64-only system call numbers */
+#define X64_IOCTL	 16
+#define X64_READV	 19
+#define X64_WRITEV	 20
+/* x32-only system call numbers (without X32_BIT) */
+#define X32_IOCTL	514
+#define X32_READV	515
+#define X32_WRITEV	516
 
-#define X32_BIT 0x40000000UL
+#define X32_BIT 0x40000000
 
-static void check_enosys(unsigned long nr, bool *ok)
+static unsigned int nerr = 0;	/* Cumulative error count */
+static int nullfd = -1;		/* File descriptor for /dev/null */
+
+/*
+ * Directly invokes the given syscall with nullfd as the first argument
+ * and the rest zero. Avoids involving glibc wrappers in case they ever
+ * end up intercepting some system calls for some reason, or modify
+ * the system call number itself.
+ */
+static inline long long probe_syscall(int msb, int lsb)
 {
-	/* If this fails, a segfault is reasonably likely. */
-	fflush(stdout);
-
-	long ret = syscall(nr, 0, 0, 0, 0, 0, 0);
-	if (ret == 0) {
-		printf("[FAIL]\tsyscall %lu succeeded, but it should have failed\n", nr);
-		*ok = false;
-	} else if (errno != ENOSYS) {
-		printf("[FAIL]\tsyscall %lu had error code %d, but it should have reported ENOSYS\n", nr, errno);
-		*ok = false;
-	}
+	register long long arg1 asm("rdi") = nullfd;
+	register long long arg2 asm("rsi") = 0;
+	register long long arg3 asm("rdx") = 0;
+	register long long arg4 asm("r10") = 0;
+	register long long arg5 asm("r8")  = 0;
+	register long long arg6 asm("r9")  = 0;
+	long long nr = ((long long)msb << 32) | (unsigned int)lsb;
+	long long ret;
+
+	asm volatile("syscall"
+		     : "=a" (ret)
+		     : "a" (nr), "r" (arg1), "r" (arg2), "r" (arg3),
+		       "r" (arg4), "r" (arg5), "r" (arg6)
+		     : "rcx", "r11", "memory", "cc");
+
+	return ret;
 }
 
-static void test_x32_without_x32_bit(void)
+static const char *syscall_str(int msb, int start, int end)
 {
-	bool ok = true;
+	static char buf[64];
+	const char * const type = (start & X32_BIT) ? "x32" : "x64";
+	int lsb = start;
 
 	/*
-	 * Syscalls 512-547 are "x32" syscalls.  They are intended to be
-	 * called with the x32 (0x40000000) bit set.  Calling them without
-	 * the x32 bit set is nonsense and should not work.
+	 * Improve readability by stripping the x32 bit, but round
+	 * toward zero so we don't display -1 as -1073741825.
 	 */
-	printf("[RUN]\tChecking syscalls 512-547\n");
-	for (int i = 512; i <= 547; i++)
-		check_enosys(i, &ok);
+	if (lsb < 0)
+		lsb |= X32_BIT;
+	else
+		lsb &= ~X32_BIT;
+
+	if (start == end)
+		snprintf(buf, sizeof buf, "%s syscall %d:%d",
+			 type, msb, lsb);
+	else
+		snprintf(buf, sizeof buf, "%s syscalls %d:%d..%d",
+			 type, msb, lsb, lsb + (end-start));
+
+	return buf;
+}
+
+static unsigned int _check_for(int msb, int start, int end, long long expect,
+			       const char *expect_str)
+{
+	unsigned int err = 0;
+
+	for (int nr = start; nr <= end; nr++) {
+		long long ret = probe_syscall(msb, nr);
+
+		if (ret != expect) {
+			printf("[FAIL]\t      %s returned %lld, but it should have returned %s\n",
+			       syscall_str(msb, nr, nr),
+			       ret, expect_str);
+			err++;
+		}
+	}
 
+	if (err) {
+		nerr += err;
+		if (start != end)
+			printf("[FAIL]\t      %s had %u failure%s\n",
+			       syscall_str(msb, start, end),
+			       err, (err == 1) ? "s" : "");
+	} else {
+		printf("[OK]\t      %s returned %s as expected\n",
+		       syscall_str(msb, start, end), expect_str);
+	}
+
+	return err;
+}
+
+#define check_for(msb,start,end,expect) \
+	_check_for(msb,start,end,expect,#expect)
+
+static bool check_zero(int msb, int nr)
+{
+	return check_for(msb, nr, nr, 0);
+}
+
+static bool check_enosys(int msb, int nr)
+{
+	return check_for(msb, nr, nr, -ENOSYS);
+}
+
+/*
+ * Anyone diagnosing a failure will want to know whether the kernel
+ * supports x32. Tell them. This can also be used to conditionalize
+ * tests based on existence or nonexistence of x32.
+ */
+static bool test_x32(void)
+{
+	long long ret;
+	long long mypid = getpid();
+
+	printf("[RUN]\tChecking for x32 by calling x32 getpid()\n");
+	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
+
+	if (ret == mypid) {
+		printf("[INFO]\t   x32 is supported\n");
+		return true;
+	} else if (ret == -ENOSYS) {
+		printf("[INFO]\t   x32 is not supported\n");
+		return false;
+	} else {
+		printf("[FAIL]\t   x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		nerr++;
+		return true;	/* Proceed as if... */
+	}
+}
+
+static void test_syscalls_common(int msb)
+{
+	printf("[RUN]\t   Checking some common syscalls as 64 bit\n");
+	check_zero(msb, SYS_READ);
+	check_zero(msb, SYS_WRITE);
+
+	printf("[RUN]\t   Checking some 64-bit only syscalls as 64 bit\n");
+	check_zero(msb, X64_READV);
+	check_zero(msb, X64_WRITEV);
+
+	printf("[RUN]\t   Checking out of range system calls\n");
+	check_for(msb, -64, -1, -ENOSYS);
+	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
+	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
+	check_for(msb, INT_MAX-64, INT_MAX-1, -ENOSYS);
+}
+
+static void test_syscalls_with_x32(int msb)
+{
 	/*
-	 * Check that a handful of 64-bit-only syscalls are rejected if the x32
-	 * bit is set.
+	 * Syscalls 512-547 are "x32" syscalls.  They are
+	 * intended to be called with the x32 (0x40000000) bit
+	 * set.  Calling them without the x32 bit set is
+	 * nonsense and should not work.
 	 */
-	printf("[RUN]\tChecking some 64-bit syscalls in x32 range\n");
-	check_enosys(16 | X32_BIT, &ok);	/* ioctl */
-	check_enosys(19 | X32_BIT, &ok);	/* readv */
-	check_enosys(20 | X32_BIT, &ok);	/* writev */
+	printf("[RUN]\t   Checking x32 syscalls as 64 bit\n");
+	check_for(msb, 512, 547, -ENOSYS);
+
+	printf("[RUN]\t   Checking some common syscalls as x32\n");
+	check_zero(msb, SYS_READ   | X32_BIT);
+	check_zero(msb, SYS_WRITE  | X32_BIT);
+
+	printf("[RUN]\t   Checking some x32 syscalls as x32\n");
+	check_zero(msb, X32_READV  | X32_BIT);
+	check_zero(msb, X32_WRITEV | X32_BIT);
+
+	printf("[RUN]\t   Checking some 64-bit syscalls as x32\n");
+	check_enosys(msb, X64_IOCTL  | X32_BIT);
+	check_enosys(msb, X64_READV  | X32_BIT);
+	check_enosys(msb, X64_WRITEV | X32_BIT);
+}
+
+static void test_syscalls_without_x32(int msb)
+{
+	printf("[RUN]\t  Checking for absence of x32 system calls\n");
+	check_for(msb, 0 | X32_BIT, 999 | X32_BIT, -ENOSYS);
+}
+
+static void test_syscall_numbering(void)
+{
+	static const int msbs[] = {
+		0, 1, -1, X32_BIT-1, X32_BIT, X32_BIT-1, -X32_BIT, INT_MAX,
+		INT_MIN, INT_MIN+1
+	};
+	bool with_x32 = test_x32();
 
 	/*
-	 * Check some syscalls with high bits set.
+	 * The MSB is supposed to be ignored, so we loop over a few
+	 * to test that out.
 	 */
-	printf("[RUN]\tChecking numbers above 2^32-1\n");
-	check_enosys((1UL << 32), &ok);
-	check_enosys(X32_BIT | (1UL << 32), &ok);
+	for (size_t i = 0; i < sizeof(msbs)/sizeof(msbs[0]); i++) {
+		int msb = msbs[i];
+		printf("[RUN]\tChecking system calls with msb = %d (0x%x)\n",
+		       msb, msb);
 
-	if (!ok)
-		nerrs++;
-	else
-		printf("[OK]\tThey all returned -ENOSYS\n");
+		test_syscalls_common(msb);
+		if (with_x32)
+			test_syscalls_with_x32(msb);
+		else
+			test_syscalls_without_x32(msb);
+	}
 }
 
-int main()
+int main(void)
 {
 	/*
-	 * Anyone diagnosing a failure will want to know whether the kernel
-	 * supports x32.  Tell them.
+	 * It is quite likely to get a segfault on a failure, so make
+	 * sure the message gets out by setting stdout to nonbuffered.
 	 */
-	printf("\tChecking for x32...");
-	fflush(stdout);
-	if (syscall(39 | X32_BIT, 0, 0, 0, 0, 0, 0) >= 0) {
-		printf(" supported\n");
-	} else if (errno == ENOSYS) {
-		printf(" not supported\n");
-	} else {
-		printf(" confused\n");
-	}
+	setvbuf(stdout, NULL, _IONBF, 0);
 
-	test_x32_without_x32_bit();
+	/*
+	 * Harmless file descriptor to work on...
+	 */
+	nullfd = open("/dev/null", O_RDWR);
+	if (nullfd < 0) {
+		printf("[FAIL]\tUnable to open /dev/null: %s\n",
+		       strerror(errno));
+		printf("[SKIP]\tCannot execute test\n");
+		return 71;	/* EX_OSERR */
+	}
 
-	return nerrs ? 1 : 0;
+	test_syscall_numbering();
+	if (!nerr) {
+		printf("[OK]\tAll system calls succeeded or failed as expected\n");
+		return 0;
+	} else {
+		printf("[FAIL]\tA total of %u system call%s had incorrect behavior\n",
+		       nerr, nerr != 1 ? "s" : "");
+		return 1;
+	}
 }
-- 
2.31.1



* [PATCH v4 2/6] x86/syscall: simplify message reporting in syscall_numbering.c
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
  2021-05-18 19:12 ` [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64 H. Peter Anvin
@ 2021-05-18 19:12 ` H. Peter Anvin
  2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Simplify message reporting in syscall_numbering tip-bot2 for H. Peter Anvin (Intel)
  2021-05-18 19:13 ` [PATCH v4 3/6] x86/syscall: add tests under ptrace to syscall_numbering.c H. Peter Anvin
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:12 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Reduce some boilerplate in printing and indenting messages in
syscall_numbering.c. This makes it easier to produce clean status
output.
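
The indentation scheme relies on printf's "%-*s" conversion: the level tag
is left-justified in a field whose width grows by four columns per indent
level. A small stand-alone sketch of the idea, writing into a buffer so the
result can be inspected (format_msg() is a hypothetical helper, not part of
the patch):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format "tag" left-justified in a field of
 * 8 + indent*4 columns, followed by the message text.
 */
static int format_msg(char *buf, size_t size, unsigned int indent,
		      const char *tag, const char *text)
{
	int width = 8 + (int)indent * 4;

	return snprintf(buf, size, "%-*s%s", width, tag, text);
}
```

At indent 0 the text starts in column 8, matching the old "[RUN]\t"
layout on a terminal with 8-column tab stops; each extra level shifts it
right by four columns.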

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 .../testing/selftests/x86/syscall_numbering.c | 102 ++++++++++++------
 1 file changed, 71 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index 7dd86bcbee25..03915cd48cfc 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -34,6 +34,33 @@
 
 static unsigned int nerr = 0;	/* Cumulative error count */
 static int nullfd = -1;		/* File descriptor for /dev/null */
+static int indent = 0;
+
+static inline unsigned int offset(void)
+{
+	return 8+indent*4;
+}
+
+#define msg(lvl, fmt, ...) printf("%-*s" fmt, offset(), "[" #lvl "]", \
+                                 ## __VA_ARGS__)
+
+#define run(fmt, ...)  msg(RUN,  fmt, ## __VA_ARGS__)
+#define info(fmt, ...) msg(INFO, fmt, ## __VA_ARGS__)
+#define ok(fmt, ...)   msg(OK,   fmt, ## __VA_ARGS__)
+
+#define fail(fmt, ...)					\
+	do {						\
+		msg(FAIL, fmt, ## __VA_ARGS__);		\
+		nerr++;					\
+	} while (0)
+
+#define crit(fmt, ...)					\
+	do {						\
+		indent = 0;				\
+		msg(FAIL, fmt, ## __VA_ARGS__);		\
+		msg(SKIP, "Unable to run test\n");	\
+		exit(71); /* EX_OSERR */		\
+	} while (0)
 
 /*
  * Directly invokes the given syscall with nullfd as the first argument
@@ -91,28 +118,37 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 {
 	unsigned int err = 0;
 
+	indent++;
+	if (start != end)
+		indent++;
+
 	for (int nr = start; nr <= end; nr++) {
 		long long ret = probe_syscall(msb, nr);
 
 		if (ret != expect) {
-			printf("[FAIL]\t      %s returned %lld, but it should have returned %s\n",
+			fail("%s returned %lld, but it should have returned %s\n",
 			       syscall_str(msb, nr, nr),
 			       ret, expect_str);
 			err++;
 		}
 	}
 
+	if (start != end)
+		indent--;
+
 	if (err) {
 		nerr += err;
 		if (start != end)
-			printf("[FAIL]\t      %s had %u failure%s\n",
+			fail("%s had %u failure%s\n",
 			       syscall_str(msb, start, end),
-			       err, (err == 1) ? "s" : "");
+			       err, err == 1 ? "s" : "");
 	} else {
-		printf("[OK]\t      %s returned %s as expected\n",
-		       syscall_str(msb, start, end), expect_str);
+		ok("%s returned %s as expected\n",
+		   syscall_str(msb, start, end), expect_str);
 	}
 
+	indent--;
+
 	return err;
 }
 
@@ -137,35 +173,38 @@ static bool check_enosys(int msb, int nr)
 static bool test_x32(void)
 {
 	long long ret;
-	long long mypid = getpid();
+	pid_t mypid = getpid();
+	bool with_x32;
 
-	printf("[RUN]\tChecking for x32 by calling x32 getpid()\n");
+	run("Checking for x32 by calling x32 getpid()\n");
 	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
 
+	indent++;
 	if (ret == mypid) {
-		printf("[INFO]\t   x32 is supported\n");
-		return true;
+		info("x32 is supported\n");
+		with_x32 = true;
 	} else if (ret == -ENOSYS) {
-		printf("[INFO]\t   x32 is not supported\n");
-		return false;
+		info("x32 is not supported\n");
+		with_x32 = false;
 	} else {
-		printf("[FAIL]\t   x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
-		nerr++;
-		return true;	/* Proceed as if... */
+		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		with_x32 = false;
 	}
+	indent--;
+	return with_x32;
 }
 
 static void test_syscalls_common(int msb)
 {
-	printf("[RUN]\t   Checking some common syscalls as 64 bit\n");
+	run("Checking some common syscalls as 64 bit\n");
 	check_zero(msb, SYS_READ);
 	check_zero(msb, SYS_WRITE);
 
-	printf("[RUN]\t   Checking some 64-bit only syscalls as 64 bit\n");
+	run("Checking some 64-bit only syscalls as 64 bit\n");
 	check_zero(msb, X64_READV);
 	check_zero(msb, X64_WRITEV);
 
-	printf("[RUN]\t   Checking out of range system calls\n");
+	run("Checking out of range system calls\n");
 	check_for(msb, -64, -1, -ENOSYS);
 	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
 	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
@@ -180,18 +219,18 @@ static void test_syscalls_with_x32(int msb)
 	 * set.  Calling them without the x32 bit set is
 	 * nonsense and should not work.
 	 */
-	printf("[RUN]\t   Checking x32 syscalls as 64 bit\n");
+	run("Checking x32 syscalls as 64 bit\n");
 	check_for(msb, 512, 547, -ENOSYS);
 
-	printf("[RUN]\t   Checking some common syscalls as x32\n");
+	run("Checking some common syscalls as x32\n");
 	check_zero(msb, SYS_READ   | X32_BIT);
 	check_zero(msb, SYS_WRITE  | X32_BIT);
 
-	printf("[RUN]\t   Checking some x32 syscalls as x32\n");
+	run("Checking some x32 syscalls as x32\n");
 	check_zero(msb, X32_READV  | X32_BIT);
 	check_zero(msb, X32_WRITEV | X32_BIT);
 
-	printf("[RUN]\t   Checking some 64-bit syscalls as x32\n");
+	run("Checking some 64-bit syscalls as x32\n");
 	check_enosys(msb, X64_IOCTL  | X32_BIT);
 	check_enosys(msb, X64_READV  | X32_BIT);
 	check_enosys(msb, X64_WRITEV | X32_BIT);
@@ -199,7 +238,7 @@ static void test_syscalls_with_x32(int msb)
 
 static void test_syscalls_without_x32(int msb)
 {
-	printf("[RUN]\t  Checking for absence of x32 system calls\n");
+	run("Checking for absence of x32 system calls\n");
 	check_for(msb, 0 | X32_BIT, 999 | X32_BIT, -ENOSYS);
 }
 
@@ -217,14 +256,18 @@ static void test_syscall_numbering(void)
 	 */
 	for (size_t i = 0; i < sizeof(msbs)/sizeof(msbs[0]); i++) {
 		int msb = msbs[i];
-		printf("[RUN]\tChecking system calls with msb = %d (0x%x)\n",
-		       msb, msb);
+		run("Checking system calls with msb = %d (0x%x)\n",
+		    msb, msb);
+
+		indent++;
 
 		test_syscalls_common(msb);
 		if (with_x32)
 			test_syscalls_with_x32(msb);
 		else
 			test_syscalls_without_x32(msb);
+
+		indent--;
 	}
 }
 
@@ -241,19 +284,16 @@ int main(void)
 	 */
 	nullfd = open("/dev/null", O_RDWR);
 	if (nullfd < 0) {
-		printf("[FAIL]\tUnable to open /dev/null: %s\n",
-		       strerror(errno));
-		printf("[SKIP]\tCannot execute test\n");
-		return 71;	/* EX_OSERR */
+		crit("Unable to open /dev/null: %s\n", strerror(errno));
 	}
 
 	test_syscall_numbering();
 	if (!nerr) {
-		printf("[OK]\tAll system calls succeeded or failed as expected\n");
+		ok("All system calls succeeded or failed as expected\n");
 		return 0;
 	} else {
-		printf("[FAIL]\tA total of %u system call%s had incorrect behavior\n",
-		       nerr, nerr != 1 ? "s" : "");
+		fail("A total of %u system call%s had incorrect behavior\n",
+		     nerr, nerr != 1 ? "s" : "");
 		return 1;
 	}
 }
-- 
2.31.1



* [PATCH v4 3/6] x86/syscall: add tests under ptrace to syscall_numbering.c
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
  2021-05-18 19:12 ` [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64 H. Peter Anvin
  2021-05-18 19:12 ` [PATCH v4 2/6] x86/syscall: simplify message reporting in syscall_numbering.c H. Peter Anvin
@ 2021-05-18 19:13 ` H. Peter Anvin
  2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64 tip-bot2 for H. Peter Anvin (Intel)
  2021-05-18 19:13 ` [PATCH v4 4/6] x86/syscall: sign-extend system calls on entry to int H. Peter Anvin
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:13 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add tests running under ptrace for syscall_numbering_64. ptrace
stopping on syscall entry and possibly modifying the syscall number
(regs.orig_rax) or the default return value (regs.rax) can have
different results than the normal system call path.
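
Two of the passes rewrite the upper half of regs.orig_rax while leaving the
low 32 bits alone. The arithmetic can be sketched in isolation as below;
the function names are hypothetical, and the casts assume the usual
two's-complement conversions of gcc and clang.

```c
#include <assert.h>
#include <stdint.h>

/* "Clobbering the top 32 bits": force the upper half to all ones. */
static inline uint64_t fuzz_high(uint64_t orig_rax)
{
	return orig_rax | 0xffffffff00000000ULL;
}

/* "Sign-extending the syscall number": widen the low 32 bits as an int. */
static inline uint64_t sign_extend_nr(uint64_t orig_rax)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)orig_rax;
}

/* The number as seen once only the low 32 bits are taken as an int. */
static inline int nr_seen(uint64_t orig_rax)
{
	return (int32_t)(uint32_t)orig_rax;
}
```

Neither transformation changes nr_seen(), which is why a kernel that
consistently uses int for the system call number should give the same
results for these passes as for an unmodified number.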

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 .../testing/selftests/x86/syscall_numbering.c | 244 +++++++++++++++---
 1 file changed, 212 insertions(+), 32 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index 03915cd48cfc..ef618f5ffb3b 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -16,6 +16,13 @@
 #include <string.h>
 #include <fcntl.h>
 #include <limits.h>
+#include <signal.h>
+#include <sys/ptrace.h>
+#include <sys/user.h>
+#include <sys/wait.h>
+#include <sys/mman.h>
+
+#include <linux/ptrace.h>
 
 /* Common system call numbers */
 #define SYS_READ	  0
@@ -32,13 +39,44 @@
 
 #define X32_BIT 0x40000000
 
-static unsigned int nerr = 0;	/* Cumulative error count */
 static int nullfd = -1;		/* File descriptor for /dev/null */
-static int indent = 0;
+static bool with_x32;		/* x32 supported on this kernel? */
+
+enum ptrace_pass {
+	PTP_NOTHING,
+	PTP_GETREGS,
+	PTP_WRITEBACK,
+	PTP_FUZZRET,
+	PTP_FUZZHIGH,
+	PTP_INTNUM,
+	PTP_DONE
+};
+
+static const char * const ptrace_pass_name[] =
+{
+	[PTP_NOTHING]	= "just stop, no data read",
+	[PTP_GETREGS]	= "only getregs",
+	[PTP_WRITEBACK]	= "getregs, unmodified setregs",
+	[PTP_FUZZRET]	= "modifying the default return",
+	[PTP_FUZZHIGH]	= "clobbering the top 32 bits",
+	[PTP_INTNUM]	= "sign-extending the syscall number",
+};
+
+/*
+ * Shared memory block between tracer and test
+ */
+struct shared {
+	unsigned int nerr;	/* Total error count */
+	unsigned int indent;	/* Message indentation level */
+	enum ptrace_pass ptrace_pass;
+	bool probing_syscall;	/* In probe_syscall() */
+};
+static volatile struct shared *sh;
 
 static inline unsigned int offset(void)
 {
-	return 8+indent*4;
+	unsigned int level = sh ? sh->indent : 0;
+	return 8+level*4;
 }
 
 #define msg(lvl, fmt, ...) printf("%-*s" fmt, offset(), "[" #lvl "]", \
@@ -48,19 +86,22 @@ static inline unsigned int offset(void)
 #define info(fmt, ...) msg(INFO, fmt, ## __VA_ARGS__)
 #define ok(fmt, ...)   msg(OK,   fmt, ## __VA_ARGS__)
 
-#define fail(fmt, ...)					\
-	do {						\
-		msg(FAIL, fmt, ## __VA_ARGS__);		\
-		nerr++;					\
-	} while (0)
+#define fail(fmt, ...)                                 \
+       do {                                            \
+               msg(FAIL, fmt, ## __VA_ARGS__);         \
+               sh->nerr++;                             \
+       } while (0)
+
+#define crit(fmt, ...)				       \
+       do {                                            \
+               sh->indent = 0;                         \
+               msg(FAIL, fmt, ## __VA_ARGS__);         \
+               msg(SKIP, "Unable to run test\n");      \
+               exit(71); /* EX_OSERR */                \
+       } while (0)
 
-#define crit(fmt, ...)					\
-	do {						\
-		indent = 0;				\
-		msg(FAIL, fmt, ## __VA_ARGS__);		\
-		msg(SKIP, "Unable to run test\n");	\
-		exit(71); /* EX_OSERR */		\
-	} while (0)
+/* Sentinel for ptrace-modified return value */
+#define MODIFIED_BY_PTRACE	-9999
 
 /*
  * Directly invokes the given syscall with nullfd as the first argument
@@ -68,7 +109,7 @@ static inline unsigned int offset(void)
  * end up intercepting some system calls for some reason, or modify
  * the system call number itself.
  */
-static inline long long probe_syscall(int msb, int lsb)
+static long long probe_syscall(int msb, int lsb)
 {
 	register long long arg1 asm("rdi") = nullfd;
 	register long long arg2 asm("rsi") = 0;
@@ -79,11 +120,21 @@ static inline long long probe_syscall(int msb, int lsb)
 	long long nr = ((long long)msb << 32) | (unsigned int)lsb;
 	long long ret;
 
+	/*
+	 * We pass in an extra copy of the extended system call number
+	 * in %rbx, so we can examine it from the ptrace handler without
+	 * worrying about it being possibly modified. This is to test
+	 * the validity of struct user regs.orig_rax a.k.a.
+	 * struct pt_regs.orig_ax.
+	 */
+	sh->probing_syscall = true;
 	asm volatile("syscall"
 		     : "=a" (ret)
-		     : "a" (nr), "r" (arg1), "r" (arg2), "r" (arg3),
+		     : "a" (nr), "b" (nr),
+		       "r" (arg1), "r" (arg2), "r" (arg3),
 		       "r" (arg4), "r" (arg5), "r" (arg6)
 		     : "rcx", "r11", "memory", "cc");
+	sh->probing_syscall = false;
 
 	return ret;
 }
@@ -118,9 +169,9 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 {
 	unsigned int err = 0;
 
-	indent++;
+	sh->indent++;
 	if (start != end)
-		indent++;
+		sh->indent++;
 
 	for (int nr = start; nr <= end; nr++) {
 		long long ret = probe_syscall(msb, nr);
@@ -134,20 +185,19 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 	}
 
 	if (start != end)
-		indent--;
+		sh->indent--;
 
 	if (err) {
-		nerr += err;
 		if (start != end)
 			fail("%s had %u failure%s\n",
-			       syscall_str(msb, start, end),
-			       err, err == 1 ? "s" : "");
+			     syscall_str(msb, start, end),
+			     err, err == 1 ? "" : "s");
 	} else {
 		ok("%s returned %s as expected\n",
 		   syscall_str(msb, start, end), expect_str);
 	}
 
-	indent--;
+	sh->indent--;
 
 	return err;
 }
@@ -174,12 +224,11 @@ static bool test_x32(void)
 {
 	long long ret;
 	pid_t mypid = getpid();
-	bool with_x32;
 
 	run("Checking for x32 by calling x32 getpid()\n");
 	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
 
-	indent++;
+	sh->indent++;
 	if (ret == mypid) {
 		info("x32 is supported\n");
 		with_x32 = true;
@@ -187,15 +236,17 @@ static bool test_x32(void)
 		info("x32 is not supported\n");
 		with_x32 = false;
 	} else {
-		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, (long long)mypid);
 		with_x32 = false;
 	}
-	indent--;
+	sh->indent--;
 	return with_x32;
 }
 
 static void test_syscalls_common(int msb)
 {
+	enum ptrace_pass pass = sh->ptrace_pass;
+
 	run("Checking some common syscalls as 64 bit\n");
 	check_zero(msb, SYS_READ);
 	check_zero(msb, SYS_WRITE);
@@ -205,7 +256,11 @@ static void test_syscalls_common(int msb)
 	check_zero(msb, X64_WRITEV);
 
 	run("Checking out of range system calls\n");
-	check_for(msb, -64, -1, -ENOSYS);
+	check_for(msb, -64, -2, -ENOSYS);
+	if (pass >= PTP_FUZZRET)
+		check_for(msb, -1, -1, MODIFIED_BY_PTRACE);
+	else
+		check_for(msb, -1, -1, -ENOSYS);
 	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
 	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
 	check_for(msb, INT_MAX-64, INT_MAX-1, -ENOSYS);
@@ -248,7 +303,8 @@ static void test_syscall_numbering(void)
 		0, 1, -1, X32_BIT-1, X32_BIT, X32_BIT-1, -X32_BIT, INT_MAX,
 		INT_MIN, INT_MIN+1
 	};
-	bool with_x32 = test_x32();
+
+	sh->indent++;
 
 	/*
 	 * The MSB is supposed to be ignored, so we loop over a few
@@ -259,7 +315,7 @@ static void test_syscall_numbering(void)
 		run("Checking system calls with msb = %d (0x%x)\n",
 		    msb, msb);
 
-		indent++;
+		sh->indent++;
 
 		test_syscalls_common(msb);
 		if (with_x32)
@@ -267,12 +323,119 @@ static void test_syscall_numbering(void)
 		else
 			test_syscalls_without_x32(msb);
 
-		indent--;
+		sh->indent--;
+	}
+
+	sh->indent--;
+}
+
+static void syscall_numbering_tracee(void)
+{
+	enum ptrace_pass pass;
+
+	if (ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+		crit("Failed to request tracing\n");
+		return;
+	}
+	raise(SIGSTOP);
+
+	for (sh->ptrace_pass = pass = PTP_NOTHING; pass < PTP_DONE;
+	     sh->ptrace_pass = ++pass) {
+		run("Running tests under ptrace: %s\n", ptrace_pass_name[pass]);
+		test_syscall_numbering();
+	}
+}
+
+static void mess_with_syscall(pid_t testpid, enum ptrace_pass pass)
+{
+	struct user_regs_struct regs;
+
+	sh->probing_syscall = false; /* Do this on entry only */
+
+	/* For these, don't even getregs */
+	if (pass == PTP_NOTHING || pass == PTP_DONE)
+		return;
+
+	ptrace(PTRACE_GETREGS, testpid, NULL, &regs);
+
+	if (regs.orig_rax != regs.rbx) {
+		fail("orig_rax %#llx doesn't match syscall number %#llx\n",
+		     (unsigned long long)regs.orig_rax,
+		     (unsigned long long)regs.rbx);
+	}
+
+	switch (pass) {
+	case PTP_GETREGS:
+		/* Just read, no writeback */
+		return;
+	case PTP_WRITEBACK:
+		/* Write back the same register state verbatim */
+		break;
+	case PTP_FUZZRET:
+		regs.rax = MODIFIED_BY_PTRACE;
+		break;
+	case PTP_FUZZHIGH:
+		regs.rax = MODIFIED_BY_PTRACE;
+		regs.orig_rax = regs.orig_rax | 0xffffffff00000000ULL;
+		break;
+	case PTP_INTNUM:
+		regs.rax = MODIFIED_BY_PTRACE;
+		regs.orig_rax = (int)regs.orig_rax;
+		break;
+	default:
+		crit("invalid ptrace_pass\n");
+		break;
+	}
+
+	ptrace(PTRACE_SETREGS, testpid, NULL, &regs);
+}
+
+static void syscall_numbering_tracer(pid_t testpid)
+{
+	int wstatus;
+
+	do {
+		pid_t wpid = waitpid(testpid, &wstatus, 0);
+		if (wpid < 0 && errno != EINTR)
+			break;
+		if (wpid != testpid)
+			continue;
+		if (!WIFSTOPPED(wstatus))
+			break;	/* Thread exited? */
+
+		if (sh->probing_syscall && WSTOPSIG(wstatus) == SIGTRAP)
+			mess_with_syscall(testpid, sh->ptrace_pass);
+	} while (sh->ptrace_pass != PTP_DONE &&
+		 !ptrace(PTRACE_SYSCALL, testpid, NULL, NULL));
+
+	ptrace(PTRACE_DETACH, testpid, NULL, NULL);
+
+	/* Wait for the child process to terminate */
+	while (waitpid(testpid, &wstatus, 0) != testpid || !WIFEXITED(wstatus))
+		/* wait some more */;
+}
+
+static void test_traced_syscall_numbering(void)
+{
+	pid_t testpid;
+
+	/* Launch the test thread; this thread continues as the tracer thread */
+	testpid = fork();
+
+	if (testpid < 0) {
+		crit("Unable to launch tracer process\n");
+	} else if (testpid == 0) {
+		syscall_numbering_tracee();
+		_exit(0);
+	} else {
+		syscall_numbering_tracer(testpid);
 	}
 }
 
 int main(void)
 {
+	unsigned int nerr;
+
 	/*
 	 * It is quite likely to get a segfault on a failure, so make
 	 * sure the message gets out by setting stdout to nonbuffered.
@@ -287,7 +450,24 @@ int main(void)
 		crit("Unable to open /dev/null: %s\n", strerror(errno));
 	}
 
+	/*
+	 * Set up a block of shared memory...
+	 */
+	sh = mmap(NULL, sysconf(_SC_PAGE_SIZE), PROT_READ|PROT_WRITE,
+		  MAP_ANONYMOUS|MAP_SHARED, 0, 0);
+	if (sh == MAP_FAILED) {
+		crit("Unable to allocate shared memory block: %s\n",
+		     strerror(errno));
+	}
+
+	with_x32 = test_x32();
+
+	run("Running tests without ptrace...\n");
 	test_syscall_numbering();
+
+	test_traced_syscall_numbering();
+
+	nerr = sh->nerr;
 	if (!nerr) {
 		ok("All system calls succeeded or failed as expected\n");
 		return 0;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 4/6] x86/syscall: sign-extend system calls on entry to int
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
                   ` (2 preceding siblings ...)
  2021-05-18 19:13 ` [PATCH v4 3/6] x86/syscall: add tests under ptrace to syscall_numbering.c H. Peter Anvin
@ 2021-05-18 19:13 ` H. Peter Anvin
  2021-05-20 13:23   ` [tip: x86/entry] x86/entry/64: Sign-extend " tip-bot2 for H. Peter Anvin (Intel)
  2021-05-18 19:13 ` [PATCH v4 5/6] x86/syscall: treat out of range and gap system calls the same H. Peter Anvin
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:13 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Right now, *some* code will treat e.g. 0x0000000100000001 as a system
call and some will not. Some of the code, notably in ptrace, will
treat 0x000000018000000 as a system call and some will not. Finally,
right now, e.g. 335 for x86-64 will force the exit code to be set to
-ENOSYS even if poked by ptrace, but 548 will not, because there is an
observable difference between an out of range system call and a system
call number that falls outside the range of the table.

This is visible to the user: for example, the syscall_numbering_64
test fails if run under strace, because as strace uses ptrace, it ends
up clobbering the upper half of the 64-bit system call number.

The arch-independent code all assumes that a system call is "int" and
that the value -1 specifically, and not just any negative value, is
used for a non-system call. This is the case on x86 as well when
arch-independent code is involved. The arch-independent API is
defined/documented (but not *implemented*!) in
<asm-generic/syscall.h>.

This is an ABI change, but is in fact a revert to the original x86-64
ABI. The original assembly entry code would zero-extend the system
call number; this patch uses sign extend to be explicit that this is
treated as a signed number (although in practice it makes no
difference, of course) and to avoid people getting the idea of
"optimizing" it, as has happened on at least two(!) separate
occasions.

Do not store the extended value into regs->orig_ax, however: on
x86-64, the ABI is that the callee is responsible for extending
parameters, so only examining the lower 32 bits is fully consistent
with any "int" argument to any system call, e.g. regs->di for
write(2). The full value of %rax on entry to the kernel is thus still
available.
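
As a rough illustration (not part of this patch), the effect of the
movslq on the value handed to do_syscall_64() can be modelled in
user-space C. The helper name syscall_nr_from_rax() is hypothetical,
and the narrowing cast assumes the wrap-around conversion that gcc and
clang provide:

```c
#include <stdint.h>

/*
 * Hypothetical model of "movslq %eax, %rsi": discard the upper 32 bits
 * of the 64-bit register value, then sign-extend the lower 32 bits.
 * Assumption: uint32_t -> int32_t conversion wraps (true for gcc/clang).
 */
static inline int64_t syscall_nr_from_rax(uint64_t rax)
{
	uint32_t low = (uint32_t)rax;	/* lower half of %rax, i.e. %eax */

	return (int64_t)(int32_t)low;	/* sign-extend to 64 bits */
}
```

In this model 0x0000000100000001 yields 1 and 0x00000001ffffffff yields
-1, so the upper half of the register no longer affects which system
call is selected.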

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 arch/x86/entry/entry_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1d9db15fdc69..85f04ea0e368 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -108,7 +108,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
 
 	/* IRQs are off. */
 	movq	%rsp, %rdi
-	movq	%rax, %rsi
+	movslq	%eax, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 	/*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 5/6] x86/syscall: treat out of range and gap system calls the same
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
                   ` (3 preceding siblings ...)
  2021-05-18 19:13 ` [PATCH v4 4/6] x86/syscall: sign-extend system calls on entry to int H. Peter Anvin
@ 2021-05-18 19:13 ` H. Peter Anvin
  2021-05-20 13:23   ` [tip: x86/entry] x86/entry: Treat " tip-bot2 for H. Peter Anvin (Intel)
  2021-05-18 19:13 ` [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers H. Peter Anvin
  2021-05-19 11:29 ` [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls Ingo Molnar
  6 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:13 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

The current 64-bit system call entry code treats out-of-range system
calls differently than system calls that map to a hole in the system
call table. This is visible to the user if system calls are
intercepted via ptrace or seccomp and the return value (regs->ax) is
modified: in the former case, the return value is preserved, and in
the latter case, sys_ni_syscall() is called and the return value is
forced to -ENOSYS.

The API spec in <asm-generic/syscall.h> is very clear that only
(int)-1 is the non-system-call sentinel value, so make the system call
behavior consistent by calling sys_ni_syscall() for all invalid system
call numbers except for -1.

Although currently sys_ni_syscall() simply returns -ENOSYS, calling it
explicitly is friendly for tracing and future possible extensions, and
as this is an error path there is no reason to optimize it.
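
The intended behavior can be sketched in user-space C as follows. This
is only a model under stated assumptions: demo_table, demo_dispatch()
and the table size are hypothetical, and ax_in stands for whatever a
tracer or seccomp filter left in regs->ax before dispatch:

```c
#include <errno.h>

#define DEMO_NR_SYSCALLS 4	/* hypothetical table size */

/* Hypothetical handlers: a working call and the "not implemented" stub */
static long demo_ok(void) { return 0; }
static long demo_ni(void) { return -ENOSYS; }

/* Hypothetical table with a hole (demo_ni) at index 2 */
static long (*const demo_table[DEMO_NR_SYSCALLS])(void) = {
	demo_ok, demo_ok, demo_ni, demo_ok,
};

/* Returns the value that would end up in regs->ax */
static long demo_dispatch(int nr, long ax_in)
{
	unsigned int unr = nr;	/* negative numbers become out of range */

	if (unr < DEMO_NR_SYSCALLS)
		return demo_table[unr]();	/* in range, holes included */
	if (nr != -1)
		return demo_ni();		/* out of range: same as a hole */
	return ax_in;				/* -1: not a system call at all */
}
```

Only nr == -1 leaves the poked return value alone; a hole in the table
and an out-of-range number both end up at -ENOSYS.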

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 arch/x86/entry/common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 00da0f5420de..f51bc17262db 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -52,6 +52,8 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
 					X32_NR_syscalls);
 		regs->ax = x32_sys_call_table[nr](regs);
 #endif
+	} else if (unlikely((int)nr != -1)) {
+		regs->ax = __x64_sys_ni_syscall(regs);
 	}
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
@@ -76,6 +78,8 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
 	if (likely(nr < IA32_NR_syscalls)) {
 		nr = array_index_nospec(nr, IA32_NR_syscalls);
 		regs->ax = ia32_sys_call_table[nr](regs);
+	} else if (unlikely((int)nr != -1)) {
+		regs->ax = __ia32_sys_ni_syscall(regs);
 	}
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
                   ` (4 preceding siblings ...)
  2021-05-18 19:13 ` [PATCH v4 5/6] x86/syscall: treat out of range and gap system calls the same H. Peter Anvin
@ 2021-05-18 19:13 ` H. Peter Anvin
  2021-05-20  8:53   ` Thomas Gleixner
  2021-05-25  8:13   ` [tip: x86/entry] x86/entry: Use " tip-bot2 for H. Peter Anvin (Intel)
  2021-05-19 11:29 ` [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls Ingo Molnar
  6 siblings, 2 replies; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-18 19:13 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

System call numbers are defined as int, so use int everywhere for
system call numbers. This patch is strictly a cleanup; it should not
change anything user visible; all ABI changes have been done in the
preceding patches.
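
The range checks in the diff below rely on converting the signed number
to an unsigned type so that a single comparison rejects both negative
and too-large values. A minimal sketch of that idea, assuming
hypothetical table sizes and DEMO_-prefixed names (the subtraction is
done in a 64-bit type here so the example does not depend on signed
overflow behavior):

```c
#include <stdbool.h>

#define DEMO_NR_SYSCALLS	447		/* hypothetical table size */
#define DEMO_X32_NR_SYSCALLS	548		/* hypothetical table size */
#define DEMO_X32_SYSCALL_BIT	0x40000000

/* Native 64-bit numbers: negative nr converts to a huge unsigned value */
static bool demo_in_x64_range(int nr)
{
	unsigned long long unr = (unsigned long long)(long long)nr;

	return unr < DEMO_NR_SYSCALLS;
}

/* x32 numbers: rebase to the table start; anything below the bit wraps */
static bool demo_in_x32_range(int nr)
{
	unsigned long long xnr =
		(unsigned long long)((long long)nr - DEMO_X32_SYSCALL_BIT);

	return xnr < DEMO_X32_NR_SYSCALLS;
}
```

A number is dispatched through at most one table: anything accepted by
demo_in_x64_range() is below the x32 bit and therefore rejected by
demo_in_x32_range(), and vice versa.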

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 arch/x86/entry/common.c        | 93 ++++++++++++++++++++++++----------
 arch/x86/include/asm/syscall.h |  2 +-
 2 files changed, 66 insertions(+), 29 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index f51bc17262db..714804f0970c 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -36,49 +36,87 @@
 #include <asm/irq_stack.h>
 
 #ifdef CONFIG_X86_64
-__visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
+
+static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
+{
+	/*
+	 * Convert negative numbers to very high and thus out of range
+	 * numbers for comparisons. Use unsigned long to slightly
+	 * improve the array_index_nospec() generated code.
+	 */
+	unsigned long unr = nr;
+
+	if (likely(unr < NR_syscalls)) {
+		unr = array_index_nospec(unr, NR_syscalls);
+		regs->ax = sys_call_table[unr](regs);
+		return true;
+	}
+	return false;
+}
+
+static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
+{
+	/*
+	 * Adjust the starting offset of the table, and convert numbers
+	 * < __X32_SYSCALL_BIT to very high and thus out of range
+	 * numbers for comparisons. Use unsigned long to slightly
+	 * improve the array_index_nospec() generated code.
+	 */
+	unsigned long xnr = nr - __X32_SYSCALL_BIT;
+
+	if (IS_ENABLED(CONFIG_X86_X32_ABI) &&
+	    likely(xnr < X32_NR_syscalls)) {
+		xnr = array_index_nospec(xnr, X32_NR_syscalls);
+		regs->ax = x32_sys_call_table[xnr](regs);
+		return true;
+	}
+	return false;
+}
+
+__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 {
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
 
 	instrumentation_begin();
-	if (likely(nr < NR_syscalls)) {
-		nr = array_index_nospec(nr, NR_syscalls);
-		regs->ax = sys_call_table[nr](regs);
-#ifdef CONFIG_X86_X32_ABI
-	} else if (likely((nr & __X32_SYSCALL_BIT) &&
-			  (nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) {
-		nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT,
-					X32_NR_syscalls);
-		regs->ax = x32_sys_call_table[nr](regs);
-#endif
-	} else if (unlikely((int)nr != -1)) {
+
+	if (!do_syscall_x64(regs, nr) &&
+	    !do_syscall_x32(regs, nr) &&
+	    unlikely(nr != -1)) {
+		/* Invalid system call, but still a system call? */
 		regs->ax = __x64_sys_ni_syscall(regs);
 	}
+
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
 }
 #endif
 
 #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
-static __always_inline unsigned int syscall_32_enter(struct pt_regs *regs)
+static __always_inline int syscall_32_enter(struct pt_regs *regs)
 {
 	if (IS_ENABLED(CONFIG_IA32_EMULATION))
 		current_thread_info()->status |= TS_COMPAT;
 
-	return (unsigned int)regs->orig_ax;
+	return (int)regs->orig_ax;
 }
 
 /*
  * Invoke a 32-bit syscall.  Called with IRQs on in CONTEXT_KERNEL.
  */
-static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
-						  unsigned int nr)
+static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
 {
-	if (likely(nr < IA32_NR_syscalls)) {
-		nr = array_index_nospec(nr, IA32_NR_syscalls);
-		regs->ax = ia32_sys_call_table[nr](regs);
-	} else if (unlikely((int)nr != -1)) {
+	/*
+	 * Convert negative numbers to very high and thus out of range
+	 * numbers for comparisons. Use unsigned long to slightly
+	 * improve the array_index_nospec() generated code.
+	 */
+	unsigned long unr = nr;
+
+	if (likely(unr < IA32_NR_syscalls)) {
+		unr = array_index_nospec(unr, IA32_NR_syscalls);
+		regs->ax = ia32_sys_call_table[unr](regs);
+	} else if (unlikely(nr != -1)) {
 		regs->ax = __ia32_sys_ni_syscall(regs);
 	}
 }
@@ -86,15 +124,15 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
 /* Handles int $0x80 */
 __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 
 	add_random_kstack_offset();
 	/*
-	 * Subtlety here: if ptrace pokes something larger than 2^32-1 into
-	 * orig_ax, the unsigned int return value truncates it.  This may
-	 * or may not be necessary, but it matches the old asm behavior.
+	 * Subtlety here: if ptrace pokes something larger than 2^31-1 into
+	 * orig_ax, the int return value truncates it. This matches
+	 * the semantics of syscall_get_nr().
 	 */
-	nr = (unsigned int)syscall_enter_from_user_mode(regs, nr);
+	nr = syscall_enter_from_user_mode(regs, nr);
 	instrumentation_begin();
 
 	do_syscall_32_irqs_on(regs, nr);
@@ -105,7 +143,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 
 static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 	int res;
 
 	add_random_kstack_offset();
@@ -140,8 +178,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 		return false;
 	}
 
-	/* The case truncates any ptrace induced syscall nr > 2^32 -1 */
-	nr = (unsigned int)syscall_enter_from_user_mode_work(regs, nr);
+	nr = syscall_enter_from_user_mode_work(regs, nr);
 
 	/* Now this is just like a normal syscall. */
 	do_syscall_32_irqs_on(regs, nr);
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index f6593cafdbd9..f7e2d82d24fb 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -159,7 +159,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 		? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
 }
 
-void do_syscall_64(struct pt_regs *regs, unsigned long nr);
+void do_syscall_64(struct pt_regs *regs, int nr);
 void do_int80_syscall_32(struct pt_regs *regs);
 long do_fast_syscall_32(struct pt_regs *regs);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls
  2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
                   ` (5 preceding siblings ...)
  2021-05-18 19:13 ` [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers H. Peter Anvin
@ 2021-05-19 11:29 ` Ingo Molnar
  2021-05-19 16:17   ` H. Peter Anvin
  6 siblings, 1 reply; 18+ messages in thread
From: Ingo Molnar @ 2021-05-19 11:29 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	Linux Kernel Mailing List


* H. Peter Anvin <hpa@zytor.com> wrote:

> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
> 
> This patchset addresses several inconsistencies in the handling of
> system call numbers in x86-64 (and x32).

>  arch/x86/entry/common.c                         |  93 +++--
>  arch/x86/entry/entry_64.S                       |   2 +-
>  arch/x86/include/asm/syscall.h                  |   2 +-
>  tools/testing/selftests/x86/syscall_numbering.c | 488 +++++++++++++++++++++---
>  4 files changed, 508 insertions(+), 77 deletions(-)

Thanks Peter - this series is really nice now, and I agree that this 
inconsistency should be fixed.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls
  2021-05-19 11:29 ` [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls Ingo Molnar
@ 2021-05-19 16:17   ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-19 16:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	Linux Kernel Mailing List

:)

On May 19, 2021 4:29:29 AM PDT, Ingo Molnar <mingo@kernel.org> wrote:
>
>* H. Peter Anvin <hpa@zytor.com> wrote:
>
>> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
>> 
>> This patchset addresses several inconsistencies in the handling of
>> system call numbers in x86-64 (and x32).
>
>>  arch/x86/entry/common.c                         |  93 +++--
>>  arch/x86/entry/entry_64.S                       |   2 +-
>>  arch/x86/include/asm/syscall.h                  |   2 +-
>>  tools/testing/selftests/x86/syscall_numbering.c | 488 +++++++++++++++++++++---
>>  4 files changed, 508 insertions(+), 77 deletions(-)
>
>Thanks Peter - this series is really nice now, and I agree that this 
>inconsistency should be fixed.
>
>Thanks,
>
>	Ingo

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers
  2021-05-18 19:13 ` [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers H. Peter Anvin
@ 2021-05-20  8:53   ` Thomas Gleixner
  2021-05-21 21:36     ` H. Peter Anvin
  2021-05-25  8:13   ` [tip: x86/entry] x86/entry: Use " tip-bot2 for H. Peter Anvin (Intel)
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2021-05-20  8:53 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin
  Cc: Linux Kernel Mailing List

On Tue, May 18 2021 at 12:13, H. Peter Anvin wrote:
> +static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
> +{
> +	/*
> +	 * Convert negative numbers to very high and thus out of range
> +	 * numbers for comparisons. Use unsigned long to slightly
> +	 * improve the array_index_nospec() generated code.

How is that actually improving the generated code?

unsigned long:

 104:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
 10b:	48 19 c0             	sbb    %rax,%rax
 10e:	48 21 c2             	and    %rax,%rdx
 111:	48 89 df             	mov    %rbx,%rdi
 114:	48 8b 04 d5 00 00 00 	mov    0x0(,%rdx,8),%rax
 11b:	00 
 11c:	e8 00 00 00 00       	callq  121 <do_syscall_64+0x41>

unsigned int:

  f1:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
  f8:	48 19 d2             	sbb    %rdx,%rdx
  fb:	21 d0                	and    %edx,%eax
  fd:	48 89 df             	mov    %rbx,%rdi
 100:	48 8b 04 c5 00 00 00 	mov    0x0(,%rax,8),%rax
 107:	00 
 108:	e8 00 00 00 00       	callq  10d <do_syscall_64+0x3d>

Text size increases with that unsigned long cast.

I must be missing something.
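
For reference, the cmp/sbb/and sequence in both listings is the index
clamp generated by array_index_nospec(). A simplified C model of that
clamp is sketched here; demo_index_nospec() is a hypothetical name, and
the kernel uses inline asm for the mask so the compiler cannot turn it
into a conditional branch:

```c
/*
 * Hypothetical model of the clamp: mask is all ones when idx < size
 * and zero otherwise, so an out-of-range index is forced to 0.
 */
static inline unsigned long demo_index_nospec(unsigned long idx,
					      unsigned long size)
{
	unsigned long mask = 0UL - (unsigned long)(idx < size);

	return idx & mask;
}
```

The two listings differ only in the operand width of the "and" and in
which register carries the clamped index into the table lookup.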

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [tip: x86/entry] x86/entry: Treat out of range and gap system calls the same
  2021-05-18 19:13 ` [PATCH v4 5/6] x86/syscall: treat out of range and gap system calls the same H. Peter Anvin
@ 2021-05-20 13:23   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-20 13:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     b337b4965e3a3e567f11828a9e3fe3fb3faefa47
Gitweb:        https://git.kernel.org/tip/b337b4965e3a3e567f11828a9e3fe3fb3faefa47
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:13:02 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 20 May 2021 15:19:49 +02:00

x86/entry: Treat out of range and gap system calls the same

The current 64-bit system call entry code treats out-of-range system
calls differently than system calls that map to a hole in the system
call table.

This is visible to the user if system calls are intercepted via ptrace or
seccomp and the return value (regs->ax) is modified: in the former case,
the return value is preserved, and in the latter case, sys_ni_syscall() is
called and the return value is forced to -ENOSYS.

The API spec in <asm-generic/syscall.h> is very clear that only
(int)-1 is the non-system-call sentinel value, so make the system call
behavior consistent by calling sys_ni_syscall() for all invalid system
call numbers except for -1.

Although currently sys_ni_syscall() simply returns -ENOSYS, calling it
explicitly is friendly for tracing and future possible extensions, and
as this is an error path there is no reason to optimize it.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-6-hpa@zytor.com

---
 arch/x86/entry/common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 00da0f5..f51bc17 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -52,6 +52,8 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
 					X32_NR_syscalls);
 		regs->ax = x32_sys_call_table[nr](regs);
 #endif
+	} else if (unlikely((int)nr != -1)) {
+		regs->ax = __x64_sys_ni_syscall(regs);
 	}
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
@@ -76,6 +78,8 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
 	if (likely(nr < IA32_NR_syscalls)) {
 		nr = array_index_nospec(nr, IA32_NR_syscalls);
 		regs->ax = ia32_sys_call_table[nr](regs);
+	} else if (unlikely((int)nr != -1)) {
+		regs->ax = __ia32_sys_ni_syscall(regs);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/entry] x86/entry/64: Sign-extend system calls on entry to int
  2021-05-18 19:13 ` [PATCH v4 4/6] x86/syscall: sign-extend system calls on entry to int H. Peter Anvin
@ 2021-05-20 13:23   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-20 13:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     0595494891723a1dcca5eaa8eeca8ab54ad953b9
Gitweb:        https://git.kernel.org/tip/0595494891723a1dcca5eaa8eeca8ab54ad953b9
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:13:01 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 20 May 2021 15:19:49 +02:00

x86/entry/64: Sign-extend system calls on entry to int

Right now, *some* code will treat e.g. 0x0000000100000001 as a system
call and some will not. Some of the code, notably in ptrace, will
treat 0x000000018000000 as a system call and some will not. Finally,
right now, e.g. 335 for x86-64 will force the exit code to be set to
-ENOSYS even if poked by ptrace, but 548 will not, because there is an
observable difference between an out of range system call and a system
call number that falls outside the range of the table.

This is visible to the user: for example, the syscall_numbering_64
test fails if run under strace, because as strace uses ptrace, it ends
up clobbering the upper half of the 64-bit system call number.

The architecture independent code all assumes that a system call is "int"
and that the value -1 specifically, and not just any negative value, is
used for a non-system call. This is the case on x86 as well when
arch-independent code is involved. The arch-independent API is
defined/documented (but not *implemented*!) in <asm-generic/syscall.h>.

This is an ABI change, but is in fact a revert to the original x86-64
ABI. The original assembly entry code would zero-extend the system call
number.

Use sign extend to be explicit that this is treated as a signed number
(although in practice it makes no difference, of course) and to avoid
people getting the idea of "optimizing" it, as has happened on at least
two(!) separate occasions.

Do not store the extended value into regs->orig_ax, however: on x86-64, the
ABI is that the callee is responsible for extending parameters, so only
examining the lower 32 bits is fully consistent with any "int" argument to
any system call, e.g. regs->di for write(2). The full value of %rax on
entry to the kernel is thus still available.

[ tglx: Add a comment to the ASM code ]

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-5-hpa@zytor.com

---
 arch/x86/entry/entry_64.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1d9db15..a5f02d0 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -108,7 +108,8 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
 
 	/* IRQs are off. */
 	movq	%rsp, %rdi
-	movq	%rax, %rsi
+	/* Sign extend the lower 32bit as syscall numbers are treated as int */
+	movslq	%eax, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 	/*

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/entry] selftests/x86/syscall: Simplify message reporting in syscall_numbering
  2021-05-18 19:12 ` [PATCH v4 2/6] x86/syscall: simplify message reporting in syscall_numbering.c H. Peter Anvin
@ 2021-05-20 13:23   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-20 13:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     c5c39488dcb5f818bb07f856a349262d667ef147
Gitweb:        https://git.kernel.org/tip/c5c39488dcb5f818bb07f856a349262d667ef147
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:12:59 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 20 May 2021 15:19:48 +02:00

selftests/x86/syscall: Simplify message reporting in syscall_numbering

Reduce some boiler plate in printing and indenting messages.
This makes it easier to produce clean status output.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-3-hpa@zytor.com

---
 tools/testing/selftests/x86/syscall_numbering.c | 103 ++++++++++-----
 1 file changed, 72 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index 7dd86bc..434fe0e 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -16,6 +16,7 @@
 #include <string.h>
 #include <fcntl.h>
 #include <limits.h>
+#include <sysexits.h>
 
 /* Common system call numbers */
 #define SYS_READ	  0
@@ -34,6 +35,33 @@
 
 static unsigned int nerr = 0;	/* Cumulative error count */
 static int nullfd = -1;		/* File descriptor for /dev/null */
+static int indent = 0;
+
+static inline unsigned int offset(void)
+{
+	return 8 + indent * 4;
+}
+
+#define msg(lvl, fmt, ...) printf("%-*s" fmt, offset(), "[" #lvl "]", \
+				  ## __VA_ARGS__)
+
+#define run(fmt, ...)  msg(RUN,  fmt, ## __VA_ARGS__)
+#define info(fmt, ...) msg(INFO, fmt, ## __VA_ARGS__)
+#define ok(fmt, ...)   msg(OK,   fmt, ## __VA_ARGS__)
+
+#define fail(fmt, ...)					\
+	do {						\
+		msg(FAIL, fmt, ## __VA_ARGS__);		\
+		nerr++;					\
+	} while (0)
+
+#define crit(fmt, ...)					\
+	do {						\
+		indent = 0;				\
+		msg(FAIL, fmt, ## __VA_ARGS__);		\
+		msg(SKIP, "Unable to run test\n");	\
+		exit(EX_OSERR);				\
+	} while (0)
 
 /*
  * Directly invokes the given syscall with nullfd as the first argument
@@ -91,28 +119,37 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 {
 	unsigned int err = 0;
 
+	indent++;
+	if (start != end)
+		indent++;
+
 	for (int nr = start; nr <= end; nr++) {
 		long long ret = probe_syscall(msb, nr);
 
 		if (ret != expect) {
-			printf("[FAIL]\t      %s returned %lld, but it should have returned %s\n",
+			fail("%s returned %lld, but it should have returned %s\n",
 			       syscall_str(msb, nr, nr),
 			       ret, expect_str);
 			err++;
 		}
 	}
 
+	if (start != end)
+		indent--;
+
 	if (err) {
 		nerr += err;
 		if (start != end)
-			printf("[FAIL]\t      %s had %u failure%s\n",
+			fail("%s had %u failure%s\n",
 			       syscall_str(msb, start, end),
-			       err, (err == 1) ? "s" : "");
+			       err, err != 1 ? "s" : "");
 	} else {
-		printf("[OK]\t      %s returned %s as expected\n",
-		       syscall_str(msb, start, end), expect_str);
+		ok("%s returned %s as expected\n",
+		   syscall_str(msb, start, end), expect_str);
 	}
 
+	indent--;
+
 	return err;
 }
 
@@ -137,35 +174,38 @@ static bool check_enosys(int msb, int nr)
 static bool test_x32(void)
 {
 	long long ret;
-	long long mypid = getpid();
+	pid_t mypid = getpid();
+	bool with_x32;
 
-	printf("[RUN]\tChecking for x32 by calling x32 getpid()\n");
+	run("Checking for x32 by calling x32 getpid()\n");
 	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
 
+	indent++;
 	if (ret == mypid) {
-		printf("[INFO]\t   x32 is supported\n");
-		return true;
+		info("x32 is supported\n");
+		with_x32 = true;
 	} else if (ret == -ENOSYS) {
-		printf("[INFO]\t   x32 is not supported\n");
-		return false;
+		info("x32 is not supported\n");
+		with_x32 = false;
 	} else {
-		printf("[FAIL]\t   x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
-		nerr++;
-		return true;	/* Proceed as if... */
+		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		with_x32 = false;
 	}
+	indent--;
+	return with_x32;
 }
 
 static void test_syscalls_common(int msb)
 {
-	printf("[RUN]\t   Checking some common syscalls as 64 bit\n");
+	run("Checking some common syscalls as 64 bit\n");
 	check_zero(msb, SYS_READ);
 	check_zero(msb, SYS_WRITE);
 
-	printf("[RUN]\t   Checking some 64-bit only syscalls as 64 bit\n");
+	run("Checking some 64-bit only syscalls as 64 bit\n");
 	check_zero(msb, X64_READV);
 	check_zero(msb, X64_WRITEV);
 
-	printf("[RUN]\t   Checking out of range system calls\n");
+	run("Checking out of range system calls\n");
 	check_for(msb, -64, -1, -ENOSYS);
 	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
 	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
@@ -180,18 +220,18 @@ static void test_syscalls_with_x32(int msb)
 	 * set.  Calling them without the x32 bit set is
 	 * nonsense and should not work.
 	 */
-	printf("[RUN]\t   Checking x32 syscalls as 64 bit\n");
+	run("Checking x32 syscalls as 64 bit\n");
 	check_for(msb, 512, 547, -ENOSYS);
 
-	printf("[RUN]\t   Checking some common syscalls as x32\n");
+	run("Checking some common syscalls as x32\n");
 	check_zero(msb, SYS_READ   | X32_BIT);
 	check_zero(msb, SYS_WRITE  | X32_BIT);
 
-	printf("[RUN]\t   Checking some x32 syscalls as x32\n");
+	run("Checking some x32 syscalls as x32\n");
 	check_zero(msb, X32_READV  | X32_BIT);
 	check_zero(msb, X32_WRITEV | X32_BIT);
 
-	printf("[RUN]\t   Checking some 64-bit syscalls as x32\n");
+	run("Checking some 64-bit syscalls as x32\n");
 	check_enosys(msb, X64_IOCTL  | X32_BIT);
 	check_enosys(msb, X64_READV  | X32_BIT);
 	check_enosys(msb, X64_WRITEV | X32_BIT);
@@ -199,7 +239,7 @@ static void test_syscalls_with_x32(int msb)
 
 static void test_syscalls_without_x32(int msb)
 {
-	printf("[RUN]\t  Checking for absence of x32 system calls\n");
+	run("Checking for absence of x32 system calls\n");
 	check_for(msb, 0 | X32_BIT, 999 | X32_BIT, -ENOSYS);
 }
 
@@ -217,14 +257,18 @@ static void test_syscall_numbering(void)
 	 */
 	for (size_t i = 0; i < sizeof(msbs)/sizeof(msbs[0]); i++) {
 		int msb = msbs[i];
-		printf("[RUN]\tChecking system calls with msb = %d (0x%x)\n",
-		       msb, msb);
+		run("Checking system calls with msb = %d (0x%x)\n",
+		    msb, msb);
+
+		indent++;
 
 		test_syscalls_common(msb);
 		if (with_x32)
 			test_syscalls_with_x32(msb);
 		else
 			test_syscalls_without_x32(msb);
+
+		indent--;
 	}
 }
 
@@ -241,19 +285,16 @@ int main(void)
 	 */
 	nullfd = open("/dev/null", O_RDWR);
 	if (nullfd < 0) {
-		printf("[FAIL]\tUnable to open /dev/null: %s\n",
-		       strerror(errno));
-		printf("[SKIP]\tCannot execute test\n");
-		return 71;	/* EX_OSERR */
+		crit("Unable to open /dev/null: %s\n", strerror(errno));
 	}
 
 	test_syscall_numbering();
 	if (!nerr) {
-		printf("[OK]\tAll system calls succeeded or failed as expected\n");
+		ok("All system calls succeeded or failed as expected\n");
 		return 0;
 	} else {
-		printf("[FAIL]\tA total of %u system call%s had incorrect behavior\n",
-		       nerr, nerr != 1 ? "s" : "");
+		fail("A total of %u system call%s had incorrect behavior\n",
+		     nerr, nerr != 1 ? "s" : "");
 		return 1;
 	}
 }

* [tip: x86/entry] selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64
  2021-05-18 19:13 ` [PATCH v4 3/6] x86/syscall: add tests under ptrace to syscall_numbering.c H. Peter Anvin
@ 2021-05-20 13:23   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-20 13:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     795e2a023b8080b95442811f26f0762184116caa
Gitweb:        https://git.kernel.org/tip/795e2a023b8080b95442811f26f0762184116caa
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:13:00 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 20 May 2021 15:19:48 +02:00

selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64

Add tests running under ptrace for syscall_numbering_64. ptrace stopping on
syscall entry and possibly modifying the syscall number (regs.orig_rax) or
the default return value (regs.rax) can have different results than the
normal system call path.
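
The register rewrites that the tracer applies in the later passes can be looked at in isolation. The following is a minimal userspace sketch, not code from the patch: the helper names fuzz_high(), sign_extend_nr() and effective_nr() are hypothetical, and the transformations are modeled on the PTP_FUZZHIGH and PTP_INTNUM cases of mess_with_syscall(). It assumes the "int" semantics this series aims for, where only the low 32 bits of orig_rax select the system call.

```c
#include <stdint.h>

/*
 * Hypothetical helper: set the top 32 bits of orig_rax, as the
 * PTP_FUZZHIGH pass does before writing the registers back.
 */
static uint64_t fuzz_high(uint64_t orig_rax)
{
	return orig_rax | 0xffffffff00000000ULL;
}

/*
 * Hypothetical helper: replace orig_rax with its low 32 bits
 * sign-extended to 64 bits, as the PTP_INTNUM pass does.
 * The narrowing conversion to int32_t is implementation-defined in
 * ISO C; GCC and Clang wrap modulo 2^32, which is assumed here.
 */
static uint64_t sign_extend_nr(uint64_t orig_rax)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)orig_rax;
}

/*
 * Hypothetical helper: the system call number as an "int", i.e. the
 * low 32 bits of the register interpreted as a signed value.
 */
static int effective_nr(uint64_t orig_rax)
{
	return (int32_t)(uint32_t)orig_rax;
}
```

Under that assumption, effective_nr() gives the same result before and after either rewrite, which is why the traced passes are expected to behave like the untraced ones apart from the modified default return value.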

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-4-hpa@zytor.com

---
 tools/testing/selftests/x86/syscall_numbering.c | 232 +++++++++++++--
 1 file changed, 207 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index 434fe0e..9915917 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -16,8 +16,16 @@
 #include <string.h>
 #include <fcntl.h>
 #include <limits.h>
+#include <signal.h>
 #include <sysexits.h>
 
+#include <sys/ptrace.h>
+#include <sys/user.h>
+#include <sys/wait.h>
+#include <sys/mman.h>
+
+#include <linux/ptrace.h>
+
 /* Common system call numbers */
 #define SYS_READ	  0
 #define SYS_WRITE	  1
@@ -33,13 +41,45 @@
 
 #define X32_BIT 0x40000000
 
-static unsigned int nerr = 0;	/* Cumulative error count */
 static int nullfd = -1;		/* File descriptor for /dev/null */
-static int indent = 0;
+static bool with_x32;		/* x32 supported on this kernel? */
+
+enum ptrace_pass {
+	PTP_NOTHING,
+	PTP_GETREGS,
+	PTP_WRITEBACK,
+	PTP_FUZZRET,
+	PTP_FUZZHIGH,
+	PTP_INTNUM,
+	PTP_DONE
+};
+
+static const char * const ptrace_pass_name[] =
+{
+	[PTP_NOTHING]	= "just stop, no data read",
+	[PTP_GETREGS]	= "only getregs",
+	[PTP_WRITEBACK]	= "getregs, unmodified setregs",
+	[PTP_FUZZRET]	= "modifying the default return",
+	[PTP_FUZZHIGH]	= "clobbering the top 32 bits",
+	[PTP_INTNUM]	= "sign-extending the syscall number",
+};
+
+/*
+ * Shared memory block between tracer and test
+ */
+struct shared {
+	unsigned int nerr;	/* Total error count */
+	unsigned int indent;	/* Message indentation level */
+	enum ptrace_pass ptrace_pass;
+	bool probing_syscall;	/* In probe_syscall() */
+};
+static volatile struct shared *sh;
 
 static inline unsigned int offset(void)
 {
-	return 8 + indent * 4;
+	unsigned int level = sh ? sh->indent : 0;
+
+	return 8 + level * 4;
 }
 
 #define msg(lvl, fmt, ...) printf("%-*s" fmt, offset(), "[" #lvl "]", \
@@ -52,16 +92,19 @@ static inline unsigned int offset(void)
 #define fail(fmt, ...)					\
 	do {						\
 		msg(FAIL, fmt, ## __VA_ARGS__);		\
-		nerr++;					\
-	} while (0)
+		sh->nerr++;				\
+       } while (0)
 
 #define crit(fmt, ...)					\
 	do {						\
-		indent = 0;				\
+		sh->indent = 0;				\
 		msg(FAIL, fmt, ## __VA_ARGS__);		\
 		msg(SKIP, "Unable to run test\n");	\
-		exit(EX_OSERR);
-	} while (0)
+		exit(EX_OSERR);				\
+       } while (0)
+
+/* Sentinel for ptrace-modified return value */
+#define MODIFIED_BY_PTRACE	-9999
 
 /*
  * Directly invokes the given syscall with nullfd as the first argument
@@ -69,7 +112,7 @@ static inline unsigned int offset(void)
  * end up intercepting some system calls for some reason, or modify
  * the system call number itself.
  */
-static inline long long probe_syscall(int msb, int lsb)
+static long long probe_syscall(int msb, int lsb)
 {
 	register long long arg1 asm("rdi") = nullfd;
 	register long long arg2 asm("rsi") = 0;
@@ -80,11 +123,21 @@ static inline long long probe_syscall(int msb, int lsb)
 	long long nr = ((long long)msb << 32) | (unsigned int)lsb;
 	long long ret;
 
+	/*
+	 * We pass in an extra copy of the extended system call number
+	 * in %rbx, so we can examine it from the ptrace handler without
+	 * worrying about it being possibly modified. This is to test
+	 * the validity of struct user regs.orig_rax a.k.a.
+	 * struct pt_regs.orig_ax.
+	 */
+	sh->probing_syscall = true;
 	asm volatile("syscall"
 		     : "=a" (ret)
-		     : "a" (nr), "r" (arg1), "r" (arg2), "r" (arg3),
+		     : "a" (nr), "b" (nr),
+		       "r" (arg1), "r" (arg2), "r" (arg3),
 		       "r" (arg4), "r" (arg5), "r" (arg6)
 		     : "rcx", "r11", "memory", "cc");
+	sh->probing_syscall = false;
 
 	return ret;
 }
@@ -119,9 +172,9 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 {
 	unsigned int err = 0;
 
-	indent++;
+	sh->indent++;
 	if (start != end)
-		indent++;
+		sh->indent++;
 
 	for (int nr = start; nr <= end; nr++) {
 		long long ret = probe_syscall(msb, nr);
@@ -135,20 +188,19 @@ static unsigned int _check_for(int msb, int start, int end, long long expect,
 	}
 
 	if (start != end)
-		indent--;
+		sh->indent--;
 
 	if (err) {
-		nerr += err;
 		if (start != end)
 			fail("%s had %u failure%s\n",
-			       syscall_str(msb, start, end),
-			       err, err == 1 ? "s" : "");
+			     syscall_str(msb, start, end),
+			     err, err == 1 ? "" : "s");
 	} else {
 		ok("%s returned %s as expected\n",
 		   syscall_str(msb, start, end), expect_str);
 	}
 
-	indent--;
+	sh->indent--;
 
 	return err;
 }
@@ -175,12 +227,11 @@ static bool test_x32(void)
 {
 	long long ret;
 	pid_t mypid = getpid();
-	bool with_x32;
 
 	run("Checking for x32 by calling x32 getpid()\n");
 	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
 
-	indent++;
+	sh->indent++;
 	if (ret == mypid) {
 		info("x32 is supported\n");
 		with_x32 = true;
@@ -188,15 +239,17 @@ static bool test_x32(void)
 		info("x32 is not supported\n");
 		with_x32 = false;
 	} else {
-		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		fail("x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, (long long)mypid);
 		with_x32 = false;
 	}
-	indent--;
+	sh->indent--;
 	return with_x32;
 }
 
 static void test_syscalls_common(int msb)
 {
+	enum ptrace_pass pass = sh->ptrace_pass;
+
 	run("Checking some common syscalls as 64 bit\n");
 	check_zero(msb, SYS_READ);
 	check_zero(msb, SYS_WRITE);
@@ -206,7 +259,11 @@ static void test_syscalls_common(int msb)
 	check_zero(msb, X64_WRITEV);
 
 	run("Checking out of range system calls\n");
-	check_for(msb, -64, -1, -ENOSYS);
+	check_for(msb, -64, -2, -ENOSYS);
+	if (pass >= PTP_FUZZRET)
+		check_for(msb, -1, -1, MODIFIED_BY_PTRACE);
+	else
+		check_for(msb, -1, -1, -ENOSYS);
 	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
 	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
 	check_for(msb, INT_MAX-64, INT_MAX-1, -ENOSYS);
@@ -249,7 +306,8 @@ static void test_syscall_numbering(void)
 		0, 1, -1, X32_BIT-1, X32_BIT, X32_BIT-1, -X32_BIT, INT_MAX,
 		INT_MIN, INT_MIN+1
 	};
-	bool with_x32 = test_x32();
+
+	sh->indent++;
 
 	/*
 	 * The MSB is supposed to be ignored, so we loop over a few
@@ -260,7 +318,7 @@ static void test_syscall_numbering(void)
 		run("Checking system calls with msb = %d (0x%x)\n",
 		    msb, msb);
 
-		indent++;
+		sh->indent++;
 
 		test_syscalls_common(msb);
 		if (with_x32)
@@ -268,12 +326,119 @@ static void test_syscall_numbering(void)
 		else
 			test_syscalls_without_x32(msb);
 
-		indent--;
+		sh->indent--;
+	}
+
+	sh->indent--;
+}
+
+static void syscall_numbering_tracee(void)
+{
+	enum ptrace_pass pass;
+
+	if (ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+		crit("Failed to request tracing\n");
+		return;
+	}
+	raise(SIGSTOP);
+
+	for (sh->ptrace_pass = pass = PTP_NOTHING; pass < PTP_DONE;
+	     sh->ptrace_pass = ++pass) {
+		run("Running tests under ptrace: %s\n", ptrace_pass_name[pass]);
+		test_syscall_numbering();
+	}
+}
+
+static void mess_with_syscall(pid_t testpid, enum ptrace_pass pass)
+{
+	struct user_regs_struct regs;
+
+	sh->probing_syscall = false; /* Do this on entry only */
+
+	/* For these, don't even getregs */
+	if (pass == PTP_NOTHING || pass == PTP_DONE)
+		return;
+
+	ptrace(PTRACE_GETREGS, testpid, NULL, &regs);
+
+	if (regs.orig_rax != regs.rbx) {
+		fail("orig_rax %#llx doesn't match syscall number %#llx\n",
+		     (unsigned long long)regs.orig_rax,
+		     (unsigned long long)regs.rbx);
+	}
+
+	switch (pass) {
+	case PTP_GETREGS:
+		/* Just read, no writeback */
+		return;
+	case PTP_WRITEBACK:
+		/* Write back the same register state verbatim */
+		break;
+	case PTP_FUZZRET:
+		regs.rax = MODIFIED_BY_PTRACE;
+		break;
+	case PTP_FUZZHIGH:
+		regs.rax = MODIFIED_BY_PTRACE;
+		regs.orig_rax = regs.orig_rax | 0xffffffff00000000ULL;
+		break;
+	case PTP_INTNUM:
+		regs.rax = MODIFIED_BY_PTRACE;
+		regs.orig_rax = (int)regs.orig_rax;
+		break;
+	default:
+		crit("invalid ptrace_pass\n");
+		break;
+	}
+
+	ptrace(PTRACE_SETREGS, testpid, NULL, &regs);
+}
+
+static void syscall_numbering_tracer(pid_t testpid)
+{
+	int wstatus;
+
+	do {
+		pid_t wpid = waitpid(testpid, &wstatus, 0);
+		if (wpid < 0 && errno != EINTR)
+			break;
+		if (wpid != testpid)
+			continue;
+		if (!WIFSTOPPED(wstatus))
+			break;	/* Thread exited? */
+
+		if (sh->probing_syscall && WSTOPSIG(wstatus) == SIGTRAP)
+			mess_with_syscall(testpid, sh->ptrace_pass);
+	} while (sh->ptrace_pass != PTP_DONE &&
+		 !ptrace(PTRACE_SYSCALL, testpid, NULL, NULL));
+
+	ptrace(PTRACE_DETACH, testpid, NULL, NULL);
+
+	/* Wait for the child process to terminate */
+	while (waitpid(testpid, &wstatus, 0) != testpid || !WIFEXITED(wstatus))
+		/* wait some more */;
+}
+
+static void test_traced_syscall_numbering(void)
+{
+	pid_t testpid;
+
+	/* Launch the test thread; this thread continues as the tracer thread */
+	testpid = fork();
+
+	if (testpid < 0) {
+		crit("Unable to launch tracer process\n");
+	} else if (testpid == 0) {
+		syscall_numbering_tracee();
+		_exit(0);
+	} else {
+		syscall_numbering_tracer(testpid);
 	}
 }
 
 int main(void)
 {
+	unsigned int nerr;
+
 	/*
 	 * It is quite likely to get a segfault on a failure, so make
 	 * sure the message gets out by setting stdout to nonbuffered.
@@ -288,7 +453,24 @@ int main(void)
 		crit("Unable to open /dev/null: %s\n", strerror(errno));
 	}
 
+	/*
+	 * Set up a block of shared memory...
+	 */
+	sh = mmap(NULL, sysconf(_SC_PAGE_SIZE), PROT_READ|PROT_WRITE,
+		  MAP_ANONYMOUS|MAP_SHARED, 0, 0);
+	if (sh == MAP_FAILED) {
+		crit("Unable to allocate shared memory block: %s\n",
+		     strerror(errno));
+	}
+
+	with_x32 = test_x32();
+
+	run("Running tests without ptrace...\n");
 	test_syscall_numbering();
+
+	test_traced_syscall_numbering();
+
+	nerr = sh->nerr;
 	if (!nerr) {
 		ok("All system calls succeeded or failed as expected\n");
 		return 0;

* [tip: x86/entry] selftests/x86/syscall: Update and extend syscall_numbering_64
  2021-05-18 19:12 ` [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64 H. Peter Anvin
@ 2021-05-20 13:23   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-20 13:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     15c82d98a0f783bd4b2715ea910f7bb526367f54
Gitweb:        https://git.kernel.org/tip/15c82d98a0f783bd4b2715ea910f7bb526367f54
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:12:58 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 20 May 2021 15:19:48 +02:00

selftests/x86/syscall: Update and extend syscall_numbering_64

Update the syscall_numbering_64 selftest to reflect that a system call
number is to be extended from 32 bits. Add a mix of tests for valid and
invalid system calls in 64-bit and x32 space.

Use an explicit system call instruction, because the glibc syscall()
wrapper might intercept instructions, extend the system call number
independently, or anything similar.

Use long long instead of long to make it possible to compile this test
on x32 as well as 64 bits.
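
As a rough illustration of what the test feeds into %rax, the sketch below builds a 64-bit value from separate upper and lower halves, in the spirit of probe_syscall() in the diff. It is an assumption-laden sketch rather than the patch's code: make_nr() and low_int() are hypothetical names, and the composition uses unsigned arithmetic instead of the patch's signed shift so that it stays within defined behavior for negative msb values. The final conversions back to signed types are implementation-defined in ISO C and are assumed to wrap, as they do with GCC and Clang.

```c
#include <limits.h>

#define X32_BIT 0x40000000

/*
 * Hypothetical helper: msb supplies bits 63..32 and lsb supplies
 * bits 31..0 of the value that would be loaded into %rax.
 * long long is 64 bits on both x86-64 and x32, whereas long is
 * only 32 bits on x32.
 */
static long long make_nr(int msb, int lsb)
{
	unsigned long long hi = (unsigned long long)(unsigned int)msb << 32;
	unsigned long long lo = (unsigned int)lsb;

	return (long long)(hi | lo);
}

/*
 * Hypothetical helper: the low 32 bits of the register value,
 * viewed as an int.
 */
static int low_int(long long nr)
{
	return (int)(unsigned int)nr;
}
```

With this composition, make_nr(1, 1) is the 0x00000001_00000001 case mentioned in the cover letter, and low_int() recovers the lsb argument regardless of which msb was used.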

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-2-hpa@zytor.com

---
 tools/testing/selftests/x86/syscall_numbering.c | 274 ++++++++++++---
 1 file changed, 222 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/x86/syscall_numbering.c b/tools/testing/selftests/x86/syscall_numbering.c
index d6b09cb..7dd86bc 100644
--- a/tools/testing/selftests/x86/syscall_numbering.c
+++ b/tools/testing/selftests/x86/syscall_numbering.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * syscall_arg_fault.c - tests faults 32-bit fast syscall stack args
+ * syscall_numbering.c - test calling the x86-64 kernel with various
+ * valid and invalid system call numbers.
+ *
  * Copyright (c) 2018 Andrew Lutomirski
  */
 
@@ -11,79 +13,247 @@
 #include <stdbool.h>
 #include <errno.h>
 #include <unistd.h>
-#include <syscall.h>
+#include <string.h>
+#include <fcntl.h>
+#include <limits.h>
 
-static int nerrs;
+/* Common system call numbers */
+#define SYS_READ	  0
+#define SYS_WRITE	  1
+#define SYS_GETPID	 39
+/* x64-only system call numbers */
+#define X64_IOCTL	 16
+#define X64_READV	 19
+#define X64_WRITEV	 20
+/* x32-only system call numbers (without X32_BIT) */
+#define X32_IOCTL	514
+#define X32_READV	515
+#define X32_WRITEV	516
 
-#define X32_BIT 0x40000000UL
+#define X32_BIT 0x40000000
 
-static void check_enosys(unsigned long nr, bool *ok)
+static unsigned int nerr = 0;	/* Cumulative error count */
+static int nullfd = -1;		/* File descriptor for /dev/null */
+
+/*
+ * Directly invokes the given syscall with nullfd as the first argument
+ * and the rest zero. Avoids involving glibc wrappers in case they ever
+ * end up intercepting some system calls for some reason, or modify
+ * the system call number itself.
+ */
+static inline long long probe_syscall(int msb, int lsb)
 {
-	/* If this fails, a segfault is reasonably likely. */
-	fflush(stdout);
-
-	long ret = syscall(nr, 0, 0, 0, 0, 0, 0);
-	if (ret == 0) {
-		printf("[FAIL]\tsyscall %lu succeeded, but it should have failed\n", nr);
-		*ok = false;
-	} else if (errno != ENOSYS) {
-		printf("[FAIL]\tsyscall %lu had error code %d, but it should have reported ENOSYS\n", nr, errno);
-		*ok = false;
-	}
+	register long long arg1 asm("rdi") = nullfd;
+	register long long arg2 asm("rsi") = 0;
+	register long long arg3 asm("rdx") = 0;
+	register long long arg4 asm("r10") = 0;
+	register long long arg5 asm("r8")  = 0;
+	register long long arg6 asm("r9")  = 0;
+	long long nr = ((long long)msb << 32) | (unsigned int)lsb;
+	long long ret;
+
+	asm volatile("syscall"
+		     : "=a" (ret)
+		     : "a" (nr), "r" (arg1), "r" (arg2), "r" (arg3),
+		       "r" (arg4), "r" (arg5), "r" (arg6)
+		     : "rcx", "r11", "memory", "cc");
+
+	return ret;
 }
 
-static void test_x32_without_x32_bit(void)
+static const char *syscall_str(int msb, int start, int end)
 {
-	bool ok = true;
+	static char buf[64];
+	const char * const type = (start & X32_BIT) ? "x32" : "x64";
+	int lsb = start;
 
 	/*
-	 * Syscalls 512-547 are "x32" syscalls.  They are intended to be
-	 * called with the x32 (0x40000000) bit set.  Calling them without
-	 * the x32 bit set is nonsense and should not work.
+	 * Improve readability by stripping the x32 bit, but round
+	 * toward zero so we don't display -1 as -1073741825.
 	 */
-	printf("[RUN]\tChecking syscalls 512-547\n");
-	for (int i = 512; i <= 547; i++)
-		check_enosys(i, &ok);
+	if (lsb < 0)
+		lsb |= X32_BIT;
+	else
+		lsb &= ~X32_BIT;
+
+	if (start == end)
+		snprintf(buf, sizeof buf, "%s syscall %d:%d",
+			 type, msb, lsb);
+	else
+		snprintf(buf, sizeof buf, "%s syscalls %d:%d..%d",
+			 type, msb, lsb, lsb + (end-start));
+
+	return buf;
+}
+
+static unsigned int _check_for(int msb, int start, int end, long long expect,
+			       const char *expect_str)
+{
+	unsigned int err = 0;
+
+	for (int nr = start; nr <= end; nr++) {
+		long long ret = probe_syscall(msb, nr);
+
+		if (ret != expect) {
+			printf("[FAIL]\t      %s returned %lld, but it should have returned %s\n",
+			       syscall_str(msb, nr, nr),
+			       ret, expect_str);
+			err++;
+		}
+	}
 
+	if (err) {
+		nerr += err;
+		if (start != end)
+			printf("[FAIL]\t      %s had %u failure%s\n",
+			       syscall_str(msb, start, end),
+			       err, (err == 1) ? "" : "s");
+	} else {
+		printf("[OK]\t      %s returned %s as expected\n",
+		       syscall_str(msb, start, end), expect_str);
+	}
+
+	return err;
+}
+
+#define check_for(msb,start,end,expect) \
+	_check_for(msb,start,end,expect,#expect)
+
+static bool check_zero(int msb, int nr)
+{
+	return check_for(msb, nr, nr, 0);
+}
+
+static bool check_enosys(int msb, int nr)
+{
+	return check_for(msb, nr, nr, -ENOSYS);
+}
+
+/*
+ * Anyone diagnosing a failure will want to know whether the kernel
+ * supports x32. Tell them. This can also be used to conditionalize
+ * tests based on existence or nonexistence of x32.
+ */
+static bool test_x32(void)
+{
+	long long ret;
+	long long mypid = getpid();
+
+	printf("[RUN]\tChecking for x32 by calling x32 getpid()\n");
+	ret = probe_syscall(0, SYS_GETPID | X32_BIT);
+
+	if (ret == mypid) {
+		printf("[INFO]\t   x32 is supported\n");
+		return true;
+	} else if (ret == -ENOSYS) {
+		printf("[INFO]\t   x32 is not supported\n");
+		return false;
+	} else {
+		printf("[FAIL]\t   x32 getpid() returned %lld, but it should have returned either %lld or -ENOSYS\n", ret, mypid);
+		nerr++;
+		return true;	/* Proceed as if... */
+	}
+}
+
+static void test_syscalls_common(int msb)
+{
+	printf("[RUN]\t   Checking some common syscalls as 64 bit\n");
+	check_zero(msb, SYS_READ);
+	check_zero(msb, SYS_WRITE);
+
+	printf("[RUN]\t   Checking some 64-bit only syscalls as 64 bit\n");
+	check_zero(msb, X64_READV);
+	check_zero(msb, X64_WRITEV);
+
+	printf("[RUN]\t   Checking out of range system calls\n");
+	check_for(msb, -64, -1, -ENOSYS);
+	check_for(msb, X32_BIT-64, X32_BIT-1, -ENOSYS);
+	check_for(msb, -64-X32_BIT, -1-X32_BIT, -ENOSYS);
+	check_for(msb, INT_MAX-64, INT_MAX-1, -ENOSYS);
+}
+
+static void test_syscalls_with_x32(int msb)
+{
 	/*
-	 * Check that a handful of 64-bit-only syscalls are rejected if the x32
-	 * bit is set.
+	 * Syscalls 512-547 are "x32" syscalls.  They are
+	 * intended to be called with the x32 (0x40000000) bit
+	 * set.  Calling them without the x32 bit set is
+	 * nonsense and should not work.
 	 */
-	printf("[RUN]\tChecking some 64-bit syscalls in x32 range\n");
-	check_enosys(16 | X32_BIT, &ok);	/* ioctl */
-	check_enosys(19 | X32_BIT, &ok);	/* readv */
-	check_enosys(20 | X32_BIT, &ok);	/* writev */
+	printf("[RUN]\t   Checking x32 syscalls as 64 bit\n");
+	check_for(msb, 512, 547, -ENOSYS);
+
+	printf("[RUN]\t   Checking some common syscalls as x32\n");
+	check_zero(msb, SYS_READ   | X32_BIT);
+	check_zero(msb, SYS_WRITE  | X32_BIT);
+
+	printf("[RUN]\t   Checking some x32 syscalls as x32\n");
+	check_zero(msb, X32_READV  | X32_BIT);
+	check_zero(msb, X32_WRITEV | X32_BIT);
+
+	printf("[RUN]\t   Checking some 64-bit syscalls as x32\n");
+	check_enosys(msb, X64_IOCTL  | X32_BIT);
+	check_enosys(msb, X64_READV  | X32_BIT);
+	check_enosys(msb, X64_WRITEV | X32_BIT);
+}
+
+static void test_syscalls_without_x32(int msb)
+{
+	printf("[RUN]\t  Checking for absence of x32 system calls\n");
+	check_for(msb, 0 | X32_BIT, 999 | X32_BIT, -ENOSYS);
+}
+
+static void test_syscall_numbering(void)
+{
+	static const int msbs[] = {
+		0, 1, -1, X32_BIT-1, X32_BIT, X32_BIT-1, -X32_BIT, INT_MAX,
+		INT_MIN, INT_MIN+1
+	};
+	bool with_x32 = test_x32();
 
 	/*
-	 * Check some syscalls with high bits set.
+	 * The MSB is supposed to be ignored, so we loop over a few
+	 * to test that out.
 	 */
-	printf("[RUN]\tChecking numbers above 2^32-1\n");
-	check_enosys((1UL << 32), &ok);
-	check_enosys(X32_BIT | (1UL << 32), &ok);
+	for (size_t i = 0; i < sizeof(msbs)/sizeof(msbs[0]); i++) {
+		int msb = msbs[i];
+		printf("[RUN]\tChecking system calls with msb = %d (0x%x)\n",
+		       msb, msb);
 
-	if (!ok)
-		nerrs++;
-	else
-		printf("[OK]\tThey all returned -ENOSYS\n");
+		test_syscalls_common(msb);
+		if (with_x32)
+			test_syscalls_with_x32(msb);
+		else
+			test_syscalls_without_x32(msb);
+	}
 }
 
-int main()
+int main(void)
 {
 	/*
-	 * Anyone diagnosing a failure will want to know whether the kernel
-	 * supports x32.  Tell them.
+	 * It is quite likely to get a segfault on a failure, so make
+	 * sure the message gets out by setting stdout to nonbuffered.
 	 */
-	printf("\tChecking for x32...");
-	fflush(stdout);
-	if (syscall(39 | X32_BIT, 0, 0, 0, 0, 0, 0) >= 0) {
-		printf(" supported\n");
-	} else if (errno == ENOSYS) {
-		printf(" not supported\n");
-	} else {
-		printf(" confused\n");
-	}
+	setvbuf(stdout, NULL, _IONBF, 0);
 
-	test_x32_without_x32_bit();
+	/*
+	 * Harmless file descriptor to work on...
+	 */
+	nullfd = open("/dev/null", O_RDWR);
+	if (nullfd < 0) {
+		printf("[FAIL]\tUnable to open /dev/null: %s\n",
+		       strerror(errno));
+		printf("[SKIP]\tCannot execute test\n");
+		return 71;	/* EX_OSERR */
+	}
 
-	return nerrs ? 1 : 0;
+	test_syscall_numbering();
+	if (!nerr) {
+		printf("[OK]\tAll system calls succeeded or failed as expected\n");
+		return 0;
+	} else {
+		printf("[FAIL]\tA total of %u system call%s had incorrect behavior\n",
+		       nerr, nerr != 1 ? "s" : "");
+		return 1;
+	}
 }

* Re: [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers
  2021-05-20  8:53   ` Thomas Gleixner
@ 2021-05-21 21:36     ` H. Peter Anvin
  2021-05-22 13:19       ` David Laight
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2021-05-21 21:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov
  Cc: Linux Kernel Mailing List



On 5/20/21 1:53 AM, Thomas Gleixner wrote:
> On Tue, May 18 2021 at 12:13, H. Peter Anvin wrote:
>> +static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
>> +{
>> +	/*
>> +	 * Convert negative numbers to very high and thus out of range
>> +	 * numbers for comparisons. Use unsigned long to slightly
>> +	 * improve the array_index_nospec() generated code.
> 
> How is that actually improving the generated code?
> 
> unsigned long:
> 
>   104:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
>   10b:	48 19 c0             	sbb    %rax,%rax
>   10e:	48 21 c2             	and    %rax,%rdx
>   111:	48 89 df             	mov    %rbx,%rdi
>   114:	48 8b 04 d5 00 00 00 	mov    0x0(,%rdx,8),%rax
>   11b:	00
>   11c:	e8 00 00 00 00       	callq  121 <do_syscall_64+0x41>
> 
> unsigned int:
> 
>    f1:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
>    f8:	48 19 d2             	sbb    %rdx,%rdx
>    fb:	21 d0                	and    %edx,%eax
>    fd:	48 89 df             	mov    %rbx,%rdi
>   100:	48 8b 04 c5 00 00 00 	mov    0x0(,%rax,8),%rax
>   107:	00
>   108:	e8 00 00 00 00       	callq  10d <do_syscall_64+0x3d>
> 
> Text size increases with that unsigned long cast.
> 
> I must be missing something.
> 

"unsigned long" gave slightly better code than "int", but as you 
correctly point out here, "unsigned int" is even better.

Thanks for catching that.

	-hpa


* RE: [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers
  2021-05-21 21:36     ` H. Peter Anvin
@ 2021-05-22 13:19       ` David Laight
  0 siblings, 0 replies; 18+ messages in thread
From: David Laight @ 2021-05-22 13:19 UTC (permalink / raw)
  To: 'H. Peter Anvin',
	Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Borislav Petkov
  Cc: Linux Kernel Mailing List

From: H. Peter Anvin
> Sent: 21 May 2021 22:37
> 
> On 5/20/21 1:53 AM, Thomas Gleixner wrote:
> > On Tue, May 18 2021 at 12:13, H. Peter Anvin wrote:
> >> +static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
> >> +{
> >> +	/*
> >> +	 * Convert negative numbers to very high and thus out of range
> >> +	 * numbers for comparisons. Use unsigned long to slightly
> >> +	 * improve the array_index_nospec() generated code.
> >
> > How is that actually improving the generated code?
> >
> > unsigned long:
> >
> >   104:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
> >   10b:	48 19 c0             	sbb    %rax,%rax
> >   10e:	48 21 c2             	and    %rax,%rdx
> >   111:	48 89 df             	mov    %rbx,%rdi
> >   114:	48 8b 04 d5 00 00 00 	mov    0x0(,%rdx,8),%rax
> >   11b:	00
> >   11c:	e8 00 00 00 00       	callq  121 <do_syscall_64+0x41>
> >
> > unsigned int:
> >
> >    f1:	48 81 fa bf 01 00 00 	cmp    $0x1bf,%rdx
> >    f8:	48 19 d2             	sbb    %rdx,%rdx
> >    fb:	21 d0                	and    %edx,%eax
> >    fd:	48 89 df             	mov    %rbx,%rdi
> >   100:	48 8b 04 c5 00 00 00 	mov    0x0(,%rax,8),%rax
> >   107:	00
> >   108:	e8 00 00 00 00       	callq  10d <do_syscall_64+0x3d>
> >
> > Text size increases with that unsigned long cast.
> >
> > I must be missing something.
> >
> 
> "unsigned long" gave slightly better code than "int", but as you
> correctly point out here, "unsigned int" is even better.

Indexing arrays with 'int' almost always ends up generating
an extra instruction to sign-extend the 32bit value to 64bits.
This lengthens the register dependency chain and is likely to
add a clock.

OTOH using 'unsigned int' can save a REX prefix (as here),
marginally reducing the cache footprint.
That might speed it up, but may slow it down!
It rather depends on the exact alignment of instructions
relative to (on Intel CPUs) the 16-byte fetch/decode blocks.

Looking at the above code, out of range values get masked
to zero to ensure that speculative execution doesn't expose
anything.
If the syscall number is offset by one before masking,
a zero will only be generated for invalid values:

https://godbolt.org/z/av839bsxf

bool do_syscall_x64(struct pt_regs *regs, int nr)
{
	unsigned long unr = nr + 1;

	unr = array_index_nospec(unr, NR_syscalls + 1);
	if (!unr)
		return false;
	regs->ax = sys_call_table[unr - 1](regs);
	return true;
}

This speeds up the native system calls with a slight slow down
of the compat ones.

In principle sys_call_table[] could be offset by one,
so that invalid numbers go through sys_call_table[0].
You wouldn't want to do this if a second table follows.
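
The arithmetic behind the offset-by-one idea can be tried in userspace. The sketch below is only an illustration under stated assumptions: index_mask() is a hypothetical helper written in the style of the kernel's generic C fallback for array_index_mask_nospec(), TABLE_SIZE is a made-up tiny table size, and biased_index() is a hypothetical name. It assumes an arithmetic right shift for signed long, which GCC and Clang provide. Nothing here prevents a userspace compiler from emitting a branch, so it shows the masking arithmetic and not any actual speculation hardening.

```c
#include <limits.h>

#define TABLE_SIZE 4u	/* hypothetical table size, for illustration only */

/*
 * Hypothetical helper: all ones when index < size, zero otherwise.
 * Relies on arithmetic right shift of a negative signed long.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >>
		(sizeof(long) * CHAR_BIT - 1);
}

/*
 * Hypothetical helper: bias the number by one before masking.
 * Valid numbers 0..TABLE_SIZE-1 map to 1..TABLE_SIZE; everything
 * else, including -1, collapses to 0. The addition is done in
 * unsigned int so that INT_MAX does not overflow a signed int.
 */
static unsigned long biased_index(int nr)
{
	unsigned long unr = (unsigned int)nr + 1u;

	return unr & index_mask(unr, TABLE_SIZE + 1UL);
}
```

A caller would treat a result of 0 as "not in the table" and otherwise index with the result minus one, which is the shape of the do_syscall_x64() variant quoted above.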

I'm also seeing better code for 'unsigned long'.
Probably because array_index_mask_nospec() is defined for long.

	David

* [tip: x86/entry] x86/entry: Use int everywhere for system call numbers
  2021-05-18 19:13 ` [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers H. Peter Anvin
  2021-05-20  8:53   ` Thomas Gleixner
@ 2021-05-25  8:13   ` tip-bot2 for H. Peter Anvin (Intel)
  1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2021-05-25  8:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     2978996f620001f4e748c79af0fe89be729ef58d
Gitweb:        https://git.kernel.org/tip/2978996f620001f4e748c79af0fe89be729ef58d
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Tue, 18 May 2021 12:13:03 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 25 May 2021 10:07:00 +02:00

x86/entry: Use int everywhere for system call numbers

System call numbers are defined as int, so use int everywhere for system
call numbers. This is strictly a cleanup; it should not change anything
user visible; all ABI changes have been done in the preceding patches.
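
The dispatch order in the diff below can be modeled in userspace as a pure classification function. This is a sketch under assumptions, not the kernel code: classify() and the enum are hypothetical names, the two table sizes are placeholder values chosen for illustration, real tables also contain unimplemented slots that this model ignores, and the x32 subtraction is done in unsigned arithmetic to avoid signed overflow in portable C.

```c
#include <stdbool.h>

#define NR_SYSCALLS_SKETCH	447u	/* placeholder table size */
#define X32_NR_SYSCALLS_SKETCH	548u	/* placeholder table size */
#define X32_SYSCALL_BIT_SKETCH	0x40000000u

enum dispatch {
	DISPATCH_X64,		/* handled by the 64-bit table */
	DISPATCH_X32,		/* handled by the x32 table */
	DISPATCH_ENOSYS,	/* invalid, but still a system call */
	DISPATCH_NONE		/* -1: not a system call at all */
};

/* Hypothetical model of the order of checks in do_syscall_64(). */
static enum dispatch classify(int nr, bool x32_enabled)
{
	/* Negative numbers become very large and thus out of range. */
	unsigned int unr = (unsigned int)nr;
	/* Numbers below the x32 bit wrap around to very large values. */
	unsigned int xnr = (unsigned int)nr - X32_SYSCALL_BIT_SKETCH;

	if (unr < NR_SYSCALLS_SKETCH)
		return DISPATCH_X64;
	if (x32_enabled && xnr < X32_NR_SYSCALLS_SKETCH)
		return DISPATCH_X32;
	if (nr != -1)
		return DISPATCH_ENOSYS;
	return DISPATCH_NONE;
}
```

In this model -1 is the only value that is left alone, while every other out-of-range number, such as -2 or an x32-only number without the x32 bit, ends up in the ENOSYS path.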

[ tglx: Replaced the unsigned long cast ]

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210518191303.4135296-7-hpa@zytor.com

---
 arch/x86/entry/common.c        | 87 ++++++++++++++++++++++-----------
 arch/x86/include/asm/syscall.h |  2 +-
 2 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index f51bc17..ee95fe3 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -36,49 +36,81 @@
 #include <asm/irq_stack.h>
 
 #ifdef CONFIG_X86_64
-__visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
+
+static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
+{
+	/*
+	 * Convert negative numbers to very high and thus out of range
+	 * numbers for comparisons.
+	 */
+	unsigned int unr = nr;
+
+	if (likely(unr < NR_syscalls)) {
+		unr = array_index_nospec(unr, NR_syscalls);
+		regs->ax = sys_call_table[unr](regs);
+		return true;
+	}
+	return false;
+}
+
+static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
+{
+	/*
+	 * Adjust the starting offset of the table, and convert numbers
+	 * < __X32_SYSCALL_BIT to very high and thus out of range
+	 * numbers for comparisons.
+	 */
+	unsigned int xnr = nr - __X32_SYSCALL_BIT;
+
+	if (IS_ENABLED(CONFIG_X86_X32_ABI) && likely(xnr < X32_NR_syscalls)) {
+		xnr = array_index_nospec(xnr, X32_NR_syscalls);
+		regs->ax = x32_sys_call_table[xnr](regs);
+		return true;
+	}
+	return false;
+}
+
+__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 {
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
 
 	instrumentation_begin();
-	if (likely(nr < NR_syscalls)) {
-		nr = array_index_nospec(nr, NR_syscalls);
-		regs->ax = sys_call_table[nr](regs);
-#ifdef CONFIG_X86_X32_ABI
-	} else if (likely((nr & __X32_SYSCALL_BIT) &&
-			  (nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) {
-		nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT,
-					X32_NR_syscalls);
-		regs->ax = x32_sys_call_table[nr](regs);
-#endif
-	} else if (unlikely((int)nr != -1)) {
+
+	if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
+		/* Invalid system call, but still a system call. */
 		regs->ax = __x64_sys_ni_syscall(regs);
 	}
+
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
 }
 #endif
 
 #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
-static __always_inline unsigned int syscall_32_enter(struct pt_regs *regs)
+static __always_inline int syscall_32_enter(struct pt_regs *regs)
 {
 	if (IS_ENABLED(CONFIG_IA32_EMULATION))
 		current_thread_info()->status |= TS_COMPAT;
 
-	return (unsigned int)regs->orig_ax;
+	return (int)regs->orig_ax;
 }
 
 /*
  * Invoke a 32-bit syscall.  Called with IRQs on in CONTEXT_KERNEL.
  */
-static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
-						  unsigned int nr)
+static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
 {
-	if (likely(nr < IA32_NR_syscalls)) {
-		nr = array_index_nospec(nr, IA32_NR_syscalls);
-		regs->ax = ia32_sys_call_table[nr](regs);
-	} else if (unlikely((int)nr != -1)) {
+	/*
+	 * Convert negative numbers to very high and thus out of range
+	 * numbers for comparisons.
+	 */
+	unsigned int unr = nr;
+
+	if (likely(unr < IA32_NR_syscalls)) {
+		unr = array_index_nospec(unr, IA32_NR_syscalls);
+		regs->ax = ia32_sys_call_table[unr](regs);
+	} else if (nr != -1) {
 		regs->ax = __ia32_sys_ni_syscall(regs);
 	}
 }
@@ -86,15 +118,15 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs,
 /* Handles int $0x80 */
 __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 
 	add_random_kstack_offset();
 	/*
-	 * Subtlety here: if ptrace pokes something larger than 2^32-1 into
-	 * orig_ax, the unsigned int return value truncates it.  This may
-	 * or may not be necessary, but it matches the old asm behavior.
+	 * Subtlety here: if ptrace pokes something larger than 2^31-1 into
+	 * orig_ax, the int return value truncates it. This matches
+	 * the semantics of syscall_get_nr().
 	 */
-	nr = (unsigned int)syscall_enter_from_user_mode(regs, nr);
+	nr = syscall_enter_from_user_mode(regs, nr);
 	instrumentation_begin();
 
 	do_syscall_32_irqs_on(regs, nr);
@@ -105,7 +137,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 
 static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 {
-	unsigned int nr = syscall_32_enter(regs);
+	int nr = syscall_32_enter(regs);
 	int res;
 
 	add_random_kstack_offset();
@@ -140,8 +172,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 		return false;
 	}
 
-	/* The case truncates any ptrace induced syscall nr > 2^32 -1 */
-	nr = (unsigned int)syscall_enter_from_user_mode_work(regs, nr);
+	nr = syscall_enter_from_user_mode_work(regs, nr);
 
 	/* Now this is just like a normal syscall. */
 	do_syscall_32_irqs_on(regs, nr);
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index f6593ca..f7e2d82 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -159,7 +159,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 		? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
 }
 
-void do_syscall_64(struct pt_regs *regs, unsigned long nr);
+void do_syscall_64(struct pt_regs *regs, int nr);
 void do_int80_syscall_32(struct pt_regs *regs);
 long do_fast_syscall_32(struct pt_regs *regs);
 


end of thread, other threads:[~2021-05-25  8:13 UTC | newest]

Thread overview: 18+ messages
2021-05-18 19:12 [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls H. Peter Anvin
2021-05-18 19:12 ` [PATCH v4 1/6] x86/syscall: update and extend selftest syscall_numbering_64 H. Peter Anvin
2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Update and extend syscall_numbering_64 tip-bot2 for H. Peter Anvin (Intel)
2021-05-18 19:12 ` [PATCH v4 2/6] x86/syscall: simplify message reporting in syscall_numbering.c H. Peter Anvin
2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Simplify message reporting in syscall_numbering tip-bot2 for H. Peter Anvin (Intel)
2021-05-18 19:13 ` [PATCH v4 3/6] x86/syscall: add tests under ptrace to syscall_numbering.c H. Peter Anvin
2021-05-20 13:23   ` [tip: x86/entry] selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64 tip-bot2 for H. Peter Anvin (Intel)
2021-05-18 19:13 ` [PATCH v4 4/6] x86/syscall: sign-extend system calls on entry to int H. Peter Anvin
2021-05-20 13:23   ` [tip: x86/entry] x86/entry/64: Sign-extend " tip-bot2 for H. Peter Anvin (Intel)
2021-05-18 19:13 ` [PATCH v4 5/6] x86/syscall: treat out of range and gap system calls the same H. Peter Anvin
2021-05-20 13:23   ` [tip: x86/entry] x86/entry: Treat " tip-bot2 for H. Peter Anvin (Intel)
2021-05-18 19:13 ` [PATCH v4 6/6] x86/syscall: use int everywhere for system call numbers H. Peter Anvin
2021-05-20  8:53   ` Thomas Gleixner
2021-05-21 21:36     ` H. Peter Anvin
2021-05-22 13:19       ` David Laight
2021-05-25  8:13   ` [tip: x86/entry] x86/entry: Use " tip-bot2 for H. Peter Anvin (Intel)
2021-05-19 11:29 ` [PATCH v4 0/6] x86/syscall: use int for x86-64 system calls Ingo Molnar
2021-05-19 16:17   ` H. Peter Anvin
